Inferring Task Goals and Constraints using Bayesian Nonparametric Inverse Reinforcement Learning
Daehyung Park(Healthcare Robotics Lab, Institute for Robotics an)
USA | Proceedings of the Conference on Robot Learning

■  View full text

Proceedings of the Conference on Robot Learning, PMLR 100:1005-1014, 2020.

http://proceedings.mlr.press/v100/park20a.html

 

■ Researchers

Daehyung Park, Michael Noseworthy, Rohan Paul, Subhro Roy, Nicholas Roy

Healthcare Robotics Lab, Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, Atlanta, GA, USA

 

■ Abstract

Recovering an unknown reward function for complex manipulation tasks is the fundamental problem of Inverse Reinforcement Learning (IRL). Often, the recovered reward function fails to explicitly capture implicit constraints (e.g., axis alignment, force, or relative alignment) between the manipulator, the objects of interaction, and other entities in the workspace. The standard IRL approaches do not model the presence of locally-consistent constraints that may be active only in a section of a demonstration. This work introduces Constraint-based Bayesian Nonparametric Inverse Reinforcement Learning (CBN-IRL), which models the observed behaviour as a sequence of subtasks, each consisting of a goal and a set of locally-active constraints. CBN-IRL infers locally-active constraints given a single demonstration by identifying potential constraints and their activation space. Further, the nonparametric prior over subgoals constituting the task allows the model to adapt to the complexity of the demonstration. The inferred goals and constraints are then used to recover a control policy via constrained optimization. We evaluate the proposed model in simulated navigation and manipulation domains. CBN-IRL efficiently learns a compact representation for complex tasks that allows generalization to novel environments, outperforming state-of-the-art IRL methods. Finally, we demonstrate the model on two tool-manipulation tasks using a UR5 manipulator and show generalization to novel test scenarios.
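
The representation described in the abstract (a sequence of subtasks, each with a goal and a set of locally-active constraints, from which a policy is recovered by constrained optimization) can be illustrated with a toy sketch. This is not the authors' implementation and it does not perform the Bayesian nonparametric inference: the subtasks are assumed to be already inferred, and the names `Subtask` and `rollout`, the 2D grid world, and the greedy one-step constrained search are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

State = Tuple[int, int]
# A constraint returns True when a state satisfies it (hypothetical interface).
Constraint = Callable[[State], bool]


@dataclass
class Subtask:
    """One segment of a task: a goal plus the constraints active only in this segment."""
    goal: State
    constraints: List[Constraint] = field(default_factory=list)


def rollout(start: State, subtasks: List[Subtask], max_steps: int = 100) -> List[State]:
    """Greedy stand-in for constrained optimization on a grid.

    For each subtask, repeatedly move to the neighbouring cell that satisfies
    every active constraint and is closest (Manhattan distance) to the goal.
    """
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    state = start
    path = [state]
    for sub in subtasks:
        for _ in range(max_steps):
            if state == sub.goal:
                break
            candidates = [(state[0] + dx, state[1] + dy) for dx, dy in moves]
            feasible = [s for s in candidates if all(c(s) for c in sub.constraints)]
            if not feasible:
                raise RuntimeError("no neighbouring state satisfies the active constraints")
            state = min(
                feasible,
                key=lambda s: abs(s[0] - sub.goal[0]) + abs(s[1] - sub.goal[1]),
            )
            path.append(state)
    return path


# Two subtasks with different locally-active axis-alignment constraints.
task = [
    Subtask(goal=(3, 0), constraints=[lambda s: s[1] == 0]),  # stay on the line y = 0
    Subtask(goal=(3, 2), constraints=[lambda s: s[0] == 3]),  # stay on the line x = 3
]
path = rollout((0, 0), task)
print(path)
```

Because each constraint is attached to a single subtask, the y = 0 restriction stops applying once the first goal is reached, which is the sense in which the constraints are locally active. Changing the start state or the goals reuses the same compact description without any new demonstration data in this toy setting.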

 
