Online Learning to Approach a Person With No Regret
Ahn, Hyemin (Seoul National University)
United States | IEEE Robotics and Automation Letters
2017-07-20
Robot_kinematics, Service_robots, Face
Cited by 5
Date of Publication: 20 July 2017
Hyemin Ahn¹, Yoonseon Oh¹, Sungjoon Choi¹, Claire J. Tomlin², Songhwai Oh¹
¹ Department of Electrical and Computer Engineering and the Automation and Systems Research Institute, Seoul National University
² Department of Electrical Engineering and Computer Sciences, University of California, Berkeley
Each person has a different personal space and behaves differently when another person approaches. Based on this observation, we propose a novel method that learns how to approach a person comfortably, according to that person's preference, while avoiding uncomfortable encounters. We introduce a personal comfort field to capture each person's preference about an approaching object. The personal comfort field is based on existing theories in anthropology and is personalized for each user through repeated encounters. We propose an online method for learning a user's personal comfort field, i.e., personalized learning, based on the concept of the Gaussian process upper confidence bound (GP-UCB), and show that the proposed method has no regret asymptotically. The effectiveness of the proposed method has been extensively validated in simulation and real-world experiments. The results show that the proposed method gradually learns the personalized approaching behavior preferred by the user as the number of encounters increases.
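For readers unfamiliar with the terminology, the following block recalls the standard textbook form of the Gaussian process upper confidence bound rule and of asymptotic no regret. The symbols are assumptions introduced here for illustration and are not taken from the letter: f is the unknown comfort function, x_t the approach chosen at encounter t, \mu_{t-1} and \sigma_{t-1} the GP posterior mean and standard deviation, and \beta_t an exploration weight. The letter's exact formulation may differ.

```latex
% Generic GP-UCB selection rule (symbols are illustrative assumptions)
x_t = \arg\max_{x \in \mathcal{X}} \; \mu_{t-1}(x) + \sqrt{\beta_t}\, \sigma_{t-1}(x)

% Cumulative regret after T encounters, with x^* = \arg\max_{x \in \mathcal{X}} f(x)
R_T = \sum_{t=1}^{T} \bigl( f(x^*) - f(x_t) \bigr)

% "No regret asymptotically" means the average regret vanishes
\lim_{T \to \infty} \frac{R_T}{T} = 0
```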
In this letter, we have presented a method for generating an approaching trajectory based on each user's personal comfort field. The proposed method models the personal space as a personal comfort field, which consists of a general comfort field and a personalized comfort field. The general comfort field models the personal space of the general population, while the personalized comfort field models each user's specific preferences. To learn the personal comfort field, the proposed method uses the user's nonverbal cues, namely changes in position and orientation, which the robot can observe. The proposed method allows a robot to improve its behavior as it encounters the user more often while avoiding uncomfortable approaches. We have also shown that the proposed online learning algorithm has no regret, a desirable property for an online algorithm. The simulation and experimental results show quantitatively that the proposed method enables the robot to learn each user's personal comfort field through repeated encounters. While the scale of the user study is small, it suggests that there is a significant improvement in user satisfaction with the quality of the robot's approaching trajectory.
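The idea of a general prior refined by a personalized term learned online can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the one-dimensional approach angle, the cosine-shaped `general_comfort` prior, the additive combination of the two fields, the RBF kernel, the fixed `beta`, and the simulated user feedback are all hypothetical choices made here for the example.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def general_comfort(x):
    """Hypothetical population-level comfort over approach angle x (radians).

    Assumption: a frontal approach (x = 0) is the most comfortable.
    """
    return np.cos(x)


def gp_posterior(x_obs, y_obs, x_cand, noise=1e-2):
    """GP posterior mean and std of the personalized residual at x_cand."""
    if len(x_obs) == 0:
        # No data yet: zero-mean prior with unit variance.
        return np.zeros_like(x_cand), np.ones_like(x_cand)
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_cand)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    # Diagonal of K_s^T K^{-1} K_s, subtracted from the prior variance (1.0).
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def select_approach(x_obs, y_obs, x_cand, beta=4.0):
    """Pick the candidate with the highest upper confidence bound."""
    mean, std = gp_posterior(x_obs, y_obs, x_cand)
    # Assumed additive model: general field + learned personalized residual.
    ucb = general_comfort(x_cand) + mean + np.sqrt(beta) * std
    return x_cand[np.argmax(ucb)]


def simulate(n_encounters=30, seed=0):
    """Run repeated encounters against a simulated (hypothetical) user."""
    rng = np.random.default_rng(seed)
    x_cand = np.linspace(-np.pi / 2, np.pi / 2, 61)

    def true_comfort(x):
        # Hypothetical user who prefers approaches from about 0.6 rad.
        return np.cos(x - 0.6)

    x_obs, y_obs = np.array([]), np.array([])
    for _ in range(n_encounters):
        x = select_approach(x_obs, y_obs, x_cand)
        feedback = true_comfort(x) + 0.05 * rng.standard_normal()
        # The GP models only the deviation from the general prior.
        x_obs = np.append(x_obs, x)
        y_obs = np.append(y_obs, feedback - general_comfort(x))
    return x_obs


if __name__ == "__main__":
    chosen = simulate()
    print("first approach angle:", chosen[0])
    print("last approach angle:", chosen[-1])
```

In this sketch the first approach follows the general prior alone, since no personalized data exist yet. Later choices trade off the learned residual against the remaining uncertainty. In a real system the scalar feedback would have to be derived from the observed changes in the user's position and orientation, which this example does not model.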