Enabling Robots to Communicate their Objectives

Sandy H. Huang, David Held, Pieter Abbeel, and Anca D. Dragan

Conference Paper, Proceedings of Robotics: Science and Systems (RSS '17), July, 2017

View Publication

Abstract

Our ultimate goal is to efficiently enable end-users to correctly anticipate a robot's behavior in novel situations. This behavior is often a direct result of the robot's underlying objective function. Our insight is that end-users need to have an accurate mental model of this objective function in order to understand and predict what the robot will do. While people naturally develop such a mental model over time through observing the robot act, this familiarization process may be lengthy. Our approach reduces this time by having the robot model how people infer objectives from observed behavior, and then selecting those behaviors that are maximally informative. The problem of computing a posterior over objectives from observed behavior is known as Inverse Reinforcement Learning (IRL), and has been applied to robots learning human objectives. We consider the problem where the roles of human and robot are swapped. Our main contribution is to recognize that unlike robots, humans will not be \emph{exact} in their IRL inference. We thus introduce two factors to define candidate approximate-inference models for human learning in this setting, and analyze them in a user study in the autonomous driving domain. We show that certain approximate-inference models lead to the robot generating example behaviors that better enable users to anticipate what the robot will do in test situations. Our results also suggest, however, that additional research is needed in modeling how humans extrapolate from examples of robot behavior.

BibTeX

@conference{Held-2017-102827,
author = {Sandy H. Huang and David Held and Pieter Abbeel and Anca D. Dragan},
title = {Enabling Robots to Communicate their Objectives},
booktitle = {Proceedings of Robotics: Science and Systems (RSS '17)},
year = {2017},
month = {July},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.