Active Reward Learning

Christian Daniel, Malte Viering, Jan Metz, Oliver Kroemer and Jan Peters
Conference Paper, Robotics: Science and Systems (RSS), January 2014

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.


While reward functions are an essential component of many robot learning methods, defining such functions remains a hard problem in many practical applications. For tasks such as grasping, no reliable success measures are available. Defining reward functions by hand requires extensive task knowledge and often leads to undesired emergent behavior. Instead, we propose to learn the reward function through active learning, querying human expert knowledge for a subset of the agent's rollouts. We introduce a framework in which a traditional learning algorithm interplays with the reward learning component, such that the evolution of the action learner guides the queries of the reward learner. We demonstrate results of our method on a robot grasping task and show that the learned reward function generalizes to a similar task.
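The core idea of the abstract — maintain a probabilistic reward model and query the human expert only for rollouts where that model is uncertain — can be illustrated with a minimal sketch. This is not the paper's implementation: the class and function names, the toy squared-exponential Gaussian-process model over rollout outcome features, and the fixed uncertainty threshold are all illustrative assumptions.

```python
import numpy as np

class GPRewardModel:
    """Toy Gaussian-process reward model over rollout outcome features.
    A hypothetical stand-in for the learned reward function; uses a
    squared-exponential kernel and unit prior variance."""
    def __init__(self, length_scale=1.0, noise=1e-2):
        self.ls, self.noise = length_scale, noise
        self.X, self.y = [], []

    def _k(self, A, B):
        # Squared-exponential kernel between two sets of feature vectors.
        d = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-0.5 * d / self.ls ** 2)

    def add(self, x, r):
        # Store an expert-labeled (outcome, reward) pair.
        self.X.append(x); self.y.append(r)

    def predict(self, x):
        """Return (mean, std) of the predicted reward for outcome x."""
        if not self.X:
            return 0.0, 1.0  # maximal uncertainty before any queries
        X, y = np.array(self.X), np.array(self.y)
        K = self._k(X, X) + self.noise * np.eye(len(X))
        k = self._k(np.atleast_2d(x), X)[0]
        mean = k @ np.linalg.solve(K, y)
        var = 1.0 - k @ np.linalg.solve(K, k)
        return float(mean), float(np.sqrt(max(var, 1e-12)))

def active_reward_loop(rollout_features, expert, model, query_threshold=0.5):
    """Query the expert only where the reward model is uncertain;
    otherwise fall back on the model's own prediction."""
    rewards, n_queries = [], 0
    for x in rollout_features:
        mean, std = model.predict(x)
        if std > query_threshold:      # uncertain -> ask the human
            r = expert(x)
            model.add(x, r)
            n_queries += 1
            rewards.append(r)
        else:                          # confident -> use the prediction
            rewards.append(mean)
    return rewards, n_queries
```

Run on a batch of rollout features, the loop labels the first rollout in each unfamiliar region via the expert and scores nearby rollouts from the model, so the number of human queries stays well below the number of rollouts.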

BibTeX Reference
@inproceedings{daniel2014active,
  author    = {Christian Daniel and Malte Viering and Jan Metz and Oliver Kroemer and Jan Peters},
  title     = {Active Reward Learning},
  booktitle = {Robotics: Science and Systems (RSS)},
  year      = {2014},
  month     = {January},
}