Reinforcement Learning in Robotics: A Survey

J. Kober, J. Andrew (Drew) Bagnell and J. Peters
Journal Article, Carnegie Mellon University, International Journal of Robotics Research, July, 2013

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Reinforcement learning offers to robotics a framework and set of tools for the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide inspiration, impact, and validation for developments in reinforcement learning. The relationship between disciplines has sufficient promise to be likened to that between physics and mathematics. In this article, we attempt to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots. We highlight both key challenges in robot reinforcement learning and notable successes. We discuss how contributions tamed the complexity of the domain and study the role of algorithms, representations, and prior knowledge in achieving these successes. As a result, a particular focus of our paper lies on the choice between model-based and model-free methods, as well as between value-function-based and policy-search methods. By analyzing a simple problem in some detail, we demonstrate how reinforcement learning approaches may be profitably applied, and we note throughout open questions and the tremendous potential for future research.


author = {J. Kober and J. Andrew (Drew) Bagnell and J. Peters},
title = {Reinforcement Learning in Robotics: A Survey},
journal = {International Journal of Robotics Research},
year = {2013},
month = {July},
keywords = {reinforcement learning, learning control, robot, survey},
}