Reinforcement Planning: RL for Optimal Planners

Matthew Zucker and J. Andrew (Drew) Bagnell
tech. report CMU-RI-TR-10-14, Robotics Institute, Carnegie Mellon University, May, 2010



Abstract
Search-based planners such as A* and Dijkstra's algorithm are proven methods for guiding today's robotic systems. Although such planners are typically based upon a coarse approximation of reality, they are nonetheless valuable for their ability to reason about the future and to generalize to previously unseen scenarios. However, encoding the desired behavior of a system into the underlying cost function used by the planner can be a tedious and error-prone task. We introduce Reinforcement Planning, which extends gradient-based reinforcement learning algorithms to automatically learn useful cost functions for optimal planners. Reinforcement Planning offers several advantages over other learning approaches that involve planners: it is not limited by the expertise of a human demonstrator, and it recognizes that the domain of the planner is a simplified model of the world. We demonstrate the effectiveness of our method by learning to solve a noisy physical simulation of the well-known "marble maze" toy.
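To make the idea concrete, the sketch below illustrates the general setting the abstract describes: a planner's cost function is a parameterized combination of features, plans are executed in a noisy world, and the parameters are tuned from the resulting returns. This is only an illustrative assumption of how such a loop might look, not the authors' algorithm (which derives gradients through the optimal planner itself); the grid, features, reward, and finite-difference update here are all hypothetical.

```python
# Illustrative sketch (not the paper's method): learn weights of a linear
# cost function used by a grid planner, via finite-difference updates on
# the return of noisy rollouts. All quantities below are made up.
import heapq
import numpy as np

rng = np.random.default_rng(0)
H, W = 10, 10
features = rng.random((H, W, 3))            # hypothetical per-cell features
start, goal = (0, 0), (H - 1, W - 1)

def plan(w):
    """Dijkstra on per-cell costs exp(features . w); returns a start->goal path."""
    cost = np.exp(features @ w)
    dist, parent = {start: 0.0}, {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal or d > dist[u]:
            if u == goal:
                break
            continue
        r, c = u
        for v in [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]:
            if 0 <= v[0] < H and 0 <= v[1] < W:
                nd = d + cost[v]
                if nd < dist.get(v, np.inf):
                    dist[v], parent[v] = nd, u
                    heapq.heappush(pq, (nd, v))
    path, u = [goal], goal
    while u != start:
        u = parent[u]
        path.append(u)
    return path[::-1]

def rollout_return(w):
    """Execute the planned path in a crude noisy 'world'; higher is better."""
    path = plan(w)
    true_hazard = features[..., 0]          # pretend feature 0 is the true hazard
    noise = rng.normal(0.0, 0.05, size=len(path)).sum()
    return -sum(true_hazard[p] for p in path) - 0.1 * len(path) + noise

w, eps, lr = np.zeros(3), 0.1, 0.5
for it in range(200):
    # Finite-difference estimate of the gradient of expected return w.r.t. w.
    g = np.zeros_like(w)
    for i in range(len(w)):
        dw = np.zeros_like(w)
        dw[i] = eps
        g[i] = (rollout_return(w + dw) - rollout_return(w - dw)) / (2 * eps)
    w += lr * g
print("learned cost weights:", w)
```

The key design point the sketch tries to mirror is that learning happens in the loop "plan with the current cost function, execute in the (noisy) world, score the outcome", so the cost function is shaped by real performance rather than by hand-tuning or demonstration alone.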

Notes
Associated Center(s) / Consortia: Center for the Foundations of Robotics
Associated Lab(s) / Group(s): Planning and Autonomy Lab
Associated Project(s): Learning Locomotion

Text Reference
Matthew Zucker and J. Andrew (Drew) Bagnell, "Reinforcement Planning: RL for Optimal Planners," tech. report CMU-RI-TR-10-14, Robotics Institute, Carnegie Mellon University, May, 2010

BibTeX Reference
@techreport{Zucker_2010_6573,
   author      = "Matthew Zucker and J. Andrew (Drew) Bagnell",
   title       = "Reinforcement Planning: RL for Optimal Planners",
   institution = "Robotics Institute, Carnegie Mellon University",
   month       = "May",
   year        = "2010",
   number      = "CMU-RI-TR-10-14",
   address     = "Pittsburgh, PA",
}