Maximum Margin Planning

Nathan Ratliff, J. Andrew (Drew) Bagnell, and Martin Zinkevich
International Conference on Machine Learning, July, 2006.


Imitation learning of sequential, goal-directed behavior by standard supervised techniques is often difficult. We frame learning such behaviors as a maximum margin structured prediction problem over a space of policies. In this approach, we learn mappings from features to costs so that an optimal policy in an MDP with these costs mimics the expert's behavior.
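One way to make this framing concrete is the regularized hinge-style objective below. It is only a sketch in cost-minimizing form, and the notation is an assumption introduced here rather than taken from this page: w is the weight vector, F_i the feature matrix of example i, mu_i the expert's state-action visitation frequencies, G_i the set of frequencies achievable by valid policies, l_i a loss vector that penalizes deviating from the expert, and lambda a regularization constant.

```latex
c(w) = \frac{1}{N} \sum_{i=1}^{N}
       \Big( w^{\top} F_i \mu_i
             - \min_{\mu \in \mathcal{G}_i} \big( w^{\top} F_i - l_i^{\top} \big) \mu \Big)
       + \frac{\lambda}{2} \lVert w \rVert^{2}
```

The first term inside the parentheses is the cost of the expert's behavior under the learned costs; the second is the cost of the best competing policy after it has been credited with its loss. The objective is small only when the expert is cheaper than every alternative by a margin that grows with the alternative's loss.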

Further, we demonstrate a simple, provably efficient approach to structured maximum margin learning, based on the subgradient method, that leverages existing fast algorithms for inference. Although the technique is general, it is particularly relevant where A* and dynamic programming approaches make learning policies tractable for problems beyond the limitations of a QP formulation. We demonstrate our approach applied to route planning for outdoor mobile robots, where the behavior a designer wishes a planner to execute is often clear, while specifying cost functions that engender this behavior is a much more difficult task.
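The subgradient idea can be illustrated on a toy grid. The following is a minimal sketch under stated assumptions, not the authors' implementation: the 3x5 grid, the two features (a bias and a hypothetical "rough terrain" flag), the per-cell loss of 0.5, the step size, the regularizer, and the projection that keeps costs positive are all choices made here for illustration. Dijkstra's algorithm stands in for the fast planner (the paper mentions A* and dynamic programming). Each step plans on loss-augmented costs, then moves the weights along the difference between the expert's feature counts and the planned path's feature counts.

```python
import heapq

ROWS, COLS = 3, 5
START, GOAL = (0, 0), (0, 4)
ROUGH = {(0, 1), (0, 2), (0, 3)}  # hypothetical rough-terrain cells
# Hypothetical expert demonstration: detour around the rough cells.
EXPERT = [(0, 0), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (0, 4)]

def features(cell):
    """Per-cell features: (bias, is_rough)."""
    return (1.0, 1.0 if cell in ROUGH else 0.0)

def plan(w, loss=None):
    """Dijkstra with cell cost w . f(cell) - loss(cell); returns the min-cost path."""
    def cost(cell):
        base = sum(wi * fi for wi, fi in zip(w, features(cell)))
        return base - (loss(cell) if loss else 0.0)
    dist, parent = {START: cost(START)}, {START: None}
    heap = [(dist[START], START)]
    while heap:
        d, cell = heapq.heappop(heap)
        if d > dist[cell]:
            continue  # stale queue entry
        if cell == GOAL:
            break
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS:
                nd = d + cost(nxt)
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt], parent[nxt] = nd, cell
                    heapq.heappush(heap, (nd, nxt))
    path, cell = [], GOAL
    while cell is not None:
        path.append(cell)
        cell = parent[cell]
    return path[::-1]

def feature_counts(path):
    """Sum of feature vectors over the cells of a path."""
    return [sum(features(c)[k] for c in path) for k in range(2)]

def train(steps=50, lr=0.1, lam=0.1):
    """Projected subgradient descent on the margin objective (toy settings)."""
    w = [1.0, 0.0]  # start with uniform cost, no penalty for rough cells
    expert_cells = set(EXPERT)
    loss = lambda cell: 0.0 if cell in expert_cells else 0.5
    f_expert = feature_counts(EXPERT)
    for _ in range(steps):
        # Loss-augmented inference: off-expert cells look cheaper by their loss.
        f_aug = feature_counts(plan(w, loss))
        # Subgradient: expert features minus planned features, plus regularizer.
        g = [fe - fa + lam * wi for fe, fa, wi in zip(f_expert, f_aug, w)]
        w = [wi - lr * gi for wi, gi in zip(w, g)]
        # Assumed projection: bias >= 1 and rough weight >= 0, so every
        # loss-augmented cell cost stays positive and Dijkstra remains valid.
        w = [max(w[0], 1.0), max(w[1], 0.0)]
    return w
```

A short usage check: with the initial weights the planner cuts straight through the rough cells, and after training the plain (non-augmented) planner should follow the demonstrated detour.

```python
w = train()
print(plan([1.0, 0.0]))      # straight route along row 0
print(plan(w) == EXPERT)     # learned costs reproduce the expert route
```

The planner is the only component that touches the structure of the problem, so any exact shortest-path routine could replace Dijkstra here without changing the update.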

Keywords: Path Planning, Structured Classification, Maximum Margin, Online Learning, Convex Programming

Associated Center(s) / Consortia: Center for the Foundations of Robotics
Associated Lab(s) / Group(s): Planning and Autonomy Lab
Associated Project(s): Learning Locomotion and PeepPredict
Number of pages: 8

Text Reference
Nathan Ratliff, J. Andrew (Drew) Bagnell, and Martin Zinkevich, "Maximum Margin Planning," International Conference on Machine Learning, July, 2006.

BibTeX Reference
   author = "Nathan Ratliff and J. Andrew (Drew) Bagnell and Martin Zinkevich",
   title = "Maximum Margin Planning",
   booktitle = "International Conference on Machine Learning",
   month = "July",
   year = "2006",