Learning to Search: Structured Prediction Techniques for Imitation Learning

Nathan Ratliff
doctoral dissertation, tech. report CMU-RI-TR-09-19, Robotics Institute, Carnegie Mellon University, May, 2009


Download
  • Adobe portable document format (pdf) (8MB)
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract
Modern robots successfully manipulate objects, navigate rugged terrain, drive in urban set- tings, and play world-class chess. Unfortunately, programming these robots is challenging, time- consuming and expensive; the parameters governing their behavior are often unintuitive, even when the desired behavior is clear and easily demonstrated. Inspired by successful end-to-end learning systems such as neural network controlled driving platforms (Pomerleau, 1989), learning-based “programming by demonstration” has gained currency as a method to achieve intelligent robot behavior. Unfortunately, with highly structured algorithms at their core, modern robotic systems are hard to train using classical learning techniques. Rather than redefining robot architectures to accommodate existing learning algorithms, this thesis develops learning techniques that leverage the performance of modern robotic components.

We begin with a discussion of a novel imitation learning framework we call Maximum Margin Planning which automates finding a cost function for optimal planning and control algorithms such as A*. In the linear setting, this framework has firm theoretical backing in the form of strong generalization and regret bounds. Further, we have developed practical nonlinear generalizations that are effective and efficient for real-world problems. This framework reduces imitation learning to a modern form of machine learning known as Maximum Margin Structured Classification (Taskar et al., 2005); these algorithms, therefore, apply both specifically to training existing state-of-the-art planners as well as broadly to solving a range of structured prediction problems of importance in learning and robotics.

In difficult high-dimensional planning domains, such as those found in many manipulation problems, high-performance planning technology remains a topic of much research. We close with some recent work which moves toward simultaneously advancing this technology while retaining the learnability developed above.

Throughout the thesis, we demonstrate our algorithms on a range of applications including overhead navigation, quadrupedal locomotion, heuristic learning, manipulation planning, grasp prediction, driver prediction, pedestrian prediction, optical character recognition, and LADAR classification.

Notes
Number of pages: 187

Text Reference
Nathan Ratliff, "Learning to Search: Structured Prediction Techniques for Imitation Learning ," doctoral dissertation, tech. report CMU-RI-TR-09-19, Robotics Institute, Carnegie Mellon University, May, 2009

BibTeX Reference
@phdthesis{Ratliff_2009_6546,
   author = "Nathan Ratliff",
   title = "Learning to Search: Structured Prediction Techniques for Imitation Learning ",
   booktitle = "",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "May",
   year = "2009",
   number= "CMU-RI-TR-09-19",
   address= "Pittsburgh, PA",
}