Finding Approximate POMDP Solutions Through Belief Compression

Nicholas Roy
doctoral dissertation, tech. report CMU-RI-TR-03-25, Robotics Institute, Carnegie Mellon University, October, 2003



Abstract
Recent research in the field of robotics has demonstrated the utility of probabilistic models for perception and state tracking on deployed robot systems. For example, Kalman filters and Markov localisation have been used successfully in many robot applications (Leonard & Durrant-Whyte, 1991; Thrun et al., 2000). There has also been considerable research into control and decision-making algorithms that are robust in the face of specific kinds of uncertainty (Bagnell & Schneider, 2001). Few control algorithms, however, make use of full probabilistic representations throughout planning. As a consequence, robot control can become increasingly brittle as the system's perceptual and state uncertainty increase.

This thesis addresses the problem of decision making under uncertainty. In particular, we use a planning model called the partially observable Markov decision process, or POMDP (Sondik, 1971). The POMDP model computes a policy that maximises the expected future reward based on the complete probabilistic state estimate, or belief. Unfortunately, finding an optimal policy exactly is computationally demanding and thus infeasible for most problems that represent real-world scenarios.
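The belief mentioned above is a probability distribution over states, maintained with a Bayes filter: predict through the transition model after each action, then reweight by the observation likelihood. A minimal sketch for a discrete POMDP, with made-up two-state transition and observation matrices (all names and numbers here are illustrative, not from the thesis):

```python
import numpy as np

def belief_update(b, a, z, T, O):
    """One Bayes-filter step for a discrete POMDP.

    b : belief over states, shape (S,)
    a : action index
    z : observation index
    T : transition model, T[a, s, s2] = P(s2 | s, a)
    O : observation model, O[a, s2, z] = P(z | s2, a)
    """
    predicted = T[a].T @ b             # prediction: sum_s P(s2|s,a) b(s)
    updated = O[a][:, z] * predicted   # correction: weight by P(z|s2,a)
    return updated / updated.sum()     # renormalise to a distribution

# Tiny two-state example with one action and two observations
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
O = np.array([[[0.7, 0.3],
               [0.1, 0.9]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, a=0, z=0, T=T, O=O)
```

A POMDP policy maps beliefs like `b1` to actions; the difficulty is that the belief simplex grows with the number of states, which is what motivates the compression described next.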

This thesis describes a scalable approach to POMDP planning that uses low-dimensional representations of the belief space. We demonstrate how to make use of a variant of Principal Components Analysis (PCA) called Exponential family PCA (Collins et al., 2002) in order to compress certain kinds of large real-world POMDPs, and find policies for these problems. By finding low-dimensional representations of POMDPs, we are able to find good policies for problems that are orders of magnitude larger than those solvable by conventional approaches.
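The core idea is that beliefs encountered in practice often lie near a low-dimensional surface in the belief simplex, so planning can happen in the compressed coordinates. As a rough illustration only, the sketch below uses plain linear PCA via SVD in place of the Exponential family PCA the thesis actually employs (E-PCA swaps the squared-error loss for an exponential-family loss so reconstructions remain valid distributions); the data here are synthetic:

```python
import numpy as np

def compress_beliefs(B, k):
    """Project sampled beliefs onto a k-dimensional linear subspace.

    Plain PCA stands in here for the Exponential family PCA used in
    the thesis; the structure (compress, plan, reconstruct) is the same.

    B : (n, S) matrix, each row a sampled belief over S states
    k : target dimensionality
    """
    mean = B.mean(axis=0)
    U, s, Vt = np.linalg.svd(B - mean, full_matrices=False)
    basis = Vt[:k]                     # top-k principal directions, (k, S)
    coords = (B - mean) @ basis.T      # low-dimensional coordinates, (n, k)
    recon = coords @ basis + mean      # reconstruction in belief space
    return coords, recon

# Synthetic beliefs: unimodal distributions over a 10-state line world,
# generated from a 1-parameter family, so they compress well
S, n = 10, 50
rng = np.random.default_rng(0)
centers = rng.uniform(0, S - 1, size=n)
grid = np.arange(S)
B = np.exp(-0.5 * (grid - centers[:, None]) ** 2)
B /= B.sum(axis=1, keepdims=True)

coords, recon = compress_beliefs(B, k=3)
```

Planning then proceeds in the `coords` space rather than the full `S`-dimensional simplex, which is what makes problems orders of magnitude larger tractable.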


Keywords
POMDPs, Robotics, Planning, Machine Learning

Notes
Associated Lab(s) / Group(s): Robot Learning Lab

Text Reference
Nicholas Roy, "Finding Approximate POMDP Solutions Through Belief Compression," doctoral dissertation, tech. report CMU-RI-TR-03-25, Robotics Institute, Carnegie Mellon University, October, 2003

BibTeX Reference
@phdthesis{Roy_2003_4488,
   author = "Nicholas Roy",
   title = "Finding Approximate {POMDP} Solutions Through Belief Compression",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "October",
   year = "2003",
   number = "CMU-RI-TR-03-25",
   address = "Pittsburgh, PA",
}