An integrated approach to hierarchy and abstraction for POMDPs

Joelle Pineau and Sebastian Thrun
tech. report CMU-RI-TR-02-21, Robotics Institute, Carnegie Mellon University, August, 2002



Abstract
This paper presents an algorithm for planning in structured partially observable Markov Decision Processes (POMDPs). The new algorithm, named PolCA (for Policy Contingent Abstraction), uses an action-based decomposition to partition complex POMDP problems into a hierarchy of smaller subproblems. Low-level subtasks are solved first, and their partial policies are used to model abstract actions in the context of higher-level subtasks. At all levels of the hierarchy, subtasks need only consider a reduced action, state, and observation space. The reduced action set is provided by a designer, whereas the reduced state and observation sets are discovered automatically on a subtask-per-subtask basis. This typically results in lower-level subtasks having few, but high-resolution, state/observation features, whereas high-level subtasks tend to have many, but low-resolution, state/observation features. This paper presents a detailed overview of PolCA in the context of a POMDP hierarchical planning and execution algorithm. It also includes theoretical results demonstrating that in the special case of fully observable MDPs, the algorithm converges to a recursively optimal solution. Experimental results included in the paper demonstrate the usefulness of the approach on a range of problems, and show favorable performance compared to competing function-approximation POMDP algorithms. Finally, the paper presents a real-world implementation and deployment of a robotic system which uses PolCA in the context of a high-level robot behavior control task.
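The following is a minimal sketch of the bottom-up planning pass the abstract describes: children of a designer-provided action hierarchy are solved first, each solved child is then exposed to its parent as a single abstract action, and each subtask plans only over its own reduced spaces. All names here (Subtask, plan_bottom_up, the placeholder solver and abstraction routines, and the example tasks) are hypothetical illustrations, not the authors' implementation.

    from dataclasses import dataclass, field
    from typing import Dict, List


    @dataclass
    class Subtask:
        """One node in the designer-provided action hierarchy."""
        name: str
        actions: List[str]                   # reduced action set given by the designer
        children: List["Subtask"] = field(default_factory=list)
        policy: Dict[str, str] = field(default_factory=dict)   # state -> action


    def reduce_state_and_observations(task: Subtask, full_states, full_obs):
        """Stand-in for the automatic, per-subtask state/observation abstraction.

        In PolCA this reduction is discovered automatically; this placeholder
        keeps everything. In practice low-level tasks end up with few,
        high-resolution features and high-level tasks with many coarse ones.
        """
        return full_states, full_obs


    def solve_subtask(task: Subtask, states, observations) -> Dict[str, str]:
        """Stand-in for a flat POMDP solver restricted to the subtask's spaces."""
        # A real solver would compute a (partial) policy over beliefs; here we
        # simply map every state to the subtask's first action as a placeholder.
        return {s: task.actions[0] for s in states}


    def plan_bottom_up(task: Subtask, states, observations) -> Dict[str, str]:
        """Solve children first; their policies act as abstract actions above."""
        for child in task.children:
            child.policy = plan_bottom_up(child, states, observations)
            # The solved child is visible to the parent as one abstract action.
            task.actions.append(f"do({child.name})")
        sub_states, sub_obs = reduce_state_and_observations(task, states, observations)
        task.policy = solve_subtask(task, sub_states, sub_obs)
        return task.policy


    if __name__ == "__main__":
        navigate = Subtask("navigate", actions=["move-forward", "turn"])
        dialogue = Subtask("dialogue", actions=["ask", "confirm"])
        root = Subtask("act", actions=[], children=[navigate, dialogue])
        plan_bottom_up(root, states=["s0", "s1"], observations=["o0", "o1"])
        print(root.actions)   # ['do(navigate)', 'do(dialogue)']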

Keywords
Markov decision process, POMDP, reinforcement learning, hierarchical planning, abstraction, robot control, dialogue systems

Notes
Number of pages: 50

Text Reference
Joelle Pineau and Sebastian Thrun, "An integrated approach to hierarchy and abstraction for POMDPs," tech. report CMU-RI-TR-02-21, Robotics Institute, Carnegie Mellon University, August, 2002

BibTeX Reference
@techreport{Pineau_2002_4059,
   author = "Joelle Pineau and Sebastian Thrun",
   title = "An integrated approach to hierarchy and abstraction for POMDPs",
   institution = "Robotics Institute, Carnegie Mellon University",
   number = "CMU-RI-TR-02-21",
   address = "Pittsburgh, PA",
   month = "August",
   year = "2002",
}