
An integrated approach to hierarchy and abstraction for POMDPs

Joelle Pineau and Sebastian Thrun
Tech. Report, CMU-RI-TR-02-21, Robotics Institute, Carnegie Mellon University, August, 2002

Abstract

This paper presents an algorithm for planning in structured partially observable Markov decision processes (POMDPs). The new algorithm, named PolCA (for Policy Contingent Abstraction), uses an action-based decomposition to partition complex POMDP problems into a hierarchy of smaller subproblems. Low-level subtasks are solved first, and their partial policies are used to model abstract actions in the context of higher-level subtasks. At every level of the hierarchy, each subtask need only consider a reduced action, state, and observation space. The reduced action set is provided by a designer, whereas the reduced state and observation sets are discovered automatically on a subtask-by-subtask basis. This typically results in low-level subtasks having few, but high-resolution, state/observation features, whereas high-level subtasks tend to have many, but low-resolution, state/observation features. The paper gives a detailed overview of PolCA in the context of a POMDP hierarchical planning and execution algorithm. It also includes theoretical results showing that, in the special case of fully observable MDPs, the algorithm converges to a recursively optimal solution. Experimental results demonstrate the usefulness of the approach on a range of problems and show favorable performance compared to competing function-approximation POMDP algorithms. Finally, the paper presents a real-world implementation and deployment of a robotic system that uses PolCA for a high-level robot behavior control task.
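As a rough illustration of the decomposition described in the abstract, the following Python sketch shows the bottom-up structure: low-level subtasks are solved first, and each solved child is then available to its parent as an abstract action. This is only a structural sketch under stated assumptions, not the authors' implementation; the Subtask class, the solve_hierarchy function, and the placeholder solver passed in at the end are hypothetical names introduced here for illustration.

    from dataclasses import dataclass, field
    from typing import Callable, List, Optional

    @dataclass
    class Subtask:
        """One node of an action-based task hierarchy (illustrative only)."""
        name: str
        actions: List[str]                      # designer-provided reduced action set
        children: List["Subtask"] = field(default_factory=list)
        policy: Optional[Callable] = None       # belief -> action, filled in bottom-up

    def solve_hierarchy(task: Subtask,
                        solve_subtask: Callable[[Subtask], Callable]) -> Subtask:
        """Solve low-level subtasks first; their partial policies then serve as
        abstract actions when the parent subtask is solved."""
        for child in task.children:
            solve_hierarchy(child, solve_subtask)
        # solve_subtask would plan over the subtask's reduced action set
        # (primitive actions plus already-solved children treated as abstract
        # actions) and its automatically reduced state/observation space.
        task.policy = solve_subtask(task)
        return task

    # Illustrative two-level hierarchy for a robot control domain, with a
    # trivial placeholder standing in for an actual POMDP solver.
    navigate = Subtask("navigate", actions=["move_forward", "turn_left", "turn_right"])
    communicate = Subtask("communicate", actions=["ask", "confirm", "inform"])
    root = Subtask("act", actions=[], children=[navigate, communicate])
    solve_hierarchy(root, lambda t: lambda belief: (t.actions or [c.name for c in t.children])[0])

In this toy run, the leaf subtasks ("navigate", "communicate") receive policies over their own small action sets, and the root's "policy" chooses among its children as abstract actions; a real solver would of course plan over beliefs rather than return a fixed action.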

BibTeX

@techreport{Pineau-2002-8519,
author = {Joelle Pineau and Sebastian Thrun},
title = {An integrated approach to hierarchy and abstraction for POMDPs},
year = {2002},
month = {August},
institution = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-02-21},
keywords = {Markov decision process, POMDP, reinforcement learning, hierarchical planning, abstraction, robot control, dialogue systems},
}