State-Aggregation Algorithms for Learning Probabilistic Models for Robot Control
D. Nikovski
doctoral dissertation, tech. report CMU-RI-TR-02-04, Robotics Institute, Carnegie Mellon University, February, 2002.
Adobe portable document format (pdf) [984 KB]
Compressed postscript (ps.gz) [702 KB]
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
This thesis addresses the problem of learning probabilistic representations of dynamical systems with non-linear dynamics and hidden state in the form of partially observable Markov decision process (POMDP) models, with the explicit purpose of using these models for robot control. In contrast to the usual approach to learning probabilistic models, which is based on iterative adjustment of probabilities so as to improve the likelihood of the observed data, the algorithms proposed in this thesis take a different approach: they reduce the learning problem to that of state aggregation by clustering in an embedding space of delayed coordinates, and then estimate transition probabilities between the aggregated states (clusters). This approach has close ties to the dominant methods for system identification in the field of control engineering, although the characteristics of POMDP models require very different algorithmic solutions.
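The two-stage idea described above (cluster delayed-coordinate vectors, then count transitions between clusters) can be illustrated with a minimal sketch. This is not the thesis's algorithm: plain k-means is swapped in as a generic clusterer, and the function names (`delay_embed`, `kmeans`, `estimate_transitions`), the window length `d`, the cluster count `k`, the smoothing `prior`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def delay_embed(obs, actions, d):
    """Stack d consecutive observations into one delayed-coordinate vector.

    Row i holds obs[i], ..., obs[i + d - 1] and is paired with the action
    taken at the last time step of that window (hypothetical convention).
    """
    obs = np.asarray(obs, dtype=float)
    n = len(obs) - d + 1
    emb = np.stack([obs[i:i + d] for i in range(n)]).reshape(n, -1)
    return emb, np.asarray(actions)[d - 1:]

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means, used here only as a stand-in clusterer."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

def estimate_transitions(labels, acts, k, n_actions, prior=1e-3):
    """Estimate P[a, s, s'] by counting cluster-to-cluster transitions."""
    counts = np.full((n_actions, k, k), prior)  # small prior avoids 0/0 rows
    for s, a, s_next in zip(labels[:-1], acts[:-1], labels[1:]):
        counts[a, s, s_next] += 1.0
    return counts / counts.sum(axis=2, keepdims=True)

# Synthetic trajectory (an assumption, not data from the thesis).
rng = np.random.default_rng(0)
T = 500
actions = rng.integers(0, 2, size=T)
obs = np.sin(0.1 * np.arange(T)) + 0.05 * rng.standard_normal(T)

emb, acts = delay_embed(obs, actions, d=3)
labels, centers = kmeans(emb, k=8)
P = estimate_transitions(labels, acts, k=8, n_actions=2)
```

Each slice `P[a]` is a row-stochastic matrix over the aggregated states. How the thesis chooses the embedding dimension, the clustering method, and the observation model is not covered by this sketch.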
In addition to an extensive investigation of the performance of the proposed algorithms in simulation, the algorithms are applied to two robots built in the course of our experiments. The first is a differential-drive mobile robot with a minimal number of proximity sensors, which has to perform the well-known robotic task of self-localization along the perimeter of its workspace. In comparison to previous neural-net-based approaches to the same problem, our algorithm achieved much higher spatial accuracy of localization. The second task is visual servo-control of an under-actuated arm that has to rotate a flying ball attached to it so as to maintain maximal height of rotation with minimal energy expenditure. Even though this problem is intractable for known control engineering methods due to its strongly non-linear dynamics and partially observable state, a control policy obtained by means of policy iteration on a POMDP model learned by our state-aggregation algorithm performed better than several alternative open-loop and closed-loop controllers.
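For readers unfamiliar with policy iteration, the following is a minimal tabular sketch. It makes a strong simplifying assumption: the aggregated states are treated as directly observable, so the model is handled as a finite MDP, and the thesis's actual treatment of partial observability is not shown. The transition array `P[a, s, s']`, the reward array `R[a, s]`, the discount `gamma`, and the two-state toy problem are all hypothetical.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.95, max_iters=1000):
    """Tabular policy iteration on transitions P[a, s, s'] and rewards R[a, s]."""
    n_actions, n_states, _ = P.shape
    idx = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iters):
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[policy, idx]          # shape (n_states, n_states)
        R_pi = R[policy, idx]          # shape (n_states,)
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to Q[a, s].
        Q = R + gamma * (P @ V)        # shape (n_actions, n_states)
        new_policy = Q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V

# Toy problem (an assumption): action 0 stays, action 1 switches state;
# being in state 1 pays a reward of 1 regardless of the action.
P_toy = np.array([[[1.0, 0.0], [0.0, 1.0]],
                  [[0.0, 1.0], [1.0, 0.0]]])
R_toy = np.array([[0.0, 1.0],
                  [0.0, 1.0]])
policy, V = policy_iteration(P_toy, R_toy)
```

On this toy problem the resulting policy switches out of state 0 and stays in state 1. A transition array estimated by state aggregation could be passed in place of `P_toy`, subject to the observability simplification stated above.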
Associated center: VASC
D. Nikovski, State-Aggregation Algorithms for Learning Probabilistic Models for Robot Control, doctoral dissertation, tech. report CMU-RI-TR-02-04, Robotics Institute, Carnegie Mellon University, February, 2002.
@phdthesis{Nikovski_2002_3905,
author = "Daniel Nikovski",
title = "State-Aggregation Algorithms for Learning
Probabilistic Models for Robot Control",
school = "Robotics Institute, Carnegie Mellon University",
month = "February",
year = "2002",
address = "Pittsburgh, PA"
}