Carnegie Mellon Robotics Institute
doctoral dissertation, tech. report CMU-RI-TR-02-04, Robotics Institute, Carnegie Mellon University, February, 2002
This thesis addresses the problem of learning probabilistic representations of dynamical systems with non-linear dynamics and hidden state in the form of partially observable Markov decision process (POMDP) models, with the explicit purpose of using these models for robot control. In contrast to the usual approach to learning probabilistic models, which is based on iterative adjustment of probabilities so as to improve the likelihood of the observed data, the algorithms proposed in this thesis take a different approach: they reduce the learning problem to that of state aggregation by clustering in an embedding space of delayed coordinates, and subsequently estimating transition probabilities between aggregated states (clusters). This approach has close ties to the dominant methods for system identification in the field of control engineering, although the characteristics of POMDP models require very different algorithmic solutions.
Apart from an extensive investigation of the performance of the proposed algorithms in simulation, they are also applied to two robots built in the course of our experiments. The first one is a differential-drive mobile robot with a minimal number of proximity sensors, which has to perform the well-known robotic task of self-localization along the perimeter of its workspace. In comparison to previous neural-net based approaches to the same problem, our algorithm achieved much higher spatial accuracy of localization. The other task is visual servo-control of an under-actuated arm which has to rotate a flying ball attached to it so as to maintain maximal height of rotation with minimal energy expenditure. Even though this problem is intractable for known control engineering methods due to its strongly non-linear dynamics and partially observable state, a control policy obtained by means of policy iteration on a POMDP model learned by our state-aggregation algorithm performed better than several alternative open-loop and closed-loop controllers.
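The state-aggregation idea summarized in the abstract (embed the observation stream in delayed coordinates, cluster the embedded vectors, then count transitions between clusters) can be illustrated with a minimal sketch. This is not the thesis's algorithm: plain k-means stands in for whatever clustering the thesis uses, and the function names, the toy signal, and the action label "a0" are all assumptions made for illustration.

```python
from collections import defaultdict

def delay_embed(observations, window):
    """Stack `window` consecutive observations into one vector per time step."""
    return [tuple(observations[t - window + 1 : t + 1])
            for t in range(window - 1, len(observations))]

def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centers[c])))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means, seeded with the first k distinct points (assumption)."""
    centers = []
    for p in points:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in points]
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, [nearest(p, centers) for p in points]

def transition_probabilities(labels, actions, k):
    """Estimate P(next cluster | cluster, action) from transition counts."""
    counts = defaultdict(lambda: [[0] * k for _ in range(k)])
    for t in range(len(labels) - 1):
        counts[actions[t]][labels[t]][labels[t + 1]] += 1
    return {a: [[n / sum(row) if sum(row) else 0.0 for n in row] for row in table]
            for a, table in counts.items()}

# Toy signal: the observation 0 is ambiguous (rising or falling phase),
# so a single reading does not reveal the hidden state.
observations = [0, 1, 0, -1] * 25
actions = ["a0"] * len(observations)   # hypothetical single action
window = 2

points = delay_embed(observations, window)
centers, labels = kmeans(points, k=4)
# Align actions with the embedded time steps (the first window-1 steps are dropped).
probs = transition_probabilities(labels, actions[window - 1:], len(centers))
```

In this toy case the two-step window separates the rising zero, `(0, 1)`, from the falling zero, `(0, -1)`, so the four clusters line up with the four phases of the cycle and the estimated transition matrix for "a0" is deterministic. Real sensor data would need a longer window, a principled choice of the number of clusters, and per-action statistics from many trajectories.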
Associated Center(s) / Consortia:
Vision and Autonomous Systems Center
Daniel Nikovski, "State-Aggregation Algorithms for Learning Probabilistic Models for Robot Control," doctoral dissertation, tech. report CMU-RI-TR-02-04, Robotics Institute, Carnegie Mellon University, February, 2002
author = "Daniel Nikovski",
title = "State-Aggregation Algorithms for Learning Probabilistic Models for Robot Control",
booktitle = "",
school = "Robotics Institute, Carnegie Mellon University",
month = "February",
year = "2002",
address= "Pittsburgh, PA",