Algorithms for Learning Markov Field Policies

Abdeslam Boularias, Oliver Kroemer, and Jan Peters
Conference Paper, Proceedings of (NeurIPS) Neural Information Processing Systems, Vol. 2, pp. 2177 - 2185, December, 2012

Abstract

We use a graphical model for representing policies in Markov Decision Processes. This new representation can easily incorporate domain knowledge in the form of a state similarity graph that loosely indicates which states are expected to have similar optimal actions. A bias is then introduced into the policy search process by sampling policies from a distribution that assigns high probabilities to policies that agree with the provided state similarity graph, i.e., smoother policies. This distribution corresponds to a Markov Random Field. We also present forward and inverse reinforcement learning algorithms for learning such policy distributions. We illustrate the advantage of the proposed approach on two problems: cart-balancing with swing-up, and teaching a robot to grasp unknown objects.
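To make the core idea concrete, here is a minimal, hypothetical sketch of sampling policies from a Markov Random Field that favors agreement between the actions of neighboring states in a similarity graph. The chain graph, the discrete action set, and the smoothing weight `lam` are illustrative assumptions, not details taken from the paper; the sampler is plain Gibbs sampling over a distribution proportional to `exp(lam * #agreeing edges)`.

```python
import math
import random

def gibbs_sample_policy(n_states, n_actions, edges, lam, n_sweeps, rng):
    """Gibbs-sample a deterministic policy pi: state -> action from the MRF
    P(pi) proportional to exp(lam * sum over edges (s,t) of [pi(s) == pi(t)]).
    Larger lam biases sampling toward smoother policies."""
    neighbors = {s: [] for s in range(n_states)}
    for s, t in edges:
        neighbors[s].append(t)
        neighbors[t].append(s)
    # Start from a uniformly random policy.
    pi = [rng.randrange(n_actions) for _ in range(n_states)]
    for _ in range(n_sweeps):
        for s in range(n_states):
            # The conditional of pi(s) given the rest depends only on
            # the actions at the neighbors of s (Markov property).
            weights = []
            for a in range(n_actions):
                agree = sum(1 for t in neighbors[s] if pi[t] == a)
                weights.append(math.exp(lam * agree))
            r = rng.random() * sum(weights)
            acc = 0.0
            for a, w in enumerate(weights):
                acc += w
                if r <= acc:
                    pi[s] = a
                    break
    return pi

rng = random.Random(0)
edges = [(s, s + 1) for s in range(19)]  # chain-shaped similarity graph
pi = gibbs_sample_policy(20, 3, edges, lam=3.0, n_sweeps=50, rng=rng)
agree = sum(1 for s, t in edges if pi[s] == pi[t])
print(f"{agree}/{len(edges)} neighboring state pairs share an action")
```

With a large smoothing weight, most sampled policies assign the same action to adjacent states; setting `lam = 0` recovers uniform sampling over all policies. The paper's actual algorithms additionally tie this prior to reward, which this sketch omits.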

BibTeX

@conference{Boularias-2012-112191,
author = {Abdeslam Boularias and Oliver Kroemer and Jan Peters},
title = {Algorithms for Learning Markov Field Policies},
booktitle = {Proceedings of (NeurIPS) Neural Information Processing Systems},
year = {2012},
month = {December},
volume = {2},
pages = {2177--2185},
}