Carnegie Mellon Robotics Institute
J. Andrew (Drew) Bagnell and Jeff Schneider
Proceedings of the International Joint Conference on Artificial Intelligence, August 2003.
| Abstract |
| We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geometric methods. This leads us to propose a natural metric on controller parameterization that results from considering the manifold of probability distributions over paths induced by a stochastic controller. Investigation of this approach leads to a covariant gradient ascent rule. Interesting properties of this rule are discussed, including its relation with actor-critic style reinforcement learning algorithms. The algorithms discussed here are computationally quite efficient and on some interesting problems lead to dramatic performance improvement over non-covariant rules. |
| Keywords |
| Reinforcement Learning, Reinforce, Natural gradient, covariant |
| Notes |
Associated Lab(s) / Group(s): Auton Lab
Associated Project(s): Auton Project |
| Text Reference |
| J. Andrew (Drew) Bagnell and Jeff Schneider, "Covariant Policy Search," Proceedings of the International Joint Conference on Artificial Intelligence, August 2003. |
| BibTeX Reference |
|
@inproceedings{Bagnell_2003_4486,
  author = "J. Andrew (Drew) Bagnell and Jeff Schneider",
  title = "Covariant Policy Search",
  booktitle = "Proceedings of the International Joint Conference on Artificial Intelligence",
  month = "August",
  year = "2003",
} |
| The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University. |