Online Fitted Reinforcement Learning

Geoffrey Gordon
VFA workshop at ML-95, 1995.



Abstract
My paper in the main portion of the conference deals with fitted value iteration or Q-learning for offline problems, i.e., those where we have a model of the environment so that we can examine arbitrary transitions in arbitrary order. The same techniques also allow us to do Q-learning for an online problem, i.e., one where we have no model but must instead perform experiments inside the MDP to gather data. I will describe how.
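The online setting the abstract describes can be sketched as follows: the agent gathers transitions by acting in the MDP (no model access), then repeatedly refits a Q-function by regressing one-step Bellman backups onto a feature space. This is a generic illustration, not the paper's specific method; the toy chain MDP, the one-hot features (a trivial "averager", under which least squares reduces to per-pair averaging), and the warm-up pass that samples each state-action pair once are all assumptions made for the sketch.

```python
import random
import numpy as np

# Toy deterministic chain MDP (not from the paper): states 0..4,
# actions 0 = left, 1 = right; reward 1 on reaching state 4 (terminal).
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    """One environment transition; the agent never inspects this model."""
    s2 = min(max(s + (1 if a == 1 else -1), 0), GOAL)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def featurize(s, a):
    # One-hot (state, action) features: with these, least-squares
    # fitting reduces to per-pair averaging -- a trivial "averager".
    x = np.zeros(N_STATES * N_ACTIONS)
    x[s * N_ACTIONS + a] = 1.0
    return x

def fitted_q(buffer, n_iters=50, gamma=0.9):
    """Fitted Q-iteration: regress one-step Bellman backups onto features."""
    w = np.zeros(N_STATES * N_ACTIONS)
    X = np.array([featurize(s, a) for s, a, _, _, _ in buffer])
    for _ in range(n_iters):
        y = np.array([r + (0.0 if done else
                           gamma * max(w @ featurize(s2, b)
                                       for b in range(N_ACTIONS)))
                      for _, _, r, s2, done in buffer])
        w = np.linalg.lstsq(X, y, rcond=None)[0]
    return w

def train(episodes=20, eps=0.2, seed=0):
    rng = random.Random(seed)
    # Warm-up: one sampled transition per (state, action) pair, assuming
    # episodes may start anywhere -- a simplification for this sketch.
    buffer = []
    for s in range(GOAL):
        for a in range(N_ACTIONS):
            s2, r, done = step(s, a)
            buffer.append((s, a, r, s2, done))
    w = fitted_q(buffer)
    for _ in range(episodes):
        s = 0
        for _ in range(20):
            # Epsilon-greedy action choice inside the MDP (online phase).
            a = (rng.randrange(N_ACTIONS) if rng.random() < eps else
                 max(range(N_ACTIONS), key=lambda b: w @ featurize(s, b)))
            s2, r, done = step(s, a)
            buffer.append((s, a, r, s2, done))
            s = s2
            if done:
                break
        w = fitted_q(buffer)  # refit on all data gathered so far
    return w

w = train()
policy = [max(range(N_ACTIONS), key=lambda b: w @ featurize(s, b))
          for s in range(GOAL)]
print(policy)  # → [1, 1, 1, 1]: the greedy policy heads right to the goal
```

Interleaving experience gathering with refitting is what distinguishes this online variant from the offline case, where a model would let us evaluate arbitrary transitions in arbitrary order instead.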

Notes
Associated Lab(s) / Group(s): Auton Lab
Associated Project(s): Auton Project

Text Reference
Geoffrey Gordon, "Online Fitted Reinforcement Learning," VFA workshop at ML-95, 1995.

BibTeX Reference
@inproceedings{Gordon_1995_2892,
   author = "Geoffrey Gordon",
   title = "Online Fitted Reinforcement Learning",
   booktitle = "VFA workshop at ML-95",
   year = "1995",
}