Using Finite-Differences methods for approximating the value function of continuous Reinforcement Learning problems

Remi Munos
International Symposium on Multi-Technology Information Processing 1996, 1996.


Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract
This paper presents a reinforcement learning method for solving continuous optimal control problems when the dynamics of the system are unknown. First, we use a Finite Differences method to discretize the Hamilton-Jacobi-Bellman equation, obtaining a finite Markov Decision Process. This permits us to approximate the value function of the continuous problem with piecewise constant functions defined on a grid. Then we propose to solve this MDP on-line, using the available knowledge, with a direct and convergent reinforcement learning algorithm called Finite-Differences Reinforcement Learning.
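To make the discretization idea concrete, the sketch below shows how a finite-difference scheme turns a toy continuous control problem into a finite MDP that can then be solved by dynamic programming. The specific problem (a 1D minimum-time task with dynamics dx/dt = u), the grid spacing, and the discount rate are illustrative assumptions, not values from the paper; solving the resulting MDP on-line by reinforcement learning, as the paper proposes, would replace the value-iteration loop with sample-based updates.

```python
import numpy as np

# Hypothetical 1D example: reach the goal x = 1 on the interval [0, 1]
# with dynamics dx/dt = u, u in {-1, +1}. The finite-difference scheme
# replaces the HJB equation by a finite MDP on a grid: each action moves
# to a neighboring grid point, taking time tau = h / |u|, and the
# continuous discount rate `delta` becomes a per-step factor exp(-delta*tau).

h = 0.05                          # grid spacing (assumed)
xs = np.arange(0.0, 1.0 + h, h)   # grid points 0, 0.05, ..., 1.0
n = len(xs)
delta = 0.1                       # discount rate of the continuous problem
gamma = np.exp(-delta * h)        # per-step discount (|u| = 1, so tau = h)
cost = h                          # unit running cost integrated over one step
goal = n - 1                      # terminal state x = 1, value 0

V = np.zeros(n)                   # piecewise-constant value approximation

for _ in range(1000):             # value iteration on the discretized MDP
    V_new = V.copy()
    for i in range(n):
        if i == goal:
            continue              # terminal state keeps value 0
        candidates = []
        if i + 1 < n:             # action u = +1 moves one cell right
            candidates.append(cost + gamma * V[i + 1])
        if i - 1 >= 0:            # action u = -1 moves one cell left
            candidates.append(cost + gamma * V[i - 1])
        V_new[i] = min(candidates)
    if np.max(np.abs(V_new - V)) < 1e-12:
        V = V_new
        break
    V = V_new

# V[i] approximates the discounted time-to-goal from xs[i]; it decreases
# monotonically toward the goal state, as expected.
```

As the grid spacing h shrinks, the solution of this MDP converges to the value function of the continuous problem, which is what justifies the piecewise-constant approximation on the grid.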

Notes
Associated Lab(s) / Group(s): Auton Lab
Associated Project(s): Auton Project
Number of pages: 6

Text Reference
Remi Munos, "Using Finite-Differences methods for approximating the value function of continuous Reinforcement Learning problems," International Symposium on Multi-Technology Information Processing 1996, 1996.

BibTeX Reference
@inproceedings{Munos_1996_2946,
   author = "Remi Munos",
   title = "Using Finite-Differences methods for approximating the value function of continuous Reinforcement Learning problems",
   booktitle = "International Symposium on Multi-Technology Information Processing 1996",
   year = "1996",
}