A Convergent Reinforcement Learning algorithm in the continuous case: the Finite-Element Reinforcement Learning

Remi Munos
Conference Paper, Proceedings of the International Conference on Machine Learning (ICML), pp. 337-345, July 1996

Abstract

This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcement Learning, in the continuous case, i.e., continuous state space and time. The evaluation of the value function enables the generation of an optimal policy for reinforcement control problems, such as target or obstacle problems, viability problems, or optimization problems. We propose a continuous formalism for the study of reinforcement learning within the continuous optimal control framework, and then state the associated Hamilton-Jacobi-Bellman equation. First, we propose to approximate the value function with a numerical scheme based on a finite-element method. This generates a discrete Markov Decision Process, with finite state and control spaces, which can be solved by Dynamic Programming. Computing this approximation scheme belongs, in reinforcement learning terminology, to the class of indirect learning methods. We then present our direct learning algorithm, which approximates the previous finite-element scheme, and prove its convergence to the value function of the continuous problem.
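
For context, in the discounted, infinite-horizon setting sketched in the abstract, with state dynamics dx/dt = f(x(t), u(t)) and instantaneous reward r(x, u), the Hamilton-Jacobi-Bellman equation satisfied by the value function V takes the standard textbook form (the paper's exact notation may differ):

\lambda V(x) = \sup_{u \in U} \{ r(x, u) + \nabla V(x) \cdot f(x, u) \},

where \lambda > 0 is the discount rate. The finite-element scheme replaces this continuous equation with a finite Markov Decision Process, which the indirect method then solves by dynamic programming. As a minimal illustration of that dynamic-programming step (not the paper's algorithm), the Python sketch below runs standard value iteration on a hypothetical discretized MDP; the transition matrices, rewards, and all names are illustrative placeholders.

import numpy as np

def value_iteration(P, R, gamma, tol=1e-8, max_iter=10_000):
    """Fixed-point iteration for V(s) = max_a { R[a][s] + gamma * (P[a] @ V)(s) }."""
    n_actions = len(P)
    V = np.zeros(R[0].shape[0])
    for _ in range(max_iter):
        # One Bellman backup: Q[a, s] = R[a][s] + gamma * expected next value.
        Q = np.stack([R[a] + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.max(axis=0)
        delta = np.max(np.abs(V_new - V))
        V = V_new
        if delta < tol:
            break
    return V, Q.argmax(axis=0)  # approximate values and a greedy policy

if __name__ == "__main__":
    # Hypothetical 5-state, 2-control MDP standing in for a discretized problem.
    rng = np.random.default_rng(0)
    n_states, n_controls = 5, 2
    P = [rng.dirichlet(np.ones(n_states), size=n_states) for _ in range(n_controls)]
    R = [rng.standard_normal(n_states) for _ in range(n_controls)]
    V, policy = value_iteration(P, R, gamma=0.9)
    print("V =", V)
    print("policy =", policy)

Since the Bellman backup is a gamma-contraction in the sup norm, this iteration converges geometrically whenever gamma < 1.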

BibTeX

@conference{Munos-1996-16322,
  author = {Remi Munos},
  title = {A Convergent Reinforcement Learning algorithm in the continuous case: the Finite-Element Reinforcement Learning},
  booktitle = {Proceedings of the International Conference on Machine Learning (ICML)},
  year = {1996},
  month = {July},
  pages = {337-345},
}