Issues in Using Function Approximation for Reinforcement Learning

Sebastian Thrun and A. Schwartz
Conference Paper, Proceedings of the 4th Connectionist Models Summer School, June 1993

Abstract

Reinforcement learning techniques address the problem of learning to select actions in unknown, dynamic environments. It is widely acknowledged that, to be of use in complex domains, reinforcement learning techniques must be combined with generalizing function approximation methods such as artificial neural networks. Little, however, is understood about the theoretical properties of such combinations, and many researchers have encountered failures in practice. In this paper we identify a prime source of such failures, namely a systematic overestimation of utility values. Using Watkins’ Q-Learning [18] as an example, we give a theoretical account of the phenomenon, deriving conditions under which one may expect it to cause learning to fail. Employing some of the most popular function approximators, we present experimental results which support the theoretical findings.
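The following is a minimal sketch, not taken from the paper, of the statistical mechanism behind the overestimation the abstract describes: if approximate Q-values carry zero-mean estimation error, the max operator used in the Q-learning target is biased upward, since the expected maximum of several noisy estimates exceeds the maximum of the true values. The number of actions, noise distribution, and sample counts below are arbitrary assumptions chosen only for illustration.

```python
# Sketch: upward bias of max over noisy Q-value estimates.
# Assumes all actions in a state have the same true value (0.0), so the
# true max_a Q(s,a) is 0.0, while the averaged max of noisy estimates is not.

import random

NUM_ACTIONS = 10        # hypothetical number of actions in one state
TRUE_Q = 0.0            # identical true value for every action
NOISE_SCALE = 0.1       # zero-mean approximation error on each estimate
NUM_TRIALS = 100_000    # independent draws of the noise

def noisy_estimate():
    """One approximate Q-value: true value plus zero-mean uniform noise."""
    return TRUE_Q + random.uniform(-NOISE_SCALE, NOISE_SCALE)

# Average of max_a Qhat(s,a) over many independent trials.
avg_max = sum(
    max(noisy_estimate() for _ in range(NUM_ACTIONS))
    for _ in range(NUM_TRIALS)
) / NUM_TRIALS

print(f"true max_a Q(s,a)          = {TRUE_Q:.4f}")
print(f"mean of max_a Qhat(s,a)    = {avg_max:.4f}  (systematically too high)")
```

Because this biased maximum is fed back through the Q-learning update as part of the target, the overestimation can accumulate over bootstrapped backups, which is the failure mode the paper analyzes.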

BibTeX

@conference{Thrun-1993-15908,
author = {Sebastian Thrun and A. Schwartz},
title = {Issues in Using Function Approximation for Reinforcement Learning},
booktitle = {Proceedings of 4th Connectionist Models Summer School},
year = {1993},
month = {June},
editor = {M. Mozer and P. Smolensky and D. Touretzky and J. Elman and A. Weigend},
publisher = {Erlbaum Associates},
}