Q-Learning in Continuous State and Action Spaces

C. Gaskett, D. Wettergreen, and A. Zelinsky
Conference Paper, Proceedings of the 12th Australasian Joint Conference on Artificial Intelligence (AI '99), pp. 417-428, December 1999

Abstract

Q-learning can be used to learn a control policy that maximises a scalar reward through interaction with the environment. Q-learning is commonly applied to problems with discrete states and actions. We describe a method suitable for control tasks that require continuous actions in response to continuous states. The system consists of a neural network coupled with a novel interpolator. Simulation results are presented for a non-holonomic control task. Advantage Learning, a variation of Q-learning, is shown to enhance learning speed and reliability for this task.
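For background on the two learning rules named in the abstract, the sketch below shows standard one-step Q-learning and the Advantage Learning variant (Harmon and Baird) on a toy discrete problem. It is an illustrative sketch only: the sizes, hyperparameters, and tabular representation are assumptions, and the paper's actual contribution, a neural network coupled with an interpolator for continuous states and actions, is not reproduced here.

import numpy as np

# Toy sizes and hyperparameters (illustrative assumptions, not from the paper).
n_states, n_actions = 5, 3
alpha, gamma = 0.1, 0.99   # learning rate, discount factor
k = 0.3                    # Advantage Learning scaling constant (0 < k <= 1)

Q = np.zeros((n_states, n_actions))
A = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    """Standard Q-learning: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

def advantage_update(s, a, r, s_next):
    """Advantage Learning: the temporal-difference term is rescaled by 1/k,
    widening the value gap between the best action and the rest, which can
    make the greedy policy easier to extract from an approximate model."""
    v_s = A[s].max()
    v_next = A[s_next].max()
    target = v_s + (r + gamma * v_next - v_s) / k
    A[s, a] += alpha * (target - A[s, a])

As k shrinks, the rescaled update magnifies the difference between the chosen action's value and the state's best value, which is one intuition for why Advantage Learning can improve speed and reliability when values are represented approximately, as with the neural network in the paper.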

BibTeX

@conference{Gaskett-1999-120376,
author = {C. Gaskett and D. Wettergreen and A. Zelinsky},
title = {Q-Learning in Continuous State and Action Spaces},
booktitle = {Proceedings of 12th Australasian Joint Conference on Artificial Intelligence (AI '99)},
year = {1999},
month = {December},
pages = {417--428},
}