Applying Online Search Techniques to Reinforcement Learning

Scott Davies, A. Y. Ng, and Andrew Moore
Conference Paper, Proceedings of the 15th National Conference on Artificial Intelligence (AAAI '98), pp. 753-760, July 1998

Abstract

In this paper, we describe methods for efficiently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boost the power of very approximate value functions discovered by traditional reinforcement learning techniques. We examine local searches, where the agent performs a finite-depth lookahead search, and global searches, where the agent performs a search for a trajectory all the way from the current state to a goal state. The key to the success of the local methods lies in taking a value function, which gives a rough solution to the hard problem of finding good trajectories from every single state, and combining that with online search, which then gives an accurate solution to the easier problem of finding a good trajectory specifically from the current state. The key to the success of the global methods lies in using aggressive state-space search techniques such as uniform-cost search and A*, tamed into a tractable form by exploiting neighborhood relations and trajectory constraints that arise from continuous-space dynamic control.
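Below is a minimal sketch, not taken from the paper, of the local finite-depth lookahead idea described in the abstract: a shallow search from the current state whose leaves are evaluated with a rough, previously learned value function. The names `step` (a deterministic one-step simulator returning a next state and reward), `actions` (a discrete action set), `V` (the approximate value function), `depth`, and `gamma` are all illustrative assumptions rather than the paper's actual interface.

```python
# Sketch of depth-limited lookahead on top of an approximate value function.
# All interfaces (step, actions, V, gamma) are assumed for illustration.

def lookahead_value(state, depth, step, actions, V, gamma=0.99):
    """Best backed-up value reachable from `state` within `depth` steps,
    using the approximate value function V at the leaves."""
    if depth == 0:
        return V(state)                      # fall back on the learned value function
    best = float("-inf")
    for a in actions:
        next_state, reward = step(state, a)  # one-step simulation of the dynamics
        value = reward + gamma * lookahead_value(next_state, depth - 1,
                                                 step, actions, V, gamma)
        best = max(best, value)
    return best

def lookahead_policy(state, depth, step, actions, V, gamma=0.99):
    """Act greedily with respect to the depth-limited backed-up values."""
    def score(a):
        next_state, reward = step(state, a)
        return reward + gamma * lookahead_value(next_state, depth - 1,
                                                step, actions, V, gamma)
    return max(actions, key=score)
```

The point of the sketch is the division of labor the abstract describes: the value function supplies a rough global estimate everywhere, while the online search refines the decision only at the current state, where accuracy actually matters.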

BibTeX

@conference{Davies-1998-16554,
author = {Scott Davies and A. Y. Ng and Andrew Moore},
title = {Applying Online Search Techniques to Reinforcement Learning},
booktitle = {Proceedings of the 15th National Conference on Artificial Intelligence (AAAI '98)},
year = {1998},
month = {July},
pages = {753--760},
}