Complexity Analysis of Real-Time Reinforcement Learning

Sven Koenig and Reid Simmons
Conference Paper, Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI), pp. 99-105, January 1993

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


This paper analyzes the complexity of on-line reinforcement learning algorithms, namely asynchronous real-time versions of Q-learning and value-iteration, applied to the problem of reaching a goal state in deterministic domains. Previous work had concluded that, in many cases, tabula rasa reinforcement learning was exponential for such problems, or was tractable only if the learning algorithm was augmented. We show that, to the contrary, the algorithms are tractable with only a simple change in the task representation or initialization. We provide tight bounds on the worst-case complexity, and show how the complexity is even smaller if the reinforcement learning algorithms have initial knowledge of the topology of the state space or the domain has certain special properties. We also present a novel bi-directional Q-learning algorithm to find optimal paths from all states to a goal state and show that it is no more complex than the other algorithms.
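To make the setting concrete, here is a minimal sketch (not the authors' code) of asynchronous real-time Q-learning for reaching a goal state in a deterministic domain, using the action-penalty representation (reward -1 per action, undiscounted) with zero-initialized Q-values — the kind of initialization the paper shows makes the problem tractable. The 5x5 gridworld, the goal location, and the number of trials are illustrative assumptions.

```python
import random

random.seed(0)

W, H = 5, 5                      # grid dimensions (illustrative)
GOAL = (4, 4)
ACTIONS = {'N': (0, -1), 'S': (0, 1), 'E': (1, 0), 'W': (-1, 0)}

def step(state, action):
    """Deterministic transition: move if in bounds, else stay put."""
    dx, dy = ACTIONS[action]
    nx, ny = state[0] + dx, state[1] + dy
    return (nx, ny) if 0 <= nx < W and 0 <= ny < H else state

# Zero initialization: admissible (optimistic), since true values are negative
# under the action-penalty representation.
Q = {((x, y), a): 0.0 for x in range(W) for y in range(H) for a in ACTIONS}

def trial(start):
    """One run from start to GOAL; returns the number of steps taken."""
    state, steps = start, 0
    while state != GOAL:
        # Greedy action selection, ties broken at random.
        best = max(Q[(state, a)] for a in ACTIONS)
        action = random.choice([a for a in ACTIONS if Q[(state, a)] == best])
        nxt = step(state, action)
        # One-step backup: every action costs 1; the goal is absorbing
        # with value 0 (learning rate 1 suffices in deterministic domains).
        future = 0.0 if nxt == GOAL else max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] = -1.0 + future
        state, steps = nxt, steps + 1
    return steps

lengths = [trial((0, 0)) for _ in range(25)]
print(lengths)  # trial lengths shrink toward the optimal 8 steps over repeated trials
```

Because the zero initialization is optimistic, every backup preserves admissibility (each Q-value stays above the true negative cost-to-go), which is what rules out the exponential worst case that tabula rasa learning suffers under the goal-reward representation.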

@inproceedings{koenig1993complexity,
  author    = {Sven Koenig and Reid Simmons},
  title     = {Complexity Analysis of Real-Time Reinforcement Learning},
  booktitle = {Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI)},
  year      = {1993},
  month     = {January},
  pages     = {99--105},
}