Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning

Conference Paper, Proceedings of (NeurIPS) Neural Information Processing Systems, pp. 1047 - 1053, December, 1996

Abstract

Model learning combined with dynamic programming has been shown to be effective for learning control of continuous-state dynamic systems. The simplest method assumes the learned model is correct and applies dynamic programming to it, but many approximators provide uncertainty estimates on the fit. How can they be exploited? This paper addresses the case where the system must be prevented from having catastrophic failures during learning. We propose a new algorithm adapted from the dual control literature and use Bayesian locally weighted regression models with dynamic programming. A common reinforcement learning assumption is that aggressive exploration should be encouraged. This paper addresses the converse case in which the system has to rein in exploration. The algorithm is illustrated on a 4-dimensional simulated control problem.
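
As a rough illustration of what an "uncertainty estimate on the fit" can look like, the sketch below fits a locally weighted linear model around a query point and returns a predictive mean and variance. It is not the algorithm from the paper: the function name `bayes_lwr_predict`, the Gaussian kernel, the zero-mean Gaussian prior on the coefficients, and the known noise variance are all simplifying assumptions made for this example.

```python
import numpy as np

def bayes_lwr_predict(X, y, query, bandwidth=0.3, noise_var=0.01, prior_var=100.0):
    """Locally weighted Bayesian linear regression at one query point.

    Hypothetical helper for illustration. Assumes a known noise variance
    and a zero-mean Gaussian prior on the local linear coefficients.
    Returns (predictive mean, predictive variance).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    q = np.asarray(query, dtype=float).reshape(-1)

    # Gaussian kernel weights: points near the query count more.
    w = np.exp(-0.5 * np.sum((X - q) ** 2, axis=1) / bandwidth ** 2)

    # Append a constant column so the local model has an intercept.
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    a_q = np.append(q, 1.0)

    # Posterior over coefficients: prior precision plus weighted data precision.
    precision = np.eye(A.shape[1]) / prior_var + A.T @ (w[:, None] * A) / noise_var
    cov = np.linalg.inv(precision)
    beta = cov @ (A.T @ (w * y)) / noise_var

    mean = float(a_q @ beta)
    # Noise variance plus coefficient uncertainty projected onto the query.
    var = float(noise_var + a_q @ cov @ a_q)
    return mean, var

# Toy data on [0, 1]; the second query lies far outside the data.
X = np.linspace(0.0, 1.0, 20)
y = 2.0 * X + 1.0
mean_near, var_near = bayes_lwr_predict(X, y, [0.5])
mean_far, var_far = bayes_lwr_predict(X, y, [5.0])
```

Under these assumptions the predictive variance stays small where data is dense and reverts toward the prior far from the data. That is the kind of signal a cautious controller could use to avoid poorly modeled regions of the state space.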

BibTeX

@conference{Schneider-1996-16253,
author = {Jeff Schneider},
title = {Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning},
booktitle = {Proceedings of (NeurIPS) Neural Information Processing Systems},
year = {1996},
month = {December},
pages = {1047 - 1053},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.