Human Behavior Modeling with Maximum Entropy Inverse Optimal Control

Brian D. Ziebart, Andrew L. Maas, J. Andrew (Drew) Bagnell, and Anind Dey

Conference Paper, Proceedings of AAAI '09 Spring Symposium on Human Behavior Modeling, pp. 92-97, March 2009


Abstract

In our research, we view human behavior as a structured sequence of context-sensitive decisions. We develop a conditional probabilistic model for predicting human decisions given the contextual situation. Our approach employs the principle of maximum entropy within the Markov Decision Process framework. Modeling human behavior is reduced to recovering a context-sensitive utility function that explains demonstrated behavior within the probabilistic model. In this work, we review the development of our probabilistic model (Ziebart et al. 2008a) and the results of its application to modeling the context-sensitive route preferences of drivers (Ziebart et al. 2008b). We additionally expand the approach's applicability to domains with stochastic dynamics, present preliminary experiments on modeling time-usage, and discuss remaining challenges for applying our approach to other human behavior modeling problems.
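
To make the idea of maximum entropy decision modeling within a Markov Decision Process more concrete, the following is a minimal sketch, not the authors' implementation. It assumes deterministic dynamics, a single absorbing goal state, and a fixed, already-known reward for each action; the stochastic-dynamics extension and the learning of the utility function mentioned in the abstract are not shown. The function names (`logsumexp`, `soft_policy`) and the toy route graph are hypothetical and chosen only for illustration. Under these assumptions, a "soft" Bellman backup yields a stochastic policy in which each route is chosen with probability proportional to the exponential of its total reward.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def soft_policy(rewards, transitions, goal, n_iters=50):
    """Soft value iteration on a deterministic MDP (illustrative assumption).

    rewards[s][a]     -- reward for taking action a in state s
    transitions[s][a] -- next state reached by action a in state s
    Returns policy[s][a] = exp(Q(s, a) - V(s)).
    """
    states = list(transitions)
    # Only the goal has a finite value initially; others are unreachable so far.
    value = {s: (0.0 if s == goal else -math.inf) for s in states}
    for _ in range(n_iters):
        updated = {goal: 0.0}
        for s in states:
            if s == goal:
                continue
            q_values = [rewards[s][a] + value[nxt]
                        for a, nxt in transitions[s].items()]
            # Soft maximum replaces the hard max of standard value iteration.
            updated[s] = logsumexp(q_values)
        value = updated
    policy = {}
    for s in states:
        if s == goal or value[s] == -math.inf:
            continue  # skip the goal and states that cannot reach it
        policy[s] = {a: math.exp(rewards[s][a] + value[nxt] - value[s])
                     for a, nxt in transitions[s].items()}
    return policy

# Hypothetical toy road network: two routes from A to the goal G.
transitions = {
    "A": {"via_B": "B", "via_C": "C"},
    "B": {"go": "G"},
    "C": {"go": "G"},
    "G": {},
}
rewards = {
    "A": {"via_B": -1.0, "via_C": -2.0},
    "B": {"go": -1.0},
    "C": {"go": -1.0},
    "G": {},
}
policy = soft_policy(rewards, transitions, goal="G")
print(policy["A"])  # via_B is preferred, with probability about 0.731
```

In this toy example the route through B has total reward -2 and the route through C has total reward -3, so the costlier route is still chosen some of the time rather than never. In the approach the abstract describes, the rewards would instead come from a context-sensitive utility function recovered from demonstrated behavior.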

BibTeX

@conference{Ziebart-2009-9932,
author = {Brian D. Ziebart and Andrew L. Maas and J. Andrew (Drew) Bagnell and Anind Dey},
title = {Human Behavior Modeling with Maximum Entropy Inverse Optimal Control},
booktitle = {Proceedings of AAAI '09 Spring Symposium on Human Behavior Modeling},
year = {2009},
month = {March},
pages = {92--97},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.