Dopamine and inference about timing

Nathaniel Daw, Aaron Courville, and David S. Touretzky
Proceedings of the IEEE Second International Conference on Development and Learning, 2002, pp. 271 - 276.



Abstract
Temporal-difference learning (TD) models explain most responses of primate dopamine neurons in appetitive conditioning. But because existing models are based in the simple formal setting of Markov processes, they do not provide a realistic account of the partial observability of the state of the world, nor of variation in event timing. For instance, the TD model of Montague et al. (1996) mispredicts the dopamine response when an expected reward is delivered early. We explain such experimental results using a version of TD learning grounded in the richer formalism of partially observable semi-Markov processes. We propose that the brain infers the likely state of the world from limited observations, using a statistical model of how the world's state evolves. Inference is necessary for such judgements as whether an expected reward is merely late, versus having been omitted altogether. The dopamine signal is modeled as a TD error signal for learning to predict future rewards from this inferred state representation.
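
The "late versus omitted" judgement and the TD error described above can be illustrated with a minimal sketch. This is not the paper's model: it assumes discrete time steps, a known distribution over reward delays (`delay_pmf`), a known prior omission probability (`p_omit`), and a value function that is linear in the belief vector. It also ignores the dwell-time-dependent discounting that a semi-Markov treatment would require. All function and parameter names are hypothetical.

```python
import numpy as np

def prob_still_coming(t, delay_pmf, p_omit):
    """Posterior probability that the reward is merely late,
    given that no reward has been observed through time step t.

    delay_pmf[d] is the assumed probability that a scheduled reward
    arrives at step d; p_omit is the assumed prior probability that
    the reward is omitted altogether.
    """
    delay_pmf = np.asarray(delay_pmf, dtype=float)
    # Probability that a scheduled reward has a delay greater than t.
    survival = delay_pmf[t + 1:].sum()
    late = (1.0 - p_omit) * survival
    denom = late + p_omit
    return late / denom if denom > 0 else 0.0

def td_error(reward, belief, next_belief, weights, gamma=0.95):
    """TD error computed on an inferred (belief) state.

    Assumes a value function linear in the belief: V(b) = b . w.
    """
    v_now = np.dot(belief, weights)
    v_next = np.dot(next_belief, weights)
    return reward + gamma * v_next - v_now

# Example: reward delays of 1, 2, or 3 steps; 10% prior chance of omission.
pmf = [0.0, 0.25, 0.5, 0.25]
for t in range(4):
    print(t, prob_still_coming(t, pmf, p_omit=0.1))
```

As each step passes without reward, the posterior shifts from "late" toward "omitted", reaching zero once every possible delay has elapsed. Feeding such a belief into `td_error` is one way an error signal could depend on inferred rather than directly observed state.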

Text Reference
Nathaniel Daw, Aaron Courville, and David S. Touretzky, "Dopamine and inference about timing," Proceedings of the IEEE Second International Conference on Development and Learning, 2002, pp. 271 - 276.

BibTeX Reference
@inproceedings{Daw_2002_4312,
   author = "Nathaniel Daw and Aaron Courville and David S. Touretzky",
   title = "Dopamine and inference about timing",
   booktitle = "Proceedings of the IEEE Second International Conference on Development and Learning",
   pages = "271--276",
   year = "2002",
}