Carnegie Mellon Robotics Institute
J. Andrew (Drew) Bagnell and Andrew Ng
Neural Information Processing Systems, May, 2006.
| Abstract |
| We consider the scaling of the number of examples necessary to achieve good performance in distributed, cooperative, multi-agent reinforcement learning, as a function of the number of agents n. We prove a worst-case lower bound showing that algorithms that rely solely on a global reward signal to learn policies confront a fundamental limit: they require a number of real-world examples that scales roughly linearly in the number of agents. For settings of interest with a very large number of agents, this is impractical. We demonstrate, however, that there is a class of algorithms that, by taking advantage of local reward signals in large distributed Markov Decision Processes, are able to ensure good performance with a number of samples that scales as O(log n). This makes them applicable even in settings with a very large number of agents n. |
| Keywords |
| multi-agent, learning theory, reinforcement learning, Markov Decision Processes, local reward, distributed RL |
| Notes |
| Text Reference |
| J. Andrew (Drew) Bagnell and Andrew Ng, "On Local Rewards and Scaling Distributed Reinforcement Learning," Neural Information Processing Systems, May, 2006. |
| BibTeX Reference |
|
@inproceedings{Bagnell_2006_5326,
  author    = "J. Andrew (Drew) Bagnell and Andrew Ng",
  title     = "On Local Rewards and Scaling Distributed Reinforcement Learning",
  booktitle = "Neural Information Processing Systems",
  publisher = "MIT Press",
  month     = "May",
  year      = "2006"
} |
| The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University. |