Learning Transferable Cooperative Behavior in Multi-Agent Teams

Akshat Agarwal
Master's Thesis, Tech. Report, CMU-RI-TR-19-10, May, 2019




A multi-agent system comprises multiple interacting intelligent agents that collaborate to solve problems that are difficult or impossible for a single agent to solve, with the goal of maximizing their shared utility. We study the emergence of cooperative behavior and communication protocols in multi-agent teams that collaboratively accomplish tasks such as resource allocation and formation control for swarms. While multi-agent interactions can be naturally modeled as graphs, the environment has traditionally been treated as a black box.
We propose creating a shared agent-entity graph, in which agents and environmental entities form the vertices and edges connect the vertices that are allowed to communicate with each other. This representation lets agents selectively attend to different parts of the environment, and it is invariant both to the number of agents and entities and to their ordering (permutation invariance), which are desirable properties for any multi-agent system representation. We present state-of-the-art results on coverage and formation control for swarms in a fully decentralized execution framework, and show that the learned policies have strong zero-shot generalization to scenarios with different team sizes. Additionally, we introduce communication dropout for robustness to glitches, and find that it also aids learning as a regularizer. This is an important step towards swarms that can be deployed in the real world without assuming complete prior knowledge or instantaneous communication at unbounded distances.
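
The graph construction, attention step, and communication dropout described above can be illustrated with a small sketch. This is not the thesis implementation: the function names (`build_edges`, `drop_edges`, `attend`), the single distance-based communication range applied uniformly to agents and entities, and the parameter-free dot-product attention are all assumptions made for illustration. A learned model would use trained projections in place of raw dot products.

```python
import math
import random


def build_edges(positions, comm_range):
    """Map each vertex index to the list of vertices within comm_range.

    Vertices stand for agents and environmental entities alike
    (assumption: one shared range decides who may communicate).
    """
    n = len(positions)
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(positions[i], positions[j]) <= comm_range:
                neighbors[i].append(j)
    return neighbors


def drop_edges(neighbors, drop_prob, rng):
    """Communication dropout: drop each incoming edge independently
    with probability drop_prob (hypothetical formulation)."""
    return {
        i: [j for j in nbrs if rng.random() >= drop_prob]
        for i, nbrs in neighbors.items()
    }


def attend(features, neighbors):
    """Aggregate neighbor features with softmax dot-product attention.

    The result for a vertex depends only on the set of its neighbors,
    not on their order or on how many vertices exist in total.
    """
    out = []
    for i, h in enumerate(features):
        nbrs = neighbors[i]
        if not nbrs:
            # No messages received: keep the vertex's own features.
            out.append(list(h))
            continue
        scores = [sum(a * b for a, b in zip(h, features[j])) for j in nbrs]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * features[j][k] for w, j in zip(weights, nbrs)) / total
            for k in range(len(h))
        ])
    return out


if __name__ == "__main__":
    positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
    edges = build_edges(positions, comm_range=1.5)
    noisy = drop_edges(edges, drop_prob=0.3, rng=random.Random(0))
    print(attend(features, noisy))
```

Because `attend` sums over whatever neighbors are present, the same function applies unchanged when vertices are added, removed, or reordered. That is the property the abstract credits for zero-shot transfer across team sizes.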

author = {Akshat Agarwal},
title = {Learning Transferable Cooperative Behavior in Multi-Agent Teams},
year = {2019},
month = {May},
school = {},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-10},
keywords = {multi-agent learning, reinforcement learning, graph networks, multi-agent representations, swarm behavior, cooperative behavior, teams},
}