
Distributed Value Functions

Jeff Schneider, Weng-Keen Wong, Andrew Moore, and Martin Riedmiller
Conference Paper, Proceedings of the International Conference on Machine Learning (ICML), pp. 371 - 378, June 1999

Abstract

Many interesting problems, such as power grids, network switches, and traffic flow, that are candidates for solving with reinforcement learning (RL) also have properties that make distributed solutions desirable. We propose an algorithm for distributed reinforcement learning based on distributing the representation of the value function across nodes. Each node in the system can only sense state locally, choose actions locally, and receive reward locally (the goal of the system is to maximize the sum of the rewards over all nodes and over all time). However, each node is allowed to give its neighbors its current value function estimate for the states it passes through. We present a value function learning rule that uses this information to let each node learn a value function estimating a weighted sum of future rewards for all the nodes in the network. With this representation, each node can choose actions that improve the performance of the overall system.
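To make the idea concrete, here is a minimal sketch (not the authors' exact algorithm) of what such a learning rule could look like, assuming a tabular value function per node and fixed neighbor weights; all names (Node, neighbors, alpha, gamma) are illustrative. Each node mixes its local reward with a weighted sum of its neighbors' value estimates at their next local states, so its table comes to reflect rewards earned elsewhere in the network.

```python
from collections import defaultdict


class Node:
    """One node: senses state locally, receives reward locally, and shares
    its current value estimate with its neighbors (illustrative sketch)."""

    def __init__(self, node_id, alpha=0.1, gamma=0.9):
        self.id = node_id
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.V = defaultdict(float)  # value table over local states
        self.neighbors = []          # (node, weight) pairs; weights sum to 1

    def value(self, state):
        return self.V[state]

    def update(self, state, reward, neighbor_next_states):
        """Move V(state) toward: local reward + gamma * weighted sum of the
        neighbors' value estimates at their next local states. The weighted
        sum is assumed to include this node itself."""
        target = reward + self.gamma * sum(
            weight * node.value(neighbor_next_states[node.id])
            for node, weight in self.neighbors
        )
        self.V[state] += self.alpha * (target - self.V[state])


# Example: two nodes that weight each other's estimates equally with their own.
if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    a.neighbors = [(a, 0.5), (b, 0.5)]
    b.neighbors = [(b, 0.5), (a, 0.5)]
    a.update(state=0, reward=1.0, neighbor_next_states={"a": 1, "b": 0})
    print(a.value(0))  # small positive value after one local update
```

In this sketch the neighbor weights are fixed and shared state is limited to scalar value estimates, which is what lets each node act on purely local observations while still accounting for rewards received by the rest of the network.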

BibTeX

@conference{Schneider-1999-16667,
author = {Jeff Schneider and Weng-Keen Wong and Andrew Moore and Martin Riedmiller},
title = {Distributed Value Functions},
booktitle = {Proceedings of (ICML) International Conference on Machine Learning},
year = {1999},
month = {June},
pages = {371--378},
}