Exploration with Expert Policy Advice - Robotics Institute Carnegie Mellon University

Miscellaneous, August, 2018

Abstract

Exploration in reinforcement learning is a challenging problem. Random exploration is often highly inefficient and, in sparse-reward environments, may fail completely. In this work, we develop a novel method that incorporates expert advice for exploration in sparse-reward environments. In our formulation, the agent has access to a set of expert policies and learns to bias its exploration based on the experts' suggested actions. By incorporating expert suggestions, the agent quickly learns a policy that reaches rewarding states. Our method can mix and match the experts' advice within a single episode to reach goal states. Moreover, our formulation does not restrict the agent to any policy set, which allows us to aim for a globally optimal solution. In our experiments, we show that using expert advice leads to faster exploration in challenging grid-world environments.
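
The idea of biasing exploration toward expert suggestions can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a tabular agent with an epsilon-greedy rule in which the exploratory action is taken from an expert sampled by a learned weight, and it assumes non-negative rewards. The names `choose_action`, `update_expert_weights`, `epsilon`, and `lr` are hypothetical and introduced only for illustration.

```python
import random


def choose_action(q_values, expert_actions, expert_weights, epsilon, rng):
    """Pick an action for the current state.

    With probability epsilon, explore by following the action suggested by
    one expert, sampled in proportion to its weight (an assumed scheme).
    Otherwise act greedily on the agent's own value estimates, so the agent
    is never confined to the experts' policies.
    Returns (action, index of the expert followed, or None if greedy).
    """
    if rng.random() < epsilon:
        idx = rng.choices(range(len(expert_actions)), weights=expert_weights)[0]
        return expert_actions[idx], idx
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    return greedy, None


def update_expert_weights(weights, idx, reward, lr=0.1):
    """Raise the weight of the expert whose advice was followed, then renormalize.

    Assumes reward >= 0 (for example, 1 at a goal state and 0 elsewhere).
    Greedy steps (idx is None) leave the weights unchanged.
    """
    if idx is None:
        return list(weights)
    new = list(weights)
    new[idx] += lr * reward
    total = sum(new)
    return [w / total for w in new]


if __name__ == "__main__":
    rng = random.Random(0)
    q_values = [0.0, 1.0, 0.5]   # agent's value estimates for 3 actions
    expert_actions = [2, 0]      # actions suggested by two hypothetical experts
    weights = [0.5, 0.5]
    action, idx = choose_action(q_values, expert_actions, weights, 1.0, rng)
    weights = update_expert_weights(weights, idx, reward=1.0)
    print(action, idx, weights)
```

Because the expert is re-sampled at every step, advice from different experts can be combined within one episode, and the greedy branch lets the learned policy depart from all of them.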

BibTeX

@misc{Khadke-2018-118006,
author = {Ashwin Khadke and Arpit Agarwal and Anahita Mohseni Kabir and Devin Schwab},
title = {Exploration with Expert Policy Advice},
month = {August},
year = {2018},
keywords = {Reinforcement Learning, Exploration, Learning from Expert Advice},
}