Learning Exploration Policies for Navigation - Robotics Institute, Carnegie Mellon University

Learning Exploration Policies for Navigation

Tao Chen, Saurabh Gupta, and Abhinav Gupta
Conference Paper, Proceedings of the International Conference on Learning Representations (ICLR), May 2019

Abstract

Numerous past works have tackled the problem of task-driven navigation, but how to effectively explore a new environment to enable a variety of downstream tasks has received much less attention. In this work, we study how agents can autonomously explore realistic and complex 3D environments without the context of task rewards. We propose a learning-based approach and investigate different policy architectures, reward functions, and training paradigms. We find that policies with spatial memory, bootstrapped with imitation learning and then finetuned with coverage rewards derived purely from on-board sensors, can be effective at exploring novel environments. We show that our learned exploration policies explore better than classical approaches based on geometry alone and than generic learning-based exploration techniques. Finally, we also show how such task-agnostic exploration can be used for downstream tasks. Videos are available at https://sites.google.com/view/exploration-for-nav/.
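The abstract does not specify the coverage reward beyond "derived purely from on-board sensors." A common formulation of such a reward, and the one this sketch assumes, pays the agent for each map cell observed for the first time, tracked in a binary coverage grid. All names below (`coverage_reward`, `visible_cells`, the grid size) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def coverage_reward(coverage_map, visible_cells):
    """Return the number of cells seen for the first time this step,
    marking them as covered in-place. `visible_cells` would come from
    projecting the current depth observation into the map frame."""
    newly_seen = 0
    for (r, c) in visible_cells:
        if not coverage_map[r, c]:
            coverage_map[r, c] = True
            newly_seen += 1
    return newly_seen

# Toy two-step rollout on a 10x10 map (purely illustrative).
grid = np.zeros((10, 10), dtype=bool)
r1 = coverage_reward(grid, [(0, 0), (0, 1), (1, 0)])  # three unseen cells
r2 = coverage_reward(grid, [(0, 1), (1, 1)])          # one unseen cell
print(r1, r2)  # 3 1
```

Because the reward depends only on the agent's own map of what it has sensed, no ground-truth environment layout or external task signal is needed, which is what makes this style of reward usable for task-agnostic finetuning.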

BibTeX

@conference{Chen-2019-113265,
author = {Tao Chen and Saurabh Gupta and Abhinav Gupta},
title = {Learning Exploration Policies for Navigation},
booktitle = {Proceedings of the International Conference on Learning Representations (ICLR)},
year = {2019},
month = {May},
}