
Visual semantic navigation using scene priors

Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, and Roozbeh Mottaghi
Conference Paper, Proceedings of the International Conference on Learning Representations (ICLR), May 2019

Abstract

How do humans navigate to target objects in novel scenes? Do we use the semantic/functional priors we have built over years to search and navigate efficiently? For example, to search for mugs, we look in cabinets near the coffee machine, and for fruits we try the fridge. In this work, we focus on incorporating such semantic priors into the task of semantic navigation. We propose to use Graph Convolutional Networks to incorporate the prior knowledge into a deep reinforcement learning framework; the agent uses the features from the knowledge graph to predict its actions. For evaluation, we use the AI2-THOR framework. Our experiments show that semantic knowledge improves performance significantly. More importantly, we show improved generalization to unseen scenes and/or objects.
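To make the idea concrete, below is a minimal sketch (not the authors' released code) of how graph-convolution features over a knowledge graph of object categories might be combined with visual features to predict navigation actions. The module names, dimensions, pooling choice, and two-layer GCN are illustrative assumptions, not the exact architecture from the paper.

# Minimal sketch, assuming PyTorch: a simple GCN encodes a knowledge graph
# of object relationships, and its pooled features are concatenated with
# visual features to produce action logits for the navigation policy.
# All names and sizes here are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GCNLayer(nn.Module):
    """One graph-convolution step: H' = ReLU(A_hat @ H @ W)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, node_feats, adj_norm):
        # adj_norm: row-normalized adjacency with self-loops, shape (N, N)
        return F.relu(adj_norm @ self.linear(node_feats))


class SemanticNavPolicy(nn.Module):
    """Combines image features with knowledge-graph features (illustrative)."""

    def __init__(self, node_dim, visual_dim, num_actions, hidden=128):
        super().__init__()
        self.gcn1 = GCNLayer(node_dim, hidden)
        self.gcn2 = GCNLayer(hidden, hidden)
        self.policy = nn.Sequential(
            nn.Linear(visual_dim + hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, visual_feat, node_feats, adj_norm):
        h = self.gcn2(self.gcn1(node_feats, adj_norm), adj_norm)
        graph_feat = h.mean(dim=0)                 # pool over graph nodes
        joint = torch.cat([visual_feat, graph_feat], dim=-1)
        return self.policy(joint)                  # action logits


# Toy usage: 80 object-category nodes with 300-d word embeddings,
# a 512-d visual feature, and 6 discrete navigation actions.
if __name__ == "__main__":
    N, D = 80, 300
    adj = torch.eye(N) + (torch.rand(N, N) > 0.95).float()  # self-loops + random edges
    adj_norm = adj / adj.sum(dim=1, keepdim=True)            # row-normalize
    model = SemanticNavPolicy(node_dim=D, visual_dim=512, num_actions=6)
    logits = model(torch.randn(512), torch.randn(N, D), adj_norm)
    print(logits.shape)  # torch.Size([6])

In the paper's setting, the graph features would be recomputed as the agent observes the scene and the resulting logits would feed a reinforcement learning objective; the sketch above only shows the forward pass that fuses the two feature sources.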

BibTeX

@conference{Yang-2019-113270,
author = {Wei Yang and Xiaolong Wang and Ali Farhadi and Abhinav Gupta and Roozbeh Mottaghi},
title = {Visual semantic navigation using scene priors},
booktitle = {Proceedings of (ICLR) International Conference on Learning Representations},
year = {2019},
month = {May},
}