Learning by Asking Questions

Ishan Misra, Ross Girshick, Rob Fergus, Martial Hebert, Abhinav Gupta, and Laurens van der Maaten

Conference Paper, Proceedings of (CVPR) Computer Vision and Pattern Recognition, pp. 11-20, June 2018


Abstract

We introduce an interactive learning framework for the development and testing of intelligent visual systems, called learning-by-asking (LBA). We explore LBA in the context of the Visual Question Answering (VQA) task. LBA differs from standard VQA training in that most questions are not observed at training time, and the learner must ask the questions it wants answered. Thus, LBA more closely mimics natural learning and has the potential to be more data-efficient than the traditional VQA setting. We present a model that performs LBA on the CLEVR dataset, and show that it automatically discovers an easy-to-hard curriculum when learning interactively from an oracle. Our LBA-generated data consistently matches or outperforms the CLEVR training data and is more sample-efficient. We also show that our model asks questions that generalize to state-of-the-art VQA models and to novel test-time distributions.

BibTeX

@conference{Misra-2018-105535,
author = {Ishan Misra and Ross Girshick and Rob Fergus and Martial Hebert and Abhinav Gupta and Laurens van der Maaten},
title = {Learning by Asking Questions},
booktitle = {Proceedings of (CVPR) Computer Vision and Pattern Recognition},
year = {2018},
month = {June},
pages = {11--20},
keywords = {Visual Question Answering; VQA; Interactive Learning; Computer Vision; Natural Language Processing; Vision and Language},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.