Embodied One-Shot Video Recognition: Learning from Actions of a Virtual Embodied Agent

Yuqian Fu, Chengrong Wang, Yanwei Fu, Yuxiong Wang, Cong Bai, Xiangyang Xue, and Yu-Gang Jiang

Conference Paper, Proceedings of the 27th ACM International Conference on Multimedia (MM '19), pp. 411-419, October 2019

Abstract

One-shot learning aims to recognize novel target classes from few examples by transferring knowledge from source classes, under the general assumption that the source and target classes are semantically related but not exactly the same. Based on this assumption, recent work has focused on image-based one-shot learning, while little work has addressed video-based one-shot learning. One challenge is that the disjoint-class assumption is difficult to maintain for videos, since video clips of target classes may appear in the videos of source classes. To address this issue, we introduce a novel setting, termed embodied agents based one-shot learning, which leverages synthetic videos produced in a virtual environment to understand realistic videos of target classes. In this setting, we further propose two types of learning tasks: embodied one-shot video domain adaptation and embodied one-shot video transfer recognition. These tasks serve as a testbed for evaluating video-related one-shot learning tasks. In addition, we propose a general video segment augmentation method, which significantly facilitates a variety of one-shot learning tasks. Experimental results validate the soundness of our setting and learning tasks, and also show the effectiveness of our augmentation approach to video recognition in the small-sample-size regime.
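
The disjoint-class assumption and the one-shot evaluation protocol described above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary layout (class label mapped to a list of clip identifiers), and the N-way episode structure are assumptions chosen for illustration, following common one-shot learning practice.

```python
import random


def overlapping_classes(source_labels, target_labels):
    """Return the labels that violate the disjoint-class assumption.

    An empty list means source and target classes do not overlap.
    """
    return sorted(set(source_labels) & set(target_labels))


def sample_one_shot_episode(videos_by_class, n_way, n_query, rng):
    """Sample one N-way one-shot episode (hypothetical helper).

    videos_by_class: dict mapping a class label to a list of clip ids.
    Returns (support, query), each a list of (clip_id, label) pairs,
    with exactly one support clip per sampled class.
    """
    classes = rng.sample(sorted(videos_by_class), n_way)
    support, query = [], []
    for label in classes:
        # Draw 1 support clip plus n_query query clips without replacement,
        # so no clip appears in both sets.
        clips = rng.sample(videos_by_class[label], 1 + n_query)
        support.append((clips[0], label))
        query.extend((clip, label) for clip in clips[1:])
    return support, query
```

In this sketch, a non-empty result from `overlapping_classes` would flag the kind of source/target leakage the abstract identifies as a problem for video data, and `sample_one_shot_episode` would be called on the target classes only.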

BibTeX

@conference{Fu-2019-122551,
author = {Yuqian Fu and Chengrong Wang and Yanwei Fu and Yuxiong Wang and Cong Bai and Xiangyang Xue and Yu-Gang Jiang},
title = {Embodied One-Shot Video Recognition: Learning from Actions of a Virtual Embodied Agent},
booktitle = {Proceedings of the 27th ACM International Conference on Multimedia (MM '19)},
year = {2019},
month = {October},
pages = {411--419},
}
