Few-shot Learning for Segmentation

Master's Thesis, Tech. Report, CMU-RI-TR-19-35, Robotics Institute, Carnegie Mellon University, July, 2019


Abstract

Most learning architectures for segmentation require a significant amount of data and annotations, because each pixel must be assigned to a class. Few-shot segmentation aims to replace this large amount of training data with only a few densely annotated samples. In this paper, we propose a two-branch network, FuseNet, that can segment an input image (the query image) given one or more images of the target domain (the support images). FuseNet preserves the local context around the target domain by masking out the non-target region in feature space. The network then uses the cosine similarity between the masked support features and the query features as guidance to predict the segmentation mask. In the few-shot case, where several support images are available, we weight the guidance from each support image according to its image-level feature similarity with the query. We also explore the quantitative effect of the number of support images on Intersection over Union (IoU). Our network achieves state-of-the-art results on PASCAL VOC 2012 for both one-shot and five-shot semantic segmentation.
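The guidance mechanism described in the abstract can be illustrated with a small NumPy sketch. This is not the thesis implementation. It assumes that the masked support features are average-pooled into a single vector, that the per-support weights come from a softmax over image-level cosine similarities, and that features are plain (C, H, W) arrays. All function names are hypothetical.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero


def masked_average_feature(support_feat, support_mask):
    """Zero out non-target locations in feature space, then average.

    support_feat: (C, H, W) feature map; support_mask: (H, W) binary mask.
    Pooling to one vector is an assumption made for this sketch.
    """
    masked = support_feat * support_mask[None, :, :]
    return masked.sum(axis=(1, 2)) / (support_mask.sum() + EPS)


def cosine_guidance(query_feat, prototype):
    """Cosine similarity between a (C,) vector and every query location."""
    dots = np.tensordot(prototype, query_feat, axes=([0], [0]))  # (H, W)
    norms = np.linalg.norm(query_feat, axis=0) * np.linalg.norm(prototype)
    return dots / (norms + EPS)


def fuse_guidance(query_feat, support_feats, support_masks):
    """Combine per-support guidance maps, weighted by image-level similarity.

    The softmax weighting is an assumed choice, not taken from the thesis.
    """
    query_global = query_feat.mean(axis=(1, 2))
    maps, scores = [], []
    for feat, mask in zip(support_feats, support_masks):
        prototype = masked_average_feature(feat, mask)
        maps.append(cosine_guidance(query_feat, prototype))
        support_global = feat.mean(axis=(1, 2))
        denom = np.linalg.norm(query_global) * np.linalg.norm(support_global)
        scores.append(float(query_global @ support_global) / (denom + EPS))
    scores = np.array(scores)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return np.tensordot(weights, np.stack(maps), axes=1)  # (H, W)


def iou(pred, target):
    """Intersection over Union of two boolean masks (1.0 if both are empty)."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


# Toy example: target pixels carry feature [1, 0], background carries [0, 1].
support_mask = np.array([[1.0, 1.0], [0.0, 0.0]])
support_feat = np.stack([support_mask, 1.0 - support_mask])
query_mask = np.array([[1.0, 0.0], [0.0, 0.0]])
query_feat = np.stack([query_mask, 1.0 - query_mask])

guidance = fuse_guidance(query_feat, [support_feat, support_feat],
                         [support_mask, support_mask])
score = iou(guidance > 0.5, query_mask)
```

In this toy setup the guidance map is high only where the query features resemble the masked support features, so thresholding it recovers the query mask. In a trained network the guidance would feed a learned decoder rather than a fixed threshold.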

BibTeX

@mastersthesis{Dai-2019-116358,
author = {Chia Dai},
title = {Few-shot Learning for Segmentation},
year = {2019},
month = {July},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-35},
keywords = {Learning, Semantic Segmentation, One-Shot, Few-Shot, Representation Learning},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.