Learning a Predictable and Generative Vector Representation for Objects

Master's Thesis, Tech. Report CMU-RI-TR-16-06, Robotics Institute, Carnegie Mellon University, April 2016

Abstract

What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects, and predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This makes it possible to tackle a number of tasks, including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding.
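
The two-component design described in the abstract can be illustrated with a minimal sketch of the data flow. This is not the thesis implementation: random linear maps stand in for the trained autoencoder and the convolutional network, the grid, image, and embedding sizes are hypothetical, and the squared-error losses are an assumption, since the abstract does not name the training objectives. All function and variable names are made up for illustration.

```python
import random

random.seed(0)

# Hypothetical sizes, chosen only to keep the sketch small.
VOXELS = 4 ** 3   # flattened 4x4x4 occupancy grid
PIXELS = 8 * 8    # flattened 8x8 grayscale image
EMBED = 5         # embedding dimension


def linear(n_out, n_in):
    """Random weight matrix standing in for a trained layer."""
    return [[random.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def matvec(weights, x):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


W_enc = linear(EMBED, VOXELS)  # autoencoder encoder: voxels -> embedding
W_dec = linear(VOXELS, EMBED)  # autoencoder decoder: embedding -> voxels
W_img = linear(EMBED, PIXELS)  # stand-in for the convolutional network: image -> embedding


def encode_voxels(voxels):
    return matvec(W_enc, voxels)


def decode_embedding(z):
    return matvec(W_dec, z)


def embed_image(image):
    return matvec(W_img, image)


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def training_losses(voxels, image):
    """Return (reconstruction loss, embedding-prediction loss) for one pair."""
    z = encode_voxels(voxels)
    # "Generative": the decoder should rebuild the 3D shape from the embedding.
    reconstruction_loss = squared_error(decode_embedding(z), voxels)
    # "Predictable": the image branch should land on the same embedding.
    embedding_loss = squared_error(embed_image(image), z)
    return reconstruction_loss, embedding_loss


def predict_voxels(image):
    """Voxel prediction from a 2D image: image branch followed by the decoder."""
    return decode_embedding(embed_image(image))


# Toy inputs: a random occupancy grid and a random image.
voxels = [float(random.random() < 0.3) for _ in range(VOXELS)]
image = [random.random() for _ in range(PIXELS)]

rec_loss, emb_loss = training_losses(voxels, image)
prediction = predict_voxels(image)
```

In this sketch, voxel prediction from an image needs only the image branch and the decoder, while the encoder is used to define the target embedding during training. A retrieval task could plausibly compare `embed_image(image)` against stored `encode_voxels(...)` vectors, though that step is not shown here.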

BibTeX

@mastersthesis{Girdhar-2016-5494,
author = {Rohit Girdhar},
title = {Learning a Predictable and Generative Vector Representation for Objects},
year = {2016},
month = {April},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-16-06},
}