Factoring Scenes into 3D Structure and Style - Robotics Institute Carnegie Mellon University

Factoring Scenes into 3D Structure and Style

PhD Thesis, Tech. Report, CMU-RI-TR-16-45, Robotics Institute, Carnegie Mellon University, June 2016

Abstract

Given a single image of a scene, humans have few issues answering questions about its 3D structure like "is this facing upwards?" even though, mathematically speaking, this should be impossible. We similarly have few issues accounting for this 3D structure when answering viewpoint-independent questions like "is this the same carpet as the one in your office?", even if the carpets were seen from different viewpoints and have no pixels in common. At the heart of the issue is that images are the result of two phenomena: the underlying 3D shape, which we call the 3D structure, and the canonical texture that is applied to this shape, which we call the style. In the 3D world, these phenomena are distinct, but when we observe the world, they become mixed. Although the identity of both structure and style gets lost in the process, if we know about regularities in both phenomena, we can narrow down the possible combinations that could have produced our image. This dissertation aims to better enable computers to understand images in a 3D way by factoring the image into 3D structure and style. The key is that we can take advantage of regularity in both phenomena to inform our interpretation. For instance, we do not expect carpet texture on ceilings or 75-degree angles between walls. By using regularities, especially ones discovered from large-scale data, we can winnow away the possible combinations of 3D structure and style that could have produced our image.

BibTeX

@phdthesis{Fouhey-2016-5550,
author = {David Fouhey},
title = {Factoring Scenes into 3D Structure and Style},
year = {2016},
month = {June},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-16-45},
}