Data-Driven Geometric Scene Understanding

Scott Satkin
doctoral dissertation, tech. report CMU-RI-TR-13-19, Robotics Institute, Carnegie Mellon University, August, 2013

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

In this thesis, we describe a data-driven approach that leverages repositories of 3D models for scene understanding. Our ability to relate what we see in an image to a large collection of 3D models allows us to transfer information from these models, creating a rich understanding of the scene. We develop a framework for auto-calibrating a camera, rendering 3D models from the viewpoint from which an image was taken, and computing a similarity measure between each 3D model and an input image. We demonstrate this data-driven approach in the context of geometry estimation and show the ability to find the identities, poses, and styles of objects in a scene.

We begin by presenting a proof-of-concept algorithm for matching 3D models with input images. Next, we present a series of extensions to this baseline approach. Our goals here are threefold. First, we aim to produce more accurate reconstructions of a scene by determining the exact style and size of objects and by precisely localizing their positions. Second, we aim to increase the robustness of our scene-matching approach by incorporating new features and expanding our search space to include many viewpoint hypotheses. Lastly, we address the computational challenges of our approach by presenting algorithms that explore the space of 3D scene hypotheses more efficiently, without sacrificing the quality of results.

We conclude by presenting various applications of our geometric scene understanding approach. We start by demonstrating the effectiveness of our algorithm for traditional applications such as object detection and segmentation. In addition, we present two novel applications incorporating our geometry estimates: affordance estimation and geometry-aware object insertion for photorealistic rendering.

Number of pages: 122

Text Reference
Scott Satkin, "Data-Driven Geometric Scene Understanding," doctoral dissertation, tech. report CMU-RI-TR-13-19, Robotics Institute, Carnegie Mellon University, August, 2013

BibTeX Reference
   author = "Scott Satkin",
   title = "Data-Driven Geometric Scene Understanding",
   booktitle = "",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "August",
   year = "2013",
   number = "CMU-RI-TR-13-19",
   address = "Pittsburgh, PA",