PhD Thesis Proposal: Jiyan Pan
Coherent Scene Understanding with 3D Geometric Reasoning
Carnegie Mellon University
December 03, 2012, 10:00 a.m., NSH 1305
Going beyond object detection and semantic segmentation, coherent scene understanding simultaneously considers multiple potential objects and surfaces in the image and reasons over them in a 3D geometric context to derive a coherent interpretation of the scene behind the image, during which many visual ambiguities can be resolved. To achieve this goal, a coherent scene understanding system should be able to 1) infer 3D geometric properties of objects and surfaces in addition to their semantic labels, 2) identify different types of 3D geometric constraints among objects and surfaces, and 3) leverage those constraints to infer the validity of potential objects and surfaces and to recover the 3D layout of the scene.
In this thesis proposal, we present a coherent scene understanding algorithm that possesses those capabilities. In our approach, objects and surfaces are mutually constrained by both global 3D geometry, such as the gravity direction and the ground plane, and local 3D geometry, such as depth ordering and space occupancy. We incorporate these two types of 3D geometric context in a RANSAC-CRF framework. More specifically, we use local entities to propose hypotheses of the global 3D geometry in a RANSAC fashion, and evaluate those hypotheses using a CRF that considers both the consistency of individual objects under the global 3D geometric context and the consistency between adjacent objects under the local 3D geometric context. We show that performing 3D geometric reasoning at both the global and local levels greatly improves object detection and scene layout recovery.
We also propose several possible extensions to our existing system.
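To make the hypothesize-and-evaluate loop of the RANSAC-CRF framework concrete, here is a minimal toy sketch in Python. It is not the system proposed in the thesis. Everything specific in it is an assumption made for illustration: the candidates are reduced to (top row, bottom row, expected real height), the global geometry is reduced to a horizon row and a camera height under a simple pinhole relation, the local constraint is a hand-given list of candidate pairs that cannot occupy the same space, and the CRF is replaced by a brute-force search over on/off labels with made-up costs (OFF_COST, CONFLICT_COST).

```python
import itertools
import random

OFF_COST = 0.3       # assumed cost of rejecting a candidate
CONFLICT_COST = 1.0  # assumed cost when two candidates claim the same space


def propose_ground(a, b):
    """Solve (horizon row v0, camera height hc) from two candidates.

    Assumed model (not from the source): real height H = hc * h / (vb - v0),
    with pixel height h = vb - vt and image rows increasing downward.
    Each candidate gives one linear equation: H * v0 + h * hc = H * vb.
    """
    (vt1, vb1, H1), (vt2, vb2, H2) = a, b
    h1, h2 = vb1 - vt1, vb2 - vt2
    det = H1 * h2 - H2 * h1
    if abs(det) < 1e-9:
        return None  # degenerate pair, no unique solution
    v0 = (H1 * vb1 * h2 - H2 * vb2 * h1) / det
    hc = H1 * H2 * (vb2 - vb1) / det
    return (v0, hc) if hc > 0 else None  # reject non-physical camera heights


def unary_cost(cand, ground):
    """Relative error between the implied and the expected real height."""
    vt, vb, H = cand
    v0, hc = ground
    if vb <= v0:
        return float("inf")  # bottom edge at or above the horizon: not on the ground
    implied = hc * (vb - vt) / (vb - v0)
    return abs(implied - H) / H


def best_labeling(cands, conflicts, ground):
    """Exhaustive minimum-energy on/off labeling (a stand-in for CRF inference)."""
    on_costs = [unary_cost(c, ground) for c in cands]
    best = (float("inf"), None)
    for labels in itertools.product((0, 1), repeat=len(cands)):
        energy = sum(on_costs[i] if x else OFF_COST for i, x in enumerate(labels))
        energy += sum(CONFLICT_COST for i, j in conflicts if labels[i] and labels[j])
        if energy < best[0]:
            best = (energy, labels)
    return best


def ransac_crf(cands, conflicts, iters=50, seed=0):
    """Sample candidate pairs, propose a ground hypothesis, keep the lowest energy."""
    rng = random.Random(seed)
    best = (float("inf"), None, None)
    for _ in range(iters):
        i, j = rng.sample(range(len(cands)), 2)
        ground = propose_ground(cands[i], cands[j])
        if ground is None:
            continue
        energy, labels = best_labeling(cands, conflicts, ground)
        if energy < best[0]:
            best = (energy, labels, ground)
    return best


# Synthetic candidates as (top row, bottom row, expected height in meters).
# The first three were generated with horizon row 100 and camera height 1.6;
# the fourth is far too short for its position, the fifth duplicates the first.
candidates = [
    (87.5, 300.0, 1.7),
    (93.75, 200.0, 1.7),
    (90.0, 260.0, 1.7),
    (200.0, 250.0, 1.7),
    (80.0, 300.0, 1.7),
]
conflicts = [(0, 4)]  # candidates 0 and 4 cannot both occupy the same space

energy, labels, ground = ransac_crf(candidates, conflicts)
print(labels, ground, energy)
```

In this toy setting the global term rejects a candidate whose size disagrees with the ground hypothesis, while the pairwise term suppresses one of two candidates that compete for the same space. Exhaustive labeling is only feasible for a handful of candidates; a real system would need a proper CRF inference method.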
Takeo Kanade, Chair
Derek Hoiem, University of Illinois at Urbana-Champaign