
Blocks World Revisited: Image Understanding using Qualitative Geometry and Mechanics

Abhinav Gupta, Alexei A. Efros and Martial Hebert
Conference Paper, European Conference on Computer Vision (ECCV), September, 2010




Since most current scene understanding approaches operate either on the 2D image or on a surface-based representation, they do not allow reasoning about the physical constraints within the 3D scene. Inspired by the “Blocks World” work of the 1960s, we present a qualitative physical representation of an outdoor scene where objects have volume and mass, and relationships describe 3D structure and mechanical configurations. Our representation allows us to apply powerful global geometric constraints between 3D volumes as well as the laws of statics in a qualitative manner. We also present a novel iterative “interpretation-by-synthesis” approach where, starting from an empty ground plane, we progressively “build up” a physically plausible 3D interpretation of the image. For surface layout estimation, our method demonstrates an improvement in performance over the state-of-the-art [9]. More importantly, our approach automatically generates 3D parse graphs which describe qualitative geometric and mechanical properties of objects and relationships between objects within an image.

BibTeX Reference
author = {Abhinav Gupta and Alexei A. Efros and Martial Hebert},
title = {Blocks World Revisited: Image Understanding using Qualitative Geometry and Mechanics},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2010},
month = {September},