Structure Discovery in Multi-modal Data: a Region-based Approach

Alvaro Collet Romea, Siddhartha Srinivasa and Martial Hebert
Conference Paper, 2011 IEEE International Conference on Robotics and Automation, May, 2011


The ability of a perception system to discern what is important in a scene and what is not is an invaluable asset, with multiple applications in object recognition, people detection and SLAM, among others. In this paper, we aim to analyze all available sensory data to separate a scene into a few physically meaningful parts, which we term structure, while discarding background clutter. In particular, we consider the combination of image and range data, and base our decision on both appearance and 3D shape. Our main contribution is the development of a framework to perform scene segmentation that preserves physical objects using multi-modal data. We combine image and range data using a novel mid-level fusion technique based on the concept of regions that avoids any pixel-level correspondences between data sources. We associate groups of pixels with 3D points into multi-modal regions that we term regionlets, and measure the structure-ness of each regionlet using simple, bottom-up cues from image and range features. We show that the highest-ranked regionlets correspond to the most prominent objects in the scene. We verify the validity of our approach on 105 scenes of household environments.

BibTeX Reference
author = {Alvaro Collet Romea and Siddhartha Srinivasa and Martial Hebert},
title = {Structure Discovery in Multi-modal Data: a Region-based Approach},
booktitle = {2011 IEEE International Conference on Robotics and Automation},
year = {2011},
month = {May},