Real-time 3D Scene Layout from a Single Image Using Convolutional Neural Networks

Shichao Yang, Daniel Maturana and Sebastian Scherer
Conference Paper, International Conference on Robotics and Automation (ICRA), May, 2016

Abstract

We consider the problem of understanding the 3D layout of indoor corridor scenes from a single image in real time. Identifying obstacles such as walls is essential for robot navigation, but it is also challenging because real-world corridor scenes vary widely in structure, appearance and illumination. Many current single-image methods make Manhattan-world assumptions and break down in environments that do not fit this mold. They may also require complicated hand-designed features for image segmentation, or clear boundaries to form certain building models. In addition, most cannot run in real time. In this paper, we propose to combine machine learning with geometric modelling to build a simplified 3D model from a single image. We first employ a supervised Convolutional Neural Network (CNN) to provide a dense, but coarse, geometric class labelling of the scene. We then refine this labelling with a fully connected Conditional Random Field (CRF). Finally, we fit line segments along wall-ground boundaries and “pop up” a 3D model using geometric constraints. We assemble a dataset of 967 labelled corridor images. Our experiments on this dataset and another publicly available dataset show that our method outperforms other single-image scene understanding methods in pixelwise accuracy while labelling images at over 15 Hz.
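
The final "pop up" step can be illustrated with a small geometric sketch. The abstract does not spell out the constraints the paper uses, so the code below is only an assumption-laden illustration, not the authors' implementation: it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a flat ground plane, and a level camera mounted at a known height `cam_height`. The function names `backproject_ground_pixel` and `popup_wall_segment`, and the `wall_height` parameter, are hypothetical. Under those assumptions, each endpoint of a fitted wall-ground boundary segment is back-projected onto the ground plane, and the wall is then extruded vertically from the resulting 3D edge.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]


def backproject_ground_pixel(u: float, v: float,
                             fx: float, fy: float, cx: float, cy: float,
                             cam_height: float) -> Point3:
    """Intersect the viewing ray of pixel (u, v) with the ground plane.

    Camera frame (assumed): x right, y down, z forward. With a level
    camera at height cam_height, the ground is the plane y = cam_height.
    """
    # Ray direction through the pixel, normalised so that z = 1.
    dx = (u - cx) / fx
    dy = (v - cy) / fy
    if dy <= 0:
        # Rays at or above the horizon never reach the ground.
        raise ValueError("pixel is at or above the horizon")
    t = cam_height / dy  # scale at which the ray reaches y = cam_height
    return (t * dx, cam_height, t)


def popup_wall_segment(p0: Point3, p1: Point3,
                       wall_height: float) -> List[Point3]:
    """Extrude a ground-boundary edge into a vertical wall quad.

    Returns the corners in order: base0, base1, top1, top0.
    Because y points down, the wall top has a smaller y value.
    """
    top0 = (p0[0], p0[1] - wall_height, p0[2])
    top1 = (p1[0], p1[1] - wall_height, p1[2])
    return [p0, p1, top1, top0]


if __name__ == "__main__":
    # Hypothetical intrinsics and two boundary-segment endpoints in pixels.
    fx = fy = 500.0
    cx, cy = 320.0, 240.0
    a = backproject_ground_pixel(120.0, 400.0, fx, fy, cx, cy, cam_height=1.0)
    b = backproject_ground_pixel(250.0, 300.0, fx, fy, cx, cy, cam_height=1.0)
    for corner in popup_wall_segment(a, b, wall_height=2.5):
        print(corner)
```

A tilted camera or a non-flat floor would need the plane equation expressed in the camera frame instead of the fixed `y = cam_height` used here.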

BibTeX Reference
@conference{Yang-2016-5499,
title = {Real-time 3D Scene Layout from a Single Image Using Convolutional Neural Networks},
author = {Shichao Yang and Daniel Maturana and Sebastian Scherer},
booktitle = {International Conference on Robotics and Automation (ICRA)},
month = {May},
year = {2016},
}