Multi-Scale Convolutional Architecture for Semantic Segmentation

Aman Raj, Daniel Maturana and Sebastian Scherer
Tech. Report, CMU-RI-TR-15-21, Robotics Institute, Carnegie Mellon University, pp. 14, October, 2015

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.


Advances in 3D sensing technologies have made RGB and depth information more readily available, which can greatly assist in the semantic segmentation of 2D scenes. Many works in the literature perform semantic segmentation of such scenes, but few address environments with a high degree of clutter, such as indoor scenes. In this paper, we explore the use of depth information together with RGB and a deep convolutional network for indoor scene understanding through semantic labeling. Our work exploits the geocentric encoding of a depth image and uses a multi-scale deep convolutional neural network architecture that captures high- and low-level features of a scene to generate rich semantic labels. We apply our method to indoor RGBD images from the NYUD2 dataset and achieve 70.45% accuracy in labeling four object classes, which is competitive with some prior approaches. The results show that our system can generate a pixel map directly from an input image, where each pixel value corresponds to a particular object class.
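
To illustrate what a per-pixel class map and a labeling accuracy figure mean in practice, here is a minimal NumPy sketch. It is not the report's implementation: the function names, the toy scores, and the choice of plain pixel accuracy (fraction of correctly labeled pixels) as the metric are assumptions made for illustration only, and the report may define accuracy differently.

```python
import numpy as np

NUM_CLASSES = 4  # the abstract reports results on four object classes


def scores_to_label_map(scores):
    """Turn per-class scores of shape (num_classes, H, W) into an (H, W)
    label map by picking the highest-scoring class at each pixel."""
    return np.argmax(scores, axis=0)


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth.
    Assumed metric; the report may use a different definition."""
    if pred.shape != target.shape:
        raise ValueError("prediction and target must have the same shape")
    return float(np.mean(pred == target))


# Hypothetical network output for a 2x2 image: one winning class per pixel.
scores = np.zeros((NUM_CLASSES, 2, 2))
scores[0, 0, 0] = 1.0  # pixel (0, 0) scores highest for class 0
scores[1, 0, 1] = 1.0  # pixel (0, 1) scores highest for class 1
scores[2, 1, 0] = 1.0  # pixel (1, 0) scores highest for class 2
scores[3, 1, 1] = 1.0  # pixel (1, 1) scores highest for class 3

pred = scores_to_label_map(scores)
target = np.array([[0, 1], [2, 0]])  # made-up ground truth labels
acc = pixel_accuracy(pred, target)
print(pred)
print(acc)
```

In this toy case three of the four pixels match the made-up ground truth, so the accuracy is 0.75. A real evaluation would average over all labeled pixels in the test images.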

BibTeX Reference
title = {Multi-Scale Convolutional Architecture for Semantic Segmentation},
author = {Aman Raj and Daniel Maturana and Sebastian Scherer},
school = {Robotics Institute, Carnegie Mellon University},
month = {October},
year = {2015},
pages = {14},
number = {CMU-RI-TR-15-21},
address = {Pittsburgh, PA},