Occlusion Boundaries: Low-Level Detection to High-Level Reasoning

Andrew Stein
doctoral dissertation, tech. report CMU-RI-TR-08-06, Robotics Institute, Carnegie Mellon University, May, 2008


The boundaries of objects in an image are often considered a nuisance to be "handled" due to the occlusion they exhibit. Since most, if not all, computer vision techniques aggregate information spatially within a scene, information spanning these boundaries, and therefore from different physical surfaces, is invariably and erroneously considered together. In addition, these boundaries convey important perceptual information about 3D scene structure and shape. Consequently, their identification can benefit many different computer vision pursuits, from low-level processing techniques to high-level reasoning tasks.

While much focus in computer vision is placed on the processing of individual, static images, many applications actually offer video, or sequences of images, as input. The extra temporal dimension of the data allows the motion of the camera or the scene to be used in processing. In this thesis, we focus on the exploitation of subtle relative-motion cues present at occlusion boundaries. When combined with more standard appearance information, we demonstrate these cues' utility in detecting occlusion boundaries locally. We also present a novel, mid-level model for reasoning more globally about object boundaries and propagating such local information to extract improved, extended boundaries.

Building on these methods, we also demonstrate enhancement of two high-level vision tasks by incorporating boundary information. First, we employ boundary fragments to suggest multiple "hints" of a scene segmentation and then use these suggestions collectively to achieve more consistent and parsimonious delineation of generic whole objects. Second, we augment a popular feature-based recognition technique for specific objects (the Scale Invariant Feature Transform) with boundary information in order to yield a method more robust to changes in background and scale.

This thesis thus contributes to research on occlusion at several levels, from low-level motion estimation and feature extraction; to mid-level reasoning, classification, and propagation; and finally to high-level segmentation and recognition. In addition, a new video dataset is presented to enable further research in this area.

Keywords: occlusion, motion boundaries, object segmentation, occlusion detection, motion segmentation, object recognition


Text Reference
Andrew Stein, "Occlusion Boundaries: Low-Level Detection to High-Level Reasoning," doctoral dissertation, tech. report CMU-RI-TR-08-06, Robotics Institute, Carnegie Mellon University, May, 2008

BibTeX Reference
   author = "Andrew Stein",
   title = "Occlusion Boundaries: Low-Level Detection to High-Level Reasoning",
   booktitle = "",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "May",
   year = "2008",
   number = "CMU-RI-TR-08-06",
   address = "Pittsburgh, PA",