Carnegie Mellon Robotics Institute
My research area is computer vision, with an emphasis on video understanding, multi-image stereo, and 3D site modeling. The underlying theme of my work has been understanding the geometry of vision.
Video Surveillance and Monitoring: The DARPA Video Surveillance and Monitoring (VSAM) project is currently my main focus of research. We are developing automated video understanding technology that will enable a single human operator to monitor activities over a large, complex area using a distributed network of video sensors. Sample applications include building and parking lot security, monitoring peace treaties using unmanned air vehicles, and performing military reconnaissance on the battlefield. Video is a challenging medium: each frame has relatively poor resolution, and new frames stream in 30 times per second. Our research topics include calibrating a large outdoor network of active video sensors; automatic detection, classification, and tracking of people and vehicles; hand-off of active tracking control between sensors with disparate viewpoints; long-term scene monitoring for significant trigger events such as break-ins or loitering; object geolocation using model-based methods and wide-baseline stereo; and full 3D visualization of scene activities using a distributed simulation software package.
Multi-Image Stereo: Determining feature correspondence across multiple, widely spaced views is a difficult problem. Treating it as a problem of generalized, multi-image stereo, I have developed the space-sweep stereo method for efficiently bringing local image patches from potential multi-image correspondences into proximity, where they can be tested for compatibility. All possible multi-image correspondences are generated and tested by a geometric algorithm that sweeps a virtual planar surface through the scene, in a manner closely related to the active vision method called zero-disparity filtering. Furthermore, I have recently shown that a space-sweep stereo algorithm can scan volumes of the scene to determine whether they contain a statistically significant number of structural edges, without first performing precise reconstruction of those edges. This result enables us to develop a rapid focus-of-attention mechanism for locating areas of the scene that contain a significant amount of 3D structure.
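The plane-sweeping idea can be illustrated with a minimal Python sketch. It is not the implementation described above, only a simplified illustration under several assumptions: each camera is given as a 3x4 projection matrix, image features are 2D edge points, the sweeping plane is horizontal (constant z), and compatibility is approximated by counting how many views place a feature in the same plane cell. The names `plane_homography` and `sweep_votes` and all parameters are hypothetical.

```python
import numpy as np


def plane_homography(P, z):
    """3x3 map from plane coordinates (X, Y, 1) at height z to image pixels.

    For a 3x4 projection matrix P with columns p1..p4, a point (X, Y, z, 1)
    projects to X*p1 + Y*p2 + (z*p3 + p4).
    """
    return np.column_stack((P[:, 0], P[:, 1], z * P[:, 2] + P[:, 3]))


def sweep_votes(cameras, edge_points, z_values, xy_min, xy_max, cell):
    """Sweep a plane through z_values and count, per plane cell, how many
    views backproject at least one edge point into that cell.

    cameras:     list of 3x4 projection matrices (assumed known)
    edge_points: list of (N_i, 2) arrays of pixel coordinates, one per camera
    Returns an integer array of shape (len(z_values), ny, nx).
    """
    nx = int(np.ceil((xy_max[0] - xy_min[0]) / cell))
    ny = int(np.ceil((xy_max[1] - xy_min[1]) / cell))
    votes = np.zeros((len(z_values), ny, nx), dtype=int)

    for k, z in enumerate(z_values):
        for P, pts in zip(cameras, edge_points):
            # Backproject pixels onto the plane by inverting the homography.
            H_inv = np.linalg.inv(plane_homography(P, z))
            homog = np.column_stack((pts, np.ones(len(pts))))
            plane = homog @ H_inv.T
            xy = plane[:, :2] / plane[:, 2:3]

            ix = np.floor((xy[:, 0] - xy_min[0]) / cell).astype(int)
            iy = np.floor((xy[:, 1] - xy_min[1]) / cell).astype(int)
            ok = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
            if not ok.any():
                continue

            # Each view contributes at most one vote per cell.
            cells = np.unique(np.column_stack((iy[ok], ix[ok])), axis=0)
            votes[k, cells[:, 0], cells[:, 1]] += 1

    return votes
```

Cells whose vote count reaches a chosen number of views (for example, `votes >= 3`) mark plane positions where features from several images coincide. The threshold here is an arbitrary illustration; the statistical significance test mentioned above is not reproduced in this sketch.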
Aerial Site Modeling: In recent years there has been an enormous increase in the availability of high-resolution aerial imagery from airborne and satellite sensing systems devoted to mapping, reconnaissance, and earth-resource management. Concurrently, there has been a growing need to produce site models that include man-made cartographic features such as buildings and roads. Under the DARPA RADIUS project, my colleagues and I developed an image understanding system called ASCENDER that can automatically extract building models from a set of images to produce a realistic, texture-mapped, 3D rendition of the site. We learned several lessons from this project. First, 3D building reconstruction should be based on geometric features such as line segments that remain stable under a wide range of viewing and lighting conditions. Second, use of rigorous photogrammetric camera models is essential for combining information from diverse sensor modalities and viewpoints. Finally, we learned that the efficiency and reliability of the building reconstruction process can be greatly increased by introducing collateral site information in the form of digital elevation maps and generic building model constraints such as coplanarity and perpendicularity of roof edges.
Research Interest Keywords: computer vision, stereo vision
The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University.