High Precision Image Matching and Shape Recovery - Robotics Institute Carnegie Mellon University

High Precision Image Matching and Shape Recovery

PhD Thesis, Tech. Report, CMU-RI-TR-95-35, Robotics Institute, Carnegie Mellon University, September, 1995

Abstract

In this thesis, we examine several depth and shape recovery methods from a methodological point of view. Specifically, we concentrate on approaches that require only very limited and realistic assumptions: depth from stereo, depth from focus, and structure from motion. All of them start with image matching as the first step and then map the matching parameters into depth or shape information. First, we examine the traditional Gabor filter-based approach and propose a general criterion to detect situations where the filter outputs are severely contaminated by window effects. By eliminating the contaminated information, we can improve the precision of image matching. We then analyze the origins of window effects and propose "moment" filters. Because moment filters are recursive in both the spatial and Fourier domains, we can model window effects and foreshortening effects analytically and achieve much higher precision than with traditional approaches. Finally, we go beyond moment filters to propose "hypergeometric" filters which, in addition to modeling window and foreshortening effects, sample the frequency domain canonically. We also look into the classic problem of structure from motion, which recovers 3D structural information from optical flows resulting from unknown camera motions. We describe a structure-from-motion system based on the Extended Kalman Filter (EKF) that is capable of incrementally generating dense 3D depth maps from an optical flow sequence. Most traditional feature-based approaches cannot be extended to compute dense structure because their computational complexity is impractical. We demonstrate that, by decomposing uncertainty information into independent and correlated parts, we can decrease the complexity from O(N^2) to O(N), where N is the number of pixels in the images.
We also show that this dense structure-from-motion system requires only local optical flows, i.e., image matching between two adjacent frames, rather than tracking features over a long sequence of frames. We demonstrate the system on real image sequences.
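
As a rough illustration of the traditional Gabor filter-based matching that the abstract takes as its starting point, the sketch below estimates a 1D shift from the phase difference of two complex Gabor responses. It is the classical phase-difference technique under simplifying assumptions, not the moment or hypergeometric filters proposed in the thesis, and it ignores the window and foreshortening effects the thesis sets out to model. The function names, the test signal, and the parameter values (omega, sigma) are all hypothetical choices made for this example.

```python
import numpy as np

def gabor_response(signal, center, omega, sigma):
    """Complex Gabor response of a 1D signal at one location (illustrative helper)."""
    x = np.arange(len(signal)) - center
    # Gaussian window times a complex exponential tuned to frequency omega.
    kernel = np.exp(-x**2 / (2.0 * sigma**2)) * np.exp(-1j * omega * x)
    return np.sum(signal * kernel)

def phase_disparity(left, right, center, omega, sigma):
    """Estimate the shift d such that right(x) ~ left(x - d), from the phase difference."""
    r_left = gabor_response(left, center, omega, sigma)
    r_right = gabor_response(right, center, omega, sigma)
    # A shift by d multiplies the response by roughly exp(-1j * omega * d),
    # so the phase of r_left * conj(r_right) is roughly omega * d.
    dphi = np.angle(r_left * np.conj(r_right))
    return dphi / omega

# Synthetic test: a sinusoid at the filter's tuning frequency, shifted by 1.5 pixels.
omega, sigma, true_shift = 0.5, 8.0, 1.5
x = np.arange(128, dtype=float)
left = np.cos(omega * x + 0.3)
right = np.cos(omega * (x - true_shift) + 0.3)

estimate = phase_disparity(left, right, center=64, omega=omega, sigma=sigma)
print(f"estimated shift: {estimate:.4f} pixels")
```

The estimate is only unambiguous while omega times the shift stays within (-pi, pi]. On real images the Gaussian window covers slightly different content in the two views, and that mismatch is one source of the window effects discussed above.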

BibTeX

@phdthesis{Xiong-1995-13989,
author = {Yalin Xiong},
title = {High Precision Image Matching and Shape Recovery},
year = {1995},
month = {September},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-95-35},
}