Volumetric Correspondence Networks for Stereo Matching and Optical Flow

Gengshan Yang
Master's Thesis, Tech. Report, CMU-RI-TR-19-44, June, 2019

Abstract

Many classic tasks in vision, such as the estimation of optical flow or stereo disparities, can be cast as dense correspondence matching. Well-known techniques for doing so make use of a cost volume, which is typically a 3D (stereo) or 4D (optical flow) tensor of match costs between all pixels in a 2D image and their potential matches in a 1D or 2D search window. However, building and processing such a volume typically requires significant amounts of memory and compute, leading to various limitations in practice. In this thesis, we investigate efficient and effective ways of incorporating cost volume processing into deep neural networks for correspondence tasks. In particular, we show that 1) for stereo matching on high-resolution images, 3D cost volumes can be efficiently filtered in a coarse-to-fine manner, and 2) for optical flow estimation, “true” 4D volumetric processing can be effectively utilized to improve the model’s accuracy as well as its generalization ability. As a result, our stereo algorithm achieves state-of-the-art performance on the high-resolution Middlebury benchmark, and our optical flow algorithm is the leading entry among two-frame methods on KITTI as well as MPI Sintel.
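
To make the notion of a cost volume concrete, the following is a minimal NumPy sketch of a 3D stereo cost volume built from raw absolute intensity differences, followed by a winner-take-all disparity read-out. It is an illustration only, not the method of the thesis, which processes cost volumes inside deep networks. The function names `stereo_cost_volume` and `winner_take_all`, the absolute-difference cost, and the toy images are all assumptions made for this example.

```python
import numpy as np


def stereo_cost_volume(left, right, max_disp):
    """Build a (max_disp, H, W) volume of absolute-difference match costs.

    cost[d, y, x] = |left[y, x] - right[y, x - d]|. Entries whose candidate
    match would fall outside the right image are left at +inf.
    (Hypothetical helper for illustration; not taken from the thesis.)
    """
    h, w = left.shape
    cost = np.full((max_disp, h, w), np.inf, dtype=np.float32)
    for d in range(max_disp):
        # Compare left pixels at column x with right pixels at column x - d.
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, : w - d])
    return cost


def winner_take_all(cost):
    """Pick, per pixel, the disparity with the lowest match cost."""
    return np.argmin(cost, axis=0)


# Toy example: the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right = rng.random((4, 16)).astype(np.float32)
true_disp = 3
left = np.roll(right, true_disp, axis=1)  # left[y, x] = right[y, x - 3]

cost = stereo_cost_volume(left, right, max_disp=6)
disp = winner_take_all(cost)
```

The volume stores one cost per pixel per candidate disparity, so its size grows as H × W × D. The optical flow analogue searches a 2D window and adds a further axis, which is the source of the memory and compute cost noted in the abstract.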


@mastersthesis{Yang-2019-116296,
author = {Gengshan Yang},
title = {Volumetric Correspondence Networks for Stereo Matching and Optical Flow},
year = {2019},
month = {June},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-44},
keywords = {Correspondence matching, Deep learning, Cost volume processing, Coarse-to-fine, High-resolution image processing},
}