Volumetric Correspondence Networks for Stereo Matching and Optical Flow

Master's Thesis, Tech. Report, CMU-RI-TR-19-44, Robotics Institute, Carnegie Mellon University, June, 2019

Abstract

Many classic tasks in vision, such as the estimation of optical flow or stereo disparities, can be cast as dense correspondence matching. Well-known techniques for doing so make use of a cost volume, which is typically a 3D/4D tensor of match costs between all pixels in a 2D image and their potential matches in a 1D/2D search window. However, building and processing a cost volume typically requires significant amounts of memory and compute, which limits its use in practice. In this thesis, we investigate efficient and effective ways of incorporating cost volume processing into deep neural networks for correspondence tasks. In particular, we show that 1) for stereo matching on high-resolution images, 3D cost volumes can be efficiently filtered in a coarse-to-fine manner, and 2) for optical flow estimation, "true" 4D volumetric processing can be effectively utilized to improve a model's accuracy as well as its generalization ability. As a result, our stereo algorithm achieves state-of-the-art performance on the high-resolution Middlebury benchmark, and our optical flow algorithm is the leading entry among two-frame methods on KITTI as well as MPI Sintel.
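
To make the notion of a cost volume concrete, the following is a minimal NumPy sketch of a 3D stereo cost volume built over a 1D horizontal search window. It is an illustration only and not the thesis's network: the absolute-difference cost, the function names `stereo_cost_volume` and `winner_take_all`, and the use of single-channel images are all assumptions made for this example.

```python
import numpy as np

def stereo_cost_volume(left, right, max_disp):
    """Build a (max_disp, H, W) volume of absolute-difference match costs.

    cost[d, y, x] compares left[y, x] with right[y, x - d].
    Entries where x - d falls outside the image are left at +inf.
    (Illustrative cost only; learned features are not modeled here.)
    """
    h, w = left.shape
    cost = np.full((max_disp, h, w), np.inf, dtype=np.float32)
    for d in range(max_disp):
        # Shift the right image by d pixels and compare against the left image.
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    return cost

def winner_take_all(cost):
    """Pick, for every pixel, the disparity with the lowest match cost."""
    return np.argmin(cost, axis=0)
```

The volume holds H x W x D entries for stereo with D candidate disparities. The optical flow analogue with a 2D search window of radius r holds H x W x (2r+1)^2 entries, which is why memory and compute grow quickly at high resolution.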

BibTeX

@mastersthesis{Yang-2019-116296,
author = {Gengshan Yang},
title = {Volumetric Correspondence Networks for Stereo Matching and Optical Flow},
year = {2019},
month = {June},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-44},
keywords = {Correspondence matching, Deep learning, Cost volume processing, Coarse-to-fine, High-resolution image processing},
}