3D Face Geometry Capture Using Monocular Video - Robotics Institute Carnegie Mellon University

Master's Thesis, Tech. Report, CMU-RI-TR-19-47, Robotics Institute, Carnegie Mellon University, May 2019

Abstract

Accurate reconstruction of facial geometry is one of the oldest tasks in computer vision. Although the problem has long been studied, many modern methods fail to reconstruct realistic-looking faces or rely on highly constrained capture environments. High-fidelity face reconstruction has so far been limited to studio settings or expensive 3D scanners. Unconstrained reconstruction methods, on the other hand, are typically limited by low-capacity models. We aim to capture face geometry with high fidelity using just a single monocular video sequence of the face.

Our method reconstructs accurate face geometry of a subject from a video shot on a smartphone in an unconstrained environment. Our approach takes advantage of recent advances in visual SLAM, keypoint detection, and object detection to improve accuracy and robustness. Because they are not constrained to a model subspace, our reconstructed meshes capture important details while remaining robust to noise and topologically consistent. Our evaluations show that our method outperforms current single- and multi-view baselines by a significant margin, both in geometric accuracy and in capturing the person-specific details important for realistic-looking models.

To further current work on single- and multi-view 3D face reconstruction, we also propose a dataset of video sequences of individuals, with the specific goal of improving deep-learning-based reconstruction techniques using self-supervision as a training loss.

BibTeX

@mastersthesis{Agrawal-2019-116312,
author = {Shubham Agrawal},
title = {3D Face Geometry Capture Using Monocular Video},
year = {2019},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-47},
keywords = {3D, reconstruction, computer vision, geometry, face reconstruction},
}