3D Face Geometry Capture Using Monocular Video

Shubham Agrawal
Master's Thesis, Tech. Report, CMU-RI-TR-19-47, May, 2019

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Accurate reconstruction of facial geometry is one of the oldest tasks in computer vision. Although the problem has long been studied, many modern methods fail to reconstruct realistic-looking faces or rely on highly constrained capture environments. High-fidelity face reconstruction has so far been limited to studio settings or expensive 3D scanners. Unconstrained reconstruction methods, on the other hand, are typically limited by low-capacity models. We aim to capture face geometry with high fidelity using just a single monocular video sequence of the face.

Our method reconstructs accurate face geometry of a subject from a video shot on a smartphone in an unconstrained environment. Our approach takes advantage of recent advances in visual SLAM, keypoint detection, and object detection to improve accuracy and robustness. Because they are not constrained to a model subspace, our reconstructed meshes capture important details while remaining robust to noise and topologically consistent. Our evaluations show that our method outperforms current single-view and multi-view baselines by a significant margin, both in geometric accuracy and in capturing the person-specific details that are important for making realistic-looking models.

To further current work on single-view and multi-view 3D face reconstruction, we also propose a dataset of video sequences of individuals, with the specific goal of improving deep-learning-based reconstruction techniques that use self-supervision as a training loss.

author = {Shubham Agrawal},
title = {3D Face Geometry Capture Using Monocular Video},
year = {2019},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-47},
keywords = {3D, reconstruction, computer vision, geometry, face reconstruction},
}