PhD Thesis Defense
Carnegie Mellon University
3:30 pm to 5:30 pm
Realistic human avatars play a key role in immersive virtual telepresence. To reach a high level of realism, an avatar must faithfully reflect human appearance; it should also be drivable and express natural motion. Existing work has made significant progress in building drivable, realistic face avatars, but rarely includes realistic dynamic hair despite its importance to human appearance. In pursuit of drivable, realistic human avatars with dynamic hair, we focus on the problem of automatically capturing and animating hair from multiview videos.
We first look into the problem of capturing the motion of the head with near-static hair. Because hair has complex geometry, we use a neural volumetric representation that can be rendered efficiently and photorealistically. To learn such a representation, we employ an 'analysis-by-synthesis' strategy: differentiable volumetric rendering lets us optimize the representation with gradients of a reconstruction loss on 2D images.
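To make the rendering step concrete, the following is a minimal sketch of front-to-back compositing along a single camera ray, the standard building block of volumetric rendering, together with a squared-error reconstruction loss against an observed pixel. It is an illustration under stated assumptions, not the representation or renderer used in the thesis: the function names, the uniform step size, and the per-sample density and color inputs are all hypothetical. In practice the same computation would be written in an automatic differentiation framework so that the loss gradient can flow back to the representation.

```python
import math

def composite_ray(densities, colors, step):
    """Front-to-back compositing of samples along one ray (illustrative only).

    densities: non-negative density at each sample (hypothetical input)
    colors:    (r, g, b) tuple at each sample
    step:      assumed uniform distance between consecutive samples
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)   # opacity of this segment
        weight = transmittance * alpha          # contribution to the pixel
        pixel = [p + weight * c for p, c in zip(pixel, rgb)]
        transmittance *= 1.0 - alpha
    return pixel, transmittance

def reconstruction_loss(rendered, observed):
    """Squared error between a rendered pixel and the observed pixel."""
    return sum((r - o) ** 2 for r, o in zip(rendered, observed))
```

Every operation in the loop is smooth in the densities and colors, which is what makes a renderer of this kind differentiable and lets a 2D image loss supervise a 3D representation.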
We then extend the problem to capturing hair with dynamics. To accommodate the complexity introduced by the temporal dimension, we leverage data priors on motion, such as optical flow and point flow, as additional supervision. Specifically, we first track hair strands using a data prior on motion. We then attach volumetric primitives to the tracked hair strands to learn fine-level appearance and geometry via differentiable rendering. We further design a differentiable volumetric rendering algorithm that uses optical flow to ensure temporal smoothness at a fine level.
We then address the problem of building a hair dynamic model for animation. In contrast to the previous two problems, which focus on 3D/4D reconstruction, the main difficulty here lies in generating novel animation. We present a two-stage pipeline to build a hair dynamic model in a data-driven manner. The first stage performs hair state compression using an autoencoder-as-a-tracker strategy. The second stage learns a hair dynamic model in a supervised manner using the hair state data from the first stage. The hair dynamic model is designed to perform hair state transitions conditioned on head motion and the head-relative gravity direction.
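As a small illustration of the head-relative gravity direction mentioned above, the sketch below expresses a world-space gravity vector in the head's local coordinate frame by applying the transpose of the head-to-world rotation. The function name, the row-major 3x3 rotation format, and the choice of the world y-axis as "up" are assumptions made for this example; the abstract does not specify the thesis's conventions.

```python
def gravity_in_head_frame(head_to_world, gravity_world=(0.0, -1.0, 0.0)):
    """Return the gravity direction expressed in head coordinates.

    head_to_world: 3x3 rotation matrix (list of rows) mapping head-frame
                   vectors to world-frame vectors (assumed convention).
    gravity_world: unit gravity direction in world coordinates; the default
                   assumes the world y-axis points up.
    """
    # For a rotation matrix the inverse is the transpose, so
    # g_head = R^T g_world, i.e. column c of R dotted with g_world.
    return [
        sum(head_to_world[r][c] * gravity_world[r] for r in range(3))
        for c in range(3)
    ]

# Head rotated 90 degrees about the world z-axis: its local x-axis now
# points along world "up", so gravity points along the head's -x axis.
tilted = [[0.0, -1.0, 0.0],
          [1.0,  0.0, 0.0],
          [0.0,  0.0, 1.0]]
g_head = gravity_in_head_frame(tilted)
```

Conditioning on gravity in the head frame rather than the world frame means the same head pose relative to gravity yields the same input regardless of where the subject stands or faces.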
In parallel with capturing and animating specific hairstyles, we explore the problem of efficiently capturing diverse hair appearances. Hair plays a significant role in personal identity, and the efficient creation of personalized avatars with high-quality hair is essential for individual use. To handle the large intra-class variance in hair appearance and geometry, we present a universal hair appearance model that focuses on the similarity between different hairstyles within a local region. The model takes 3D-aligned features as input and learns a unified manifold of local hair appearance that adaptively generates appearance for hairstyles with diverse topologies.
Thesis Committee Members:
Jessica Hodgins, Chair
Fernando De La Torre
Michael Zollhoefer, Reality Labs Research
Kalyan Sunkavalli, Adobe Research