Visual Hull Construction, Alignment and Refinement for Human Kinematic Modeling, Motion Tracking and Rendering

PhD Thesis, Tech. Report, CMU-RI-TR-03-44, Robotics Institute, Carnegie Mellon University, October, 2003


Abstract

The abilities to build precise human kinematic models and to perform accurate human motion tracking are essential in a wide variety of applications such as ergonomic design, biometrics, anthropological studies, entertainment, human-computer interfaces for intelligent environments, and surveillance. Due to the complexity of the human body and the problem of self-occlusion, modeling and tracking humans using cameras are challenging tasks. In this thesis, we develop algorithms to perform these two tasks based on the shape estimation method Shape-From-Silhouette (SFS), which constructs a shape estimate (known as the Visual Hull) of an object from its silhouette images.

In the first half of this thesis we extend the traditional SFS algorithm so that it can be used effectively for human kinematic modeling and motion tracking. Though popular and easy to implement, traditional SFS has two serious disadvantages which greatly limit its use in human-related applications. First, SFS involves time-consuming testing steps which make it inefficient in real-time applications. Second, building detailed human body models using SFS is difficult unless we use a large number of cameras, because a Visual Hull built from a small number of silhouette images is coarse. We address the first problem by proposing a fast testing/projection algorithm for voxel-based SFS algorithms. To deal with the second problem, we combine silhouette information over time to effectively increase the number of cameras without physically adding new cameras. We first propose a new Visual Hull representation called Bounding Edges. We then analyze the ambiguity problem of aligning two Visual Hulls. Based on the analysis, we develop an algorithm to align Visual Hulls over time using stereo and an important property of the Shape-From-Silhouette principle. This temporal SFS algorithm combines both geometric constraints and photometric consistency to align Colored Surface Points of the object extracted from the silhouette and color images. Once the Visual Hulls are aligned, they are refined by compensating for the motion of the object. The algorithm is developed for both rigid and articulated objects.

In the second half of this thesis we show how the improved SFS algorithms are used to perform the tasks of human modeling and motion tracking. First we build a system to acquire human kinematic models consisting of precise shape (constructed using the rigid object temporal SFS algorithm) and joint locations (estimated using the SFS algorithm for articulated objects). Once the kinematic models are built, they are used to track the motion of the person in new video sequences. The tracking algorithm is based on the Visual Hull alignment idea used in the temporal SFS algorithms. Finally we demonstrate how the kinematic model and the tracked motion data can be used for image-based rendering and motion transfer between two people.
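
For readers unfamiliar with voxel-based SFS, the following is a minimal sketch of the basic, unoptimized carving test that the abstract calls "traditional SFS": a voxel is kept only if its center projects inside every silhouette. It is not the fast testing/projection algorithm proposed in the thesis, and it does not cover Bounding Edges or temporal alignment. The function names (`project`, `carve_visual_hull`), the 3x4 camera matrices, and the toy 4x4 silhouettes are all assumptions made for illustration.

```python
def project(P, X):
    """Project 3D point X with a 3x4 camera matrix P; return pixel (u, v)."""
    x, y, z = X
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    return h[0] / h[2], h[1] / h[2]


def carve_visual_hull(voxel_centers, cameras, silhouettes):
    """Keep the voxels whose centers fall inside every silhouette mask."""
    hull = []
    for X in voxel_centers:
        inside_all = True
        for P, sil in zip(cameras, silhouettes):
            u, v = project(P, X)
            col, row = int(round(u)), int(round(v))
            in_image = 0 <= row < len(sil) and 0 <= col < len(sil[0])
            if not (in_image and sil[row][col]):
                inside_all = False  # one view rejects the voxel: carve it
                break
        if inside_all:
            hull.append(X)
    return hull


# Toy setup (hypothetical): a 4x4x4 voxel grid and two orthographic views.
voxel_centers = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]

# Each mask is 1 on the central 2x2 block of a 4x4 image, 0 elsewhere.
mask = [[1 if 1 <= r <= 2 and 1 <= c <= 2 else 0 for c in range(4)]
        for r in range(4)]

cam_along_z = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # u = x, v = y
cam_along_x = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # u = y, v = z

hull = carve_visual_hull(voxel_centers, [cam_along_z, cam_along_x], [mask, mask])
print(len(hull))
```

Every voxel is tested against every view in this sketch, which illustrates why the testing step becomes costly for fine grids and many cameras, and why only two views yield a coarse, box-like hull.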

BibTeX

@phdthesis{Cheung-2003-8777,
author = {Kong Man Cheung},
title = {Visual Hull Construction, Alignment and Refinement for Human Kinematic Modeling, Motion Tracking and Rendering},
year = {2003},
month = {October},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-03-44},
keywords = {Temporal Shape-From-Silhouette, Visual Hull Alignment, Human Kinematic Modeling, Markerless Motion Tracking, Motion Rendering and Transfer},
}
