cuMASM: Realtime Automatic Facial Landmarking using Active Shape Models on Graphics Processor Units

Nicholas Alexander Vandal

Master's Thesis, Tech. Report, CMU-RI-TR-11-16, May, 2011


Abstract

Automatic, robust, and accurate landmarking of dense sets of facial features is a key
component in face-based biometric identification systems. Among other uses, dense
landmarking is used to normalize raw faces for scale and to perform facial expression analysis,
and is an essential component for generating 3D face models from a single 2D
image. Active shape models (ASMs), which incorporate constrained statistical models
of shape with local texture models of each landmark, have been applied successfully to
this problem as well as landmarking tasks in other domains. Recent work has demonstrated
that Modified Active Shape Models (MASMs), which utilize improved subspace
models of 2D landmark neighborhoods, generalize better to unseen faces and to
real-world dynamic environments. This superior performance comes with a significant
computational cost, on the order of seconds per image to reach convergence. Compounded
with the time required for face detection on high-resolution images, robust
facial landmarking on the CPU is decidedly not realtime even for a well-optimized,
multithreaded C++ implementation. In this paper, we demonstrate realtime MASM facial
landmarking by parallelizing the algorithm on Graphics Processing Units (GPUs)
using the CUDA programming platform. Our GPU-based implementation is designed
for integration into a larger face recognition routine and is able to accept updated model
parameters without recompilation or re-synthesis. Unlike previous GPU-based ASM
implementations, which parallelize the original ASM algorithm utilizing 1D profiles,
we implement the 2D subspace-modeled profile searching of the more robust MASM
technique. We report GPU speedups of 24X over single-threaded CPU implementations
of MASM and approximately 12X over an 8-threaded CPU implementation. By
leveraging this untapped source of computational power, we are able to achieve realtime
frame rates of approximately 20 FPS using a 79-point landmarking scheme. We
discuss parallelizing the facial landmarking fitting process, specific GPU implementation
details, GPU architecture-specific optimizations required to take advantage of the
underlying hardware, and general CUDA programming concepts.
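
The "constrained statistical models of shape" mentioned in the abstract can be illustrated with the shape-constraint step of a classical ASM. The following is a minimal sketch in plain Python, not the thesis's CUDA implementation: the function name `constrain_shape`, the toy data, and the clamp factor `k = 3.0` (a common textbook choice) are all assumptions made for illustration. It projects a candidate shape onto orthonormal PCA modes, clamps each coefficient to plus or minus `k` standard deviations, and reconstructs the shape.

```python
import math

def constrain_shape(shape, mean, modes, eigenvalues, k=3.0):
    """Project a shape onto PCA modes and clamp each coefficient.

    shape, mean : flat coordinate lists [x0, y0, x1, y1, ...]
    modes       : orthonormal eigenvectors, one list per mode (assumed)
    eigenvalues : variance of each mode
    k           : clamp limit in standard deviations (assumed value)
    """
    diff = [s - m for s, m in zip(shape, mean)]
    coeffs = []
    for mode, lam in zip(modes, eigenvalues):
        # Coefficient of this mode: dot product with the mean-centred shape.
        b = sum(p * d for p, d in zip(mode, diff))
        limit = k * math.sqrt(lam)
        coeffs.append(max(-limit, min(limit, b)))
    # Rebuild the shape from the mean plus the clamped mode contributions.
    constrained = list(mean)
    for mode, b in zip(modes, coeffs):
        for i, p in enumerate(mode):
            constrained[i] += b * p
    return constrained, coeffs

# Toy data (hypothetical): two landmarks, one mode that moves x0 only.
mean = [0.0, 0.0, 1.0, 0.0]
modes = [[1.0, 0.0, 0.0, 0.0]]
eigenvalues = [4.0]
shape = [10.0, 0.5, 1.0, 0.0]

constrained, coeffs = constrain_shape(shape, mean, modes, eigenvalues)
print(coeffs)       # clamped mode coefficients
print(constrained)  # shape pulled back into the plausible range
```

In this toy case the raw coefficient of 10 exceeds the limit of 3 * sqrt(4) = 6 and is clamped, while the 0.5 offset in `y0` is discarded because no mode spans it. In a full ASM this step alternates with the per-landmark local search, and that search is the part the abstract describes as parallelized on the GPU.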

BibTeX

@mastersthesis{Vandal-2011-112696,
author = {Nicholas Alexander Vandal},
title = {cuMASM: Realtime Automatic Facial Landmarking using Active Shape Models on Graphics Processor Units},
year = {2011},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-11-16},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.