Automatic Recognition of Facial Expressions Using Hidden Markov Models and Estimation of Expression Intensity - Robotics Institute Carnegie Mellon University

Jenn-Jier James Lien
Miscellaneous, PhD Thesis, CMU-RI-TR-98-31, Electrical Engineering, University of Pittsburgh, April, 1998

Abstract

Facial expressions provide sensitive cues about emotional responses and play a major role in the study of psychological phenomena and the development of nonverbal communication. Facial expressions regulate social behavior, signal communicative intent, and are related to speech production. Most facial expression recognition systems focus on only six basic expressions. In everyday life, however, these six basic expressions occur relatively infrequently, and emotion or intent is more often communicated by subtle changes in one or two discrete features, such as tightening of the lips, which may communicate anger. Humans are capable of producing thousands of expressions that vary in complexity, intensity, and meaning. The objective of this dissertation is to develop a computer vision system, including both facial feature extraction and recognition, that automatically discriminates among subtly different facial expressions based on Facial Action Coding System (FACS) action units (AUs) using Hidden Markov Models (HMMs). Three methods are developed to extract facial expression information for automatic recognition. The first method is facial feature point tracking using the coarse-to-fine pyramid method, which can be sensitive to subtle feature motion and can handle large displacements with subpixel accuracy. The second is dense flow tracking together with principal component analysis, where the entire facial motion information per frame is compressed to a low-dimensional weight vector for discrimination. The third is high gradient component (i.e., furrow) analysis in the spatio-temporal domain, which exploits the transient variance associated with the facial expression. Upon extraction of the facial information, non-rigid facial expressions are separated from the rigid head motion components, and the face images are automatically aligned and normalized using an affine transformation.
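The affine alignment step mentioned above can be illustrated with a small sketch. It is not the thesis implementation: it assumes, purely for illustration, that three non-collinear reference landmarks are available in each frame and in a canonical face layout, and the function names `fit_affine` and `apply_affine` are hypothetical. It solves for the six affine parameters from the three correspondences and then maps any tracked point into the normalized frame.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(a, b):
    """Solve the 3x3 linear system a * x = b with Cramer's rule."""
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("landmarks are collinear; affine map is undefined")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(_det3(m) / d)
    return solution


def fit_affine(src, dst):
    """Fit the 2x3 affine map sending each src[i] to dst[i].

    src, dst: three (x, y) landmark pairs (an assumption of this sketch).
    Returns the two rows ((a, b, tx), (c, d, ty)) of the affine matrix.
    """
    a = [[x, y, 1.0] for x, y in src]
    row_x = _solve3(a, [p[0] for p in dst])
    row_y = _solve3(a, [p[1] for p in dst])
    return tuple(row_x), tuple(row_y)


def apply_affine(affine, point):
    """Map one (x, y) point through the fitted affine transformation."""
    (a, b, tx), (c, d, ty) = affine
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)
```

Once fitted on the reference landmarks, the same map would be applied to every tracked feature point in the frame, so that the remaining displacement reflects expression change under this simplified rigid-motion model.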
The resulting motion vector sequence is vector quantized to provide input to an HMM-based classifier, which addresses the time warping problem. A method is developed for determining the optimal HMM topology for our recognition system. The system also provides expression intensity estimation, which has a significant effect on the actual meaning of the expression. We have studied more than 400 image sequences obtained from 90 subjects. The experimental results of our trained system showed an overall recognition accuracy of 87%, and also 87% in distinguishing among sets of three and six subtly different facial expressions for upper and lower facial regions, respectively.
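The recognition stage described above (vector quantization followed by HMM classification) can be sketched as follows. This is a minimal illustration, not the system from the thesis: the codebook, the model parameters, and the function names `quantize`, `forward_log_likelihood`, and `classify` are all assumptions. Each motion vector is replaced by the index of its nearest codeword, and the resulting symbol sequence is scored against one discrete HMM per expression with the scaled forward algorithm. Because the HMM scores sequences of any length, the same expression performed faster or slower can still be matched.

```python
import math


def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codeword (squared Euclidean)."""
    symbols = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        symbols.append(dists.index(min(dists)))
    return symbols


def forward_log_likelihood(symbols, start, trans, emit):
    """Return log P(symbols | HMM) using the scaled forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits symbol k
    """
    if not symbols:
        raise ValueError("symbol sequence is empty")
    n = len(start)
    alpha = [start[i] * emit[i][symbols[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(symbols)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbols[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def classify(symbols, models):
    """Pick the model name with the highest log-likelihood for the sequence.

    models: dict mapping a label to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda name: forward_log_likelihood(symbols, *models[name]))
```

In practice the codebook and the HMM parameters would be learned from training sequences (for example with k-means style clustering and Baum-Welch re-estimation); here they are simply passed in.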

BibTeX

@misc{Lien-1998-14624,
author = {Jenn-Jier James Lien},
title = {Automatic Recognition of Facial Expressions Using Hidden Markov Models and Estimation of Expression Intensity},
howpublished = {PhD Thesis, CMU-RI-TR-98-31, Electrical Engineering, University of Pittsburgh},
month = {April},
year = {1998},
}