Estimating Focus of Attention based on Gaze and Sound

Rainer Stiefelhagen, Jie Yang, and Alex Waibel
Workshop Paper, Workshop on Perceptive User Interfaces (PUI '01), November, 2001

Abstract

Estimating a person's focus of attention is useful for various human-computer interaction applications, such as smart meeting rooms, where a user's goals and intent have to be monitored. In the work presented here, we are interested in modeling focus of attention in a meeting situation. We have developed a system capable of estimating participants' focus of attention from multiple cues. We employ an omnidirectional camera to simultaneously track the faces of participants seated around a meeting table and use neural networks to estimate their head poses. In addition, we use microphones to detect who is speaking. The system predicts participants' focus of attention from acoustic and visual information separately, and then combines the outputs of the audio- and video-based predictors. We have evaluated the system on data from three recorded meetings. Adding the acoustic information reduced the error by 8% on average compared to using a single modality.
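The abstract does not specify how the audio- and video-based predictions are combined. A minimal sketch of one plausible scheme is shown below: each predictor outputs a probability distribution over possible focus targets (the other participants), and the two distributions are fused as a weighted mixture. The function name `combine_focus_estimates` and the mixing weight `alpha` are illustrative assumptions, not the authors' actual method.

```python
import numpy as np

def combine_focus_estimates(p_video: np.ndarray,
                            p_audio: np.ndarray,
                            alpha: float = 0.5) -> np.ndarray:
    """Fuse audio- and video-based focus-of-attention estimates.

    p_video: distribution over focus targets from the head-pose
             neural network (video cue).
    p_audio: distribution over focus targets derived from detected
             speaker activity (acoustic cue).
    alpha:   mixing weight for the video estimate (assumed value;
             the paper's combination rule is not given in the abstract).
    """
    combined = alpha * p_video + (1.0 - alpha) * p_audio
    return combined / combined.sum()  # renormalize to a distribution

# Example: one participant, three possible focus targets.
p_video = np.array([0.6, 0.3, 0.1])   # head pose suggests target 0
p_audio = np.array([0.2, 0.7, 0.1])   # current speaker is target 1
p = combine_focus_estimates(p_video, p_audio)
print(p, "-> predicted focus:", int(np.argmax(p)))
```

Under this kind of fusion, the acoustic cue can correct the visual estimate when head pose is ambiguous, which is consistent with the reported error reduction from combining modalities.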

BibTeX

@workshop{Stiefelhagen-2001-8350,
  author    = {Rainer Stiefelhagen and Jie Yang and Alex Waibel},
  title     = {Estimating Focus of Attention based on Gaze and Sound},
  booktitle = {Proceedings of the Workshop on Perceptive User Interfaces (PUI '01)},
  year      = {2001},
  month     = {November},
}