Recognizing Visual Signatures of Spontaneous Head Gestures

Mohit Sharma, Dragan Ahmetovic, Laszlo A. Jeni and Kris M. Kitani
Conference Paper, IEEE Winter Conf. on Applications of Computer Vision, March, 2018

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.


Head movements are an integral part of human nonverbal communication. As such, the ability to detect various types of head gestures from video is important for robotic systems that need to interact with people or for assistive technologies that may need to detect conversational gestures to aid communication. To this end, we propose a novel Multi-Scale Deep Convolution-LSTM architecture, capable of recognizing short and long term motion patterns found in head gestures, from video data of natural and unconstrained conversations. In particular, our models use Convolutional Neural Networks (CNNs) to learn meaningful representations from short time windows over head motion data. To capture longer term dependencies, we use Recurrent Neural Networks (RNNs) that extract temporal patterns across the output of the CNNs. We compare against classical approaches using discriminative and generative graphical models and show that our model is able to significantly outperform baseline models.
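The pipeline described in the abstract (CNNs over short time windows, then an RNN across the CNN outputs) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation. The class name `ConvLSTMGestureSketch`, the feature count, the number of gesture classes, the kernel sizes, and the reading of "multi-scale" as parallel convolution branches with different kernel sizes are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ConvLSTMGestureSketch(nn.Module):
    """Hypothetical sketch: per-window temporal CNNs followed by an LSTM."""

    def __init__(self, n_features=6, n_classes=5, kernel_sizes=(3, 5), hidden=32):
        super().__init__()
        # One small 1D CNN per kernel size. Treating "multi-scale" as
        # parallel kernel sizes is an assumption, not taken from the paper.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(n_features, 16, kernel_size=k, padding=k // 2),
                nn.ReLU(),
                nn.AdaptiveMaxPool1d(1),  # collapse the time axis of each window
            )
            for k in kernel_sizes
        ])
        # The LSTM reads one feature vector per short window.
        self.lstm = nn.LSTM(16 * len(kernel_sizes), hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, n_windows, window_len, n_features) of head-motion data
        b, w, t, f = x.shape
        # Fold windows into the batch axis; Conv1d expects (N, channels, time).
        flat = x.reshape(b * w, t, f).transpose(1, 2)
        feats = torch.cat([br(flat).squeeze(-1) for br in self.branches], dim=1)
        seq = feats.reshape(b, w, -1)       # (batch, n_windows, cnn_features)
        out, _ = self.lstm(seq)             # temporal patterns across windows
        return self.classifier(out[:, -1])  # class logits from the last step


# Usage on random data: 2 clips, 8 windows of 16 frames, 6 motion features each.
model = ConvLSTMGestureSketch()
dummy = torch.randn(2, 8, 16, 6)
logits = model(dummy)
print(logits.shape)
```

The convolution branches only see one short window at a time, so any dependency longer than a window has to be carried by the LSTM state, which mirrors the short-term and long-term split described above.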

BibTeX Reference
author = {Mohit Sharma and Dragan Ahmetovic and Laszlo A. Jeni and Kris M. Kitani},
title = {Recognizing Visual Signatures of Spontaneous Head Gestures},
booktitle = {IEEE Winter Conf. on Applications of Computer Vision},
year = {2018},
month = {March},