MSR Speaking Qualifier

Bhavan Jasani
Robotics Institute, Carnegie Mellon University
Friday, July 12
3:00 pm to 4:30 pm
NSH 4305
Bhavan Jasani – MSR Thesis Talk

Title: Automatic detection of human affective behavior in dyadic conversations

Abstract:

Emotion is communicated through face, voice, and body motion in interpersonal contexts. Yet most approaches to automatic detection emphasize a single modality (especially face or voice), ignore social context, and focus on well-defined signs of emotion (e.g., a smile). This thesis addresses multimodal, interpersonal emotion detection in the context of dyadic (i.e., two-person) interactions between mothers and their adolescent children. We develop machine-learning approaches based on hand-crafted features (e.g., facial action units and head pose) that have proven informative in previous research, as well as data-driven approaches using deep learning. We address two challenges for automatic detection of such emotions. Both concern the “ground truth” of expert annotation that is used to train learning algorithms. One is “latency,” which refers to the offset between when an emotion actually begins in the video and its time stamp in the annotations. When annotators work in real time without stopping to replay video segments, the time required to perceive and tag an emotion's occurrence creates a latency offset. A second source of error in ground truth is lack of consistency between or within annotators, which we refer to as individual differences. Even with training, annotators may disagree about how an emotion is defined and when it occurs. To account for latency and individual differences, we apply our classifier to variable segments proximal to annotation onsets. We experimented with three settings: 1) windows of different sizes centered on the onset, 2) individual left- and right-sided windows, and 3) windows temporally shifted away from the annotated onset. We compared classifier performance across the settings. Centered windows with no temporal shift resulted in the highest accuracy. Accounting for latency and individual-difference errors in this way maximized classifier performance.
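
For readers who want a concrete picture of the three window settings described in the abstract, the following is a minimal Python sketch and not the thesis implementation. The function name `window`, its parameters, the use of seconds as the time unit, and the convention that a negative `shift` moves the window earlier than the annotated onset are all assumptions made for illustration.

```python
from typing import Tuple


def window(onset: float, size: float, mode: str = "centered",
           shift: float = 0.0) -> Tuple[float, float]:
    """Return (start, end) in seconds for a segment near an annotated onset.

    Hypothetical helper: names, units, and sign conventions are assumptions,
    not taken from the thesis.
    """
    if size <= 0:
        raise ValueError("size must be positive")

    # Setting 3: move the reference point away from the annotated onset.
    # A negative shift looks earlier in the video, which is one way to
    # compensate for annotator latency.
    anchor = onset + shift

    if mode == "centered":
        # Setting 1: window of the given size centered on the anchor.
        return anchor - size / 2, anchor + size / 2
    if mode == "left":
        # Setting 2a: window ending at the anchor.
        return anchor - size, anchor
    if mode == "right":
        # Setting 2b: window starting at the anchor.
        return anchor, anchor + size
    raise ValueError(f"unknown mode: {mode!r}")


# Example: an onset annotated at 10 s with a 4 s window.
centered = window(10.0, 4.0)
left_sided = window(10.0, 4.0, mode="left")
right_sided = window(10.0, 4.0, mode="right")
shifted = window(10.0, 4.0, mode="centered", shift=-1.0)
```

Under these assumptions, each returned pair would select the video frames or features passed to a classifier, so the settings can be compared by varying only `size`, `mode`, and `shift`.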

Committee:
Jeffrey Cohn (adviser)
Laszlo Jeni (co-adviser)
Louis-Philippe Morency
Rohit Girdhar