VASC Seminar: Alireza Fathi
An Egocentric Paradigm for Understanding Daily Activities
In this talk, I will describe my recent research on modelling, analyzing, and understanding daily activities using an egocentric vision system. An egocentric vision system continuously captures the scene in front of the user (the first-person view) and may additionally measure the user's gaze direction. A key benefit of egocentric vision is the ability to measure the first person's focus of attention: both regions that are fixated on (gaze) and regions that are handled (manipulated objects). The first person's focus of attention provides valuable context for identifying important objects and faces in the scene, which is critical for understanding daily activities. In this talk, I will address two categories of daily activities: meal-preparation activities, which are defined by the manipulation of objects, and social interactions, which are defined by face-to-face exchanges between multiple people. In the case of object-manipulation tasks, I will present a semi-supervised framework for learning object models and use those models to recognize tasks based on hand-object interaction. I will further show that we can learn from human gaze to localize and recognize actions. In the case of social interactions, in addition to the first person's attention, we will leverage the attention of other individuals in the scene to model the activity.
Appointments: Kris Kitani
Alireza Fathi is a PhD candidate at the College of Computing at Georgia Tech, graduating in Spring 2013. He received his bachelor's degree from Sharif University of Technology in Iran in 2006 and his MSc degree from Simon Fraser University in Canada in 2008. His main research areas are computer vision and machine learning, with particular interest in egocentric (first-person) vision, activity recognition, and video segmentation. He has published several papers at CVPR, ICCV, and ECCV on recognizing objects and activities in first-person-view videos. He served as a co-organizer of the 2nd Workshop on Egocentric Vision, held in conjunction with CVPR 2012.