Parameter-Efficient Neuro-Symbolic Action Anticipation via Iterative Context Refinement - Robotics Institute Carnegie Mellon University
MSR Thesis Defense

Aryan, MSR Student
Robotics Institute, Carnegie Mellon University
Tuesday, May 27
12:00 pm to 1:00 pm
NSH 3002
Parameter-Efficient Neuro-Symbolic Action Anticipation via Iterative Context Refinement

Abstract:

As robots and intelligent systems increasingly interact with humans, the ability to understand users by anticipating their actions becomes significantly more important. Current approaches to action anticipation leverage the inference capabilities of large foundation models, but their complexity, resource requirements, and difficulty of training limit where they can be applied. This thesis introduces a novel neuro-symbolic approach, SymAnt, that unifies the expressiveness of neural networks with symbolic knowledge, supplied as context in the form of knowledge graphs and scene graphs, to effectively anticipate actions from short videos. By leveraging symbolic context, our approach significantly reduces model size as well as the need for extensive contextual observations. We present a joint-graph search approach that uses scene and knowledge graphs to reason over objects' spatial relationships as well as their attributes, affordances, and actions, allowing for action predictions with less context. Inspired by diffusion models, we propose an iterative action denoising approach that uses symbolic knowledge as initialization and refines the predicted sequence of future actions to improve accuracy, coherence, and plausibility. Through our experiments, we demonstrate the effectiveness of our neuro-symbolic approach, which outperforms current state-of-the-art methods by up to 5% on four diverse datasets (Breakfast, 50 Salads, EPIC Kitchens, and EGTEA Gaze+) while reducing model size by over 98% compared to the best neural-only baseline.

Committee:
Prof. Katia Sycara (advisor)
Prof. Deva Ramanan
Himangi Mittal