
Abstract:
Recent years have seen growing interest in developing robots capable of reliable, lifelong operation in human-centric environments. Despite impressive recent progress on long-horizon tasks such as laundry folding, current efforts focus predominantly on quasi-static tasks in structured settings. General-purpose assistive robots should be able to perform a wider range of dynamic and dexterous tasks in unstructured environments, interact safely with humans and their environment, and learn continuously from new experiences. This thesis explores methods at the intersection of modern control theory, robot learning, and multimodal foundation models, with the goal of learning generalizable and safety-aware robot policies for performing complex, dynamic, and interactive tasks in unstructured environments.
In this talk, I will first discuss MResT, an approach for learning generalizable language-conditioned policies for real-time control of precise and dynamic tasks. MResT uses frozen vision-language models (VLMs) and fine-tuned lightweight networks to process multiresolution sensory inputs, enabling both semantic generalization and real-time control. Next, I will talk about GraphEQA, an approach that uses real-time multimodal memory to ground VLM-based planners for long-horizon embodied question answering in unseen environments. Finally, I will discuss VLTSafe, a method for tractably learning generalizable safe policies for dynamic manipulation tasks in arbitrarily cluttered environments. VLTSafe leverages pretrained VLMs for test-time semantic constraint specification and optimizes a reach-avoid reinforcement learning objective to learn policies that reason about both long-term safety and task completion.
This thesis takes a step towards the development of generalist embodied agents that integrate the semantic understanding and generalizability of multimodal foundation models with robust, closed-loop control policies that ensure reactivity and safety.
Thesis Committee Members:
Oliver Kroemer (Chair)
Andrea Bajcsy
Aaron Johnson
Zico Kolter
Michael Posa (University of Pennsylvania)