PhD Thesis Defense
3D Video Models through Point Tracking, Reconstructing, and Forecasting
Abstract: This thesis advances 3D video understanding by bridging reconstruction and dynamics forecasting from monocular video, with applications in robotics, autonomy, and immersive environments. We introduce a novel pipeline that translates 2D video into 4D scenes by combining object-centric tracking, learned 2D view synthesis priors, and Gaussian splatting, enabling accurate geometry and motion recovery even [...]
Unified 3D Perception and Generative Control for Generalist Robots
Abstract: To build robot generalists, we need models that can operate across diverse tasks, scenes, and embodiments. While recent efforts scale data and model capacity and incorporate expressive generative objectives, most still rely on 2D inputs to predict inherently 3D actions—introducing a mismatch between perception and control. In my thesis, I explore how unifying 3D [...]
Reachable Sets for Control and Planning: from Reactive Safety to Contact-Rich Manipulation
Abstract: Robots are increasingly deployed in settings where safety, performance, or both are mission-critical—from agile aerial vehicles avoiding collisions at high speed to manipulators executing intricate, contact-rich tasks. In my thesis, I present a unifying approach to these seemingly disparate challenges through the lens of reachable sets, a versatile but underutilized computational primitive in robotics. In [...]
Dataset-Driven and Generative Approaches to Domain Generalization in Human-Centric Vision
Abstract: Human-centered computer vision technology relies heavily on large, diverse datasets, yet even the largest collections cannot fully capture the variability of human appearance, motion, and viewpoint. At the same time, collecting data from human subjects is time-consuming, labor-intensive, and raises privacy concerns. To overcome these challenges while maintaining efficiency, researchers increasingly turn to two [...]
Influence-Aware Safety for Human-Robot Interaction
Abstract: In recent years, we have seen how influential (and potentially harmful) algorithms can be in our lives through recommender systems and language models, sometimes creating polarization and conspiracies that lead to unsafe behavior. Now that robots are also growing more common in the real world, we must be very careful to ensure that AI-driven [...]
Getting Optimization layers to play well with Deep Networks: Numerical methods and Architectures
Abstract: Many real-world challenges, from robotic control to resource management, can be effectively formulated as optimization problems. Recent advancements have focused on incorporating these optimization problems as layers within deep learning pipelines, enabling the explicit inclusion of auxiliary constraints or cost functions, which is crucial for applications such as enforcing physical laws, ensuring safety constraints, [...]
Robust Incremental Distributed Collaborative Simultaneous Localization and Mapping
Abstract: Multi-robot teams show exceptional promise across applications like search and rescue, disaster response, agriculture, forestry, and scientific exploration due to their ability to go where humans cannot, parallelize activity, operate robustly to failures, and expand capabilities beyond those of an individual robot. Collaborative Simultaneous Localization and Mapping (C-SLAM) is a fundamental capability for these multi-robot teams as [...]
Towards 4D perception with foundational priors
Abstract: As humans, we are constantly interacting with and observing a three-dimensional dynamic world. Building this spatiotemporal or 4D understanding in vision algorithms is not straightforward, as there is orders of magnitude less 4D data than 2D images and videos. This underscores the need to find meaningful ways to exploit 2D data to realize 4D [...]
Generative Robotics: Self-Supervised Learning for Human-Robot Collaborative Creation
Abstract: Robotic automation is generally welcomed for tasks that are dirty, dull, or dangerous, but with expanding robotic capabilities, robots are entering domains that are safe and enjoyable, such as creative industries. Although there is a widespread rejection of automation in creative fields, many people, from amateurs to professionals, would welcome supportive or collaborative creative [...]
Embodied Artificial Intelligence for Emergency Care in Unstructured Environments
Abstract: In mass casualty events and resource-constrained scenarios, limited responder capacity leads to preventable deaths. Time is of the essence, particularly in severe trauma: the sooner individuals receive care, the higher their chances of survival. Yet a single responder can only manage a few patients simultaneously, leaving others unattended. This thesis addresses this capacity constraint [...]