PhD Thesis Defense
Towards Robotic Convoying in Unstructured Environments
Abstract: Multi-agent robotic teaming is the only realistic solution to many large-scale autonomous operations. Conventionally, operations are modeled as a set of tasks that are largely decoupled from each other and the environment at execution time. However, this operational model fails when the successful execution of a task requires multiple agents to synchronize their actions [...]
Learning to Create 3D Content
Abstract: With the popularity of Virtual Reality (VR), Augmented Reality (AR), and other 3D applications, developing methods that let everyday users capture and create their own 3D content has become increasingly essential. However, current 3D creation pipelines often require either tedious manual effort or specialized capture setups. Additionally, resulting assets often suffer from baked-in lighting, [...]
Learning From People: Assistive Robotics and Optimization from Preferences
Abstract: Robotic algorithms rarely come perfectly pre-configured, and when choosing parameters, tradeoffs must often be made: between performance and robustness; efficiency and safety; the comfort of the user and the comfort of bystanders. While engineers can tune parameters by hand or carefully design reward functions to optimize over, this is not always a straightforward task. [...]
Advancing Multimodal Sensing and Robotic Interfaces for Chronic Care
Abstract: The healthcare system prioritizes reactive care for acute illnesses, often overlooking the ongoing needs of individuals with chronic conditions that require long-term management and personalized care. Addressing this gap through technology can empower patients to better manage their conditions, greatly enhancing quality of life and independence. Multimodal sensing, incorporating inertial, acoustic, and vision-based sensors, [...]
Vision-based Human Motion Modeling and Analysis
Abstract: Modern computer vision has achieved remarkable success in tasks such as detecting, segmenting, and estimating human pose in images and videos—often reaching or even surpassing human-level performance. However, significant challenges remain in predicting and analyzing future human motion. This thesis explores how vision-based methods can improve the fidelity and accuracy of human motion modeling [...]
Building richer 3D maps: utilizing a hybrid geometry representation and auxiliary inputs in neural surface reconstruction
Abstract: As robots are increasingly deployed in real-world environments, their perception systems face growing demands. Tasks such as tracking and manipulation require maps with both high spatial fidelity and detailed object-level organization, which must be delivered faster to support timely decision-making and control. Concurrently, advances in vision foundation models allow us to build powerful prediction [...]
Lowering Barriers in Human-Robot Communication
Abstract: For robots to collaborate naturally in homes, they must interpret diverse forms of human expression (visual gestures, natural language instructions, and environmental context) and translate them into actions. Existing robot policies typically rely on structured language goals and static visual observations, which restricts both the sensory context and the ways users can specify [...]
3D Video Models through Point Tracking, Reconstructing, and Forecasting
Abstract: This thesis advances 3D video understanding by bridging reconstruction and dynamics forecasting from monocular video, with applications in robotics, autonomy, and immersive environments. We introduce a novel pipeline that translates 2D video into 4D scenes by combining object-centric tracking, learned 2D view synthesis priors, and Gaussian splatting, enabling accurate geometry and motion recovery even [...]
Unified 3D Perception and Generative Control for Generalist Robots
Abstract: To build robot generalists, we need models that can operate across diverse tasks, scenes, and embodiments. While recent efforts scale data and model capacity and incorporate expressive generative objectives, most still rely on 2D inputs to predict inherently 3D actions—introducing a mismatch between perception and control. In my thesis, I explore how unifying 3D [...]
Reachable Sets for Control and Planning: from Reactive Safety to Contact-Rich Manipulation
Abstract: Robots are increasingly deployed in settings where safety, performance, or both are mission-critical—from agile aerial vehicles avoiding collisions at high speed to manipulators executing intricate, contact-rich tasks. In my thesis, I present a unifying approach to these seemingly disparate challenges through the lens of reachable sets, a versatile but underutilized computational primitive in robotics. In [...]