When Spatial Computing meets Accelerated Computing
Abstract: NVIDIA has been pioneering Accelerated Computing for the past three decades, driving innovations that have transformed society. Among all personal computing mediums, Spatial Computing and Extended Reality (XR) stand out as some of the most promising beneficiaries of accelerated computing. In this talk, we will explore the latest developments and trends in the XR ecosystem, [...]
From Pixels to Physical Intelligence: Semantic 3D Data Generation at Internet Scale
Abstract: Modern AI won’t achieve physical intelligence until it can extract rich, semantic spatial knowledge from the wild ocean of internet video—not just curated motion-capture datasets or expensive 3D scans. This thesis proposes a self-bootstrapping pipeline for converting raw pixels into large-scale 3D and 4D spatial understanding. It begins with multi-view bootstrapping: using just two [...]
Self supervised perception for Tactile Dexterity
Abstract: Humans are incredibly dexterous. We interact with and manipulate tools effortlessly, leveraging touch without giving it a second thought. Yet replicating this level of dexterity in robots is a major challenge. While the robotics community, recognizing the importance of touch in fine manipulation, has developed a wide variety of tactile sensors, how best to [...]
Prompt-to-Product: Generative Assembly via Bimanual Manipulation
Abstract: Assembly products are ubiquitous in our lives, for example, chairs, tables, couches, drawers, and more. Due to the complex interactions between components, creating such products typically demands significant manual effort in 1) designing the assembly and 2) constructing the product. This thesis seeks to reduce the required manual effort by automating the creation process [...]
Differentiable Probabilistic Inference and Rendering for Multimodal Robotic Perception
Abstract: Robots are increasingly deployed to automate tasks that are dangerous or mundane for humans such as search and rescue, mapping, and inspection in difficult environments. They rely on their perception stack, typically composed of complementary sensing modalities, to estimate their own state and the state of the environment to enable informed decision-making. This thesis [...]
Watch, Predict, Act: Robot Learning meets Web Videos
Abstract: To enable robots to assist in everyday tasks in diverse natural environments such as homes, offices, and kitchens, it is critical to develop policies that generalize to novel tasks in unseen scenarios. Practical utility demands that these policies do not require task-specific adaptation at test time but can instead execute directly given a natural [...]
Towards Robust Informative Path Planning for Spatiotemporal Environment Prediction
Abstract: Informative Path Planning (IPP) is an important planning paradigm for various real-world robotic applications such as wildfire monitoring and predicting infection spread in crops. IPP involves planning a path that can learn an accurate belief of the quantity of interest, while adhering to planning constraints. Traditional IPP methods are effective only in static, time-invariant [...]
Semantics-Driven Perception and Manipulation for Agricultural Robotics
Abstract: With growing expectations for autonomous robot deployment in unstructured, real-world environments, these systems must operate efficiently while perceiving and interpreting complex scenes to navigate dynamic, cluttered conditions. Robust performance in these settings requires handling occlusions, clutter, and ambiguous visual cues, challenges exacerbated by the limited semantic understanding in standard visuomotor policy frameworks. This thesis [...]
Towards Dexterous Robotic Manipulation by Imitating Experts
Abstract: Imitation learning enables scalable transfer of complex manipulation skills to robots, but its effectiveness depends on high-quality demonstrations and robust policy learning, especially in dynamic, contact-rich environments. This thesis investigates how combining imitation learning with teleoperation and classical planners can teach dexterous manipulation across diverse real-world settings. We develop a teleoperation system for collecting [...]
Unified Predictive Representations for Generalized Robotic Perception
Abstract: Building robots that can perceive, reason, and act across a wide range of objects and environments remains a central goal in robotics. To achieve such generalization without relying on large amounts of task-specific data, predicting future outcomes in response to actions is a core capability towards generalized robotics. In this thesis, we investigate how to [...]