Past Events from June 2, 2026 – January 20, 2017 › Seminar › VASC Seminar › – Robotics Institute Carnegie Mellon University
  • VASC Seminar
    Miguel Angel Bautista
    Staff Research Scientist
    Apple Machine Learning Research

    Generative modeling: from 3D scenes to fields and manifolds

    Newell-Simon Hall 3305

    Abstract: In this keynote talk, we delve into some of our progress on generative models that are able to capture the distribution of intricate and realistic 3D scenes and fields. We explore a formulation of generative modeling that optimizes latent representations for disentangling radiance fields and camera poses, enabling both unconditional and conditional generation of 3D [...]

  • VASC Seminar
    Shervin Ardeshir
    Senior Research Scientist
    Netflix

    Estimating Robustness using Proxies

    Newell-Simon Hall 3305

    Abstract: This talk covers some of our recent explorations on estimating the robustness of black-box machine learning models across data subpopulations. In other words, we ask whether a trained model is uniformly accurate across different types of inputs, or whether there are significant performance disparities affecting the different subpopulations. Measuring such a characteristic is fairly straightforward if [...]

    VASC Seminar
    Or Patashnik
    PhD student
    Tel-Aviv University

    Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures

    Newell-Simon Hall 3305

    Abstract: In this talk, I will focus on presenting my recent work which will be presented at CVPR in less than two months. Text-guided image generation has progressed rapidly in recent years, inspiring major breakthroughs in text-guided shape generation. Recently, it has been shown that using score distillation, one can successfully text-guide a NeRF model to [...]

    VASC Seminar

    Navigating to Objects in the Real World

    3305 Newell-Simon Hall

    Abstract: Semantic navigation is necessary to deploy mobile robots in uncontrolled environments like our homes, schools, and hospitals. Many learning-based approaches have been proposed in response to the lack of semantic understanding of the classical pipeline for spatial navigation, which builds a geometric map using depth sensors and plans to reach point goals. Broadly, end-to-end [...]

    VASC Seminar
    Vineeth N Balasubramanian
    Associate Professor
    Department of Computer Science and Engineering, Indian Institute of Technology, Hyderabad

    Going Beyond Continual Learning: Towards Organic Lifelong Learning

    3305 Newell-Simon Hall

    Abstract: Supervised learning, the harbinger of machine learning over the last decade, has had tremendous impact across application domains in recent years. However, the notion of a static trained machine learning model is becoming increasingly limiting, as these models are deployed in changing and evolving environments. Among a few related settings, continual learning has gained significant [...]

  • VASC Seminar
    Santhosh Kumar Ramakrishnan
    Ph.D. Candidate
    University of Texas at Austin

    Predictive Scene Representations for Embodied Visual Search

    GHC 6501

    Abstract: My research advances embodied AI by developing large-scale datasets and state-of-the-art algorithms. In my talk, I will specifically focus on the embodied visual search problem, which aims to enable intelligent search for robots and augmented reality (AR) assistants. Embodied visual search manifests as the visual navigation problem in robotics, where a mobile agent must efficiently navigate [...]

  • VASC Seminar
    Aayush Bansal
    Startup

    Generating Beautiful Pixels

    Newell-Simon Hall 3305

    Abstract: In this talk, I will present three experiments that use low-level image statistics to generate high-resolution detailed outputs. In the first experiment, I will use 2D pixels to efficiently mine hard examples for better learning. Simply biasing ray sampling towards hard ray examples enables learning of neural fields with more accurate high-frequency detail in less [...]

    VASC Seminar
    Viraj Prabhu
    CS PhD Student
    Georgia Institute of Technology

    Towards Reliable Computer Vision Systems

    Newell-Simon Hall 3305

    Abstract: The real world has infinite visual variation – across viewpoints, time, space, and curation. As deep visual models become ubiquitous in high-stakes applications, their ability to generalize across such variation becomes increasingly important. In this talk, I will present opportunities to improve such generalization at different stages of the ML lifecycle: first, I will [...]

    VASC Seminar
    Bharath Hariharan
    Assistant Professor
    Cornell University

    Vision without labels

    3305 Newell-Simon Hall

    Abstract: Deep learning has revolutionized all aspects of computer vision, but its successes have come from supervised learning at scale: large models trained on ever larger labeled datasets. However this reliance on labels makes these systems fragile when it comes to new scenarios or new tasks where labels are unavailable. This is in stark contrast to [...]

  • VASC Seminar
    Yong Jae Lee
    Associate Professor
    Department of Computer Sciences, University of Wisconsin-Madison

    Large Multimodal (Vision-Language) Models for Image Generation and Understanding

    Newell-Simon Hall 3305

    Abstract: Large Language Models and Large Vision Models, also known as Foundation Models, have led to unprecedented advances in language understanding, visual understanding, and AI. In particular, many computer vision problems including image classification, object detection, and image generation have benefited from the capabilities of such models trained on internet-scale text and visual data. In [...]

    VASC Seminar
    Mohamed Elhoseiny
    Assistant Professor
    Computer Science, KAUST

    Imaginative Vision Language Models: Towards human-level imaginative AI skills transforming species discovery, content creation, self-driving cars, and emotional health

    3305 Newell-Simon Hall

    Abstract: Most existing AI learning methods can be categorized into supervised, semi-supervised, and unsupervised methods. These approaches rely on defining empirical risks or losses on the provided labeled and/or unlabeled data. Beyond extracting learning signals from labeled/unlabeled training data, we will reflect in this talk on a class of methods that can learn beyond the vocabulary [...]

    VASC Seminar
    Kenneth Marino
    Research Scientist
    Google DeepMind

    World Knowledge in the Time of Large Models

    Newell-Simon Hall 3305

    Abstract: This talk will discuss the massive shift that has come about in the vision and ML community as a result of large pre-trained language and vision-language models such as Flamingo and GPT-4. We begin by looking at the work on knowledge-based systems in CV and robotics before the large model [...]

    VASC Seminar
    Shunsuke Saito
    Research Scientist
    Meta Reality Labs Research

    Digital Human Modeling with Light

    Newell-Simon Hall 3305

    Abstract: Leveraging light in various ways, we can observe and model physical phenomena or states which may not be possible to observe otherwise. In this talk, I will introduce our recent exploration on digital human modeling with different types of light. First, I will present our recent work on the modeling of relightable human heads, [...]

  • VASC Seminar
    Jonathon Luiten
    Postdoctoral Fellow
    RWTH Aachen and Carnegie Mellon University

    Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis

    Newell-Simon Hall 3305

    Abstract: We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians which are optimized to reconstruct input images via differentiable rendering. To model [...]

    VASC Seminar
    Arun Ross
    Professor
    Michigan State University

    Biometrics in a Deep Learning World

    Newell-Simon Hall 3305

    Abstract: Biometrics is the science of recognizing individuals based on their physical and behavioral attributes such as fingerprints, face, iris, voice and gait. The past decade has witnessed tremendous progress in this field, including the deployment of biometric solutions in diverse applications such as border security, national ID cards, amusement parks, access control, and smartphones. [...]

    VASC Seminar
    Andrea Tagliasacchi
    Associate Professor
    Simon Fraser University

    Neural World Models

    Newell-Simon Hall 4305

    Abstract: Computer vision researchers have pushed the limits of performance in perception tasks involving natural images to near saturation. With self-supervised inference driven by recent advancements in generative modeling, it can be debated that the era of large image models is coming to a close, ushering in an era focused on video. However, it's worth [...]

  • VASC Seminar
    Ce Zheng
    Ph.D. candidate at Center for Research in Computer Vision
    University of Central Florida

    Reconstructing 3D Humans from Visual Data

    Newell-Simon Hall 3305

    Abstract: Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose [...]

  • VASC Seminar
    Zhenglun Kong
    Ph.D. in the Department of Electrical and Computer Engineering
    Northeastern University

    Towards Energy-Efficient Techniques and Applications for Universal AI Implementation

    Newell-Simon Hall 3305

    Abstract: The rapid advancement of large-scale language and vision models has significantly propelled the AI domain. We now see AI enriching everyday life in numerous ways – from community and shared virtual reality experiences to autonomous vehicles, healthcare innovations, and accessibility technologies, among others. Central to these developments is the real-time implementation of high-quality deep [...]

  • VASC Seminar
    Shengjie Zhu
    Ph.D. Student
    Michigan State University

    Structure-from-Motion Meets Self-supervised Learning

    Newell-Simon Hall 3305

    Abstract: How can we teach machines to perceive the 3D world from unlabeled videos? We will present a new solution that incorporates Structure-from-Motion (SfM) into self-supervised model learning. Given RGB inputs, deep models learn to regress depth and correspondence. With the two inputs, we introduce a camera localization algorithm that searches for certified globally optimal poses. However, the [...]

    VASC Seminar
    Qi Sun
    Assistant Professor
    New York University

    Toward Human-Centered XR: Bridging Cognition and Computation

    Newell-Simon Hall 3305

    Abstract: Virtual and Augmented Reality enables unprecedented possibilities for displaying virtual content, sensing physical surroundings, and tracking human behaviors with high fidelity. However, we still haven't created "superhumans" who can outperform what we are in physical reality, nor a "perfect" XR system that delivers infinite battery life or realistic sensation. In this talk, I will discuss some of our [...]

    VASC Seminar
    Yanxi Liu
    Professor
    Penn State University

    Zeros for Data Science

    Newell-Simon Hall 3305

    Abstract: The world around us is neither totally regular nor completely random. Our and robots’ reliance on spatiotemporal patterns in daily life cannot be over-stressed, given the fact that most of us can function (perceive, recognize, navigate) effectively in chaotic and previously unseen physical, social and digital worlds. Data science has been promoted and practiced [...]

    VASC Seminar
    Agata Lapedriza
    Principal Research Scientist/Professor
    Northeastern University

    Emotion perception: progress, challenges, and use cases

    Newell-Simon Hall 3305

    Abstract: One of the challenges Human-Centric AI systems face is understanding human behavior and emotions considering the context in which they take place. For example, current computer vision approaches for recognizing human emotions usually focus on facial movements and often ignore the context in which the facial movements take place. In this presentation, I will [...]

  • VASC Seminar
    Yunzhu Li
    Assistant Professor
    University of Illinois Urbana-Champaign

    Foundation Models for Robotic Manipulation: Opportunities and Challenges

    Newell-Simon Hall 3305

    Abstract: Foundation models, such as GPT-4 Vision, have marked significant achievements in the fields of natural language and vision, demonstrating exceptional abilities to adapt to new tasks and scenarios. However, physical interaction—such as cooking, cleaning, or caregiving—remains a frontier where foundation models and robotic systems have yet to achieve the desired level of adaptability and [...]

  • VASC Seminar
    Luca Weihs
    Research Manager
    Allen Institute for AI

    Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

    Newell-Simon Hall 3305

    Abstract: We show that imitating shortest-path planners in simulation produces Stretch RE-1 robotic agents that, given language instructions, can proficiently navigate, explore, and manipulate objects in both simulation and in the real world using only RGB sensors (no depth maps or GPS coordinates). This surprising result is enabled by our end-to-end, transformer-based, SPOC architecture, powerful [...]

    VASC Seminar
    Vishnu Lokhande
    Assistant Professor
    University at Buffalo, SUNY

    Creating robust deep learning models involves effectively managing nuisance variables

    Newell-Simon Hall 3305

    Abstract: Over the past decade, we have witnessed significant advances in capabilities of deep neural network models in vision and machine learning. However, issues related to bias, discrimination, and fairness in general, have received a great deal of negative attention (e.g., mistakes in surveillance and animal-human confusion of vision models). But bias in AI models [...]

    VASC Seminar
    Mohit Gupta
    Associate Professor
    University of Wisconsin-Madison

    Shedding Light on 3D Cameras

    Newell-Simon Hall 3305

    Abstract: The advent (and commoditization) of low-cost 3D cameras is revolutionizing many application domains, including robotics, autonomous navigation, human computer interfaces, and recently even consumer devices such as cell-phones. Most modern 3D cameras (e.g., LiDAR) are active; they consist of a light source that emits coded light into the scene, i.e., its intensity is modulated over [...]

    VASC Seminar
    Ilya Chugunov
    PhD Candidate
    Computational Imaging Lab, Princeton University

    Neural Field Representations of Mobile Computational Photography

    Newell-Simon Hall 3305

    Abstract: Burst imaging pipelines allow cellphones to compensate for less-than-ideal optical and sensor hardware by computationally merging multiple lower-quality images into a single high-quality output. The main challenge for these pipelines is compensating for pixel motion, estimating how to align and merge measurements across time while the user's natural hand tremor involuntarily shakes the camera. In [...]

  • VASC Seminar
    Mian Wei
    PhD Candidate
    University of Toronto

    Passive Ultra-Wideband Single-Photon Imaging

    3305 Newell-Simon Hall

    Abstract: High-speed light sources, fast cameras, and depth sensors have made it possible to image dynamic phenomena occurring in ever smaller time intervals with the help of actively-controlled light sources and synchronization. Unfortunately, while these techniques do capture ultrafast events, they cannot simultaneously capture slower ones too. I will discuss our recent work on passive ultra-wideband [...]

  • VASC Seminar
    Angela Dai
    Associate Professor
    The Technical University Munich

    From Understanding to Interacting with the 3D World

    1305 Newell-Simon Hall

    Abstract: Understanding the 3D structure of real-world environments is a fundamental challenge in machine perception, critical for applications spanning robotic navigation, content creation, and mixed reality scenarios. In recent years, machine learning has undergone rapid advancements; however, in the 3D domain, such data-driven learning is often very challenging under limited 3D/4D data availability. In this talk, [...]

    VASC Seminar
    Wolfgang Heidrich
    Professor of Computer Science and Electrical and Computer Engineering
    KAUST Visual Computing Center

    Learned Imaging Systems

    Newell-Simon Hall 4305

    Abstract: Computational imaging systems are based on the joint design of optics and associated image reconstruction algorithms. Of particular interest in recent years has been the development of end-to-end learned “Deep Optics” systems that use differentiable optical simulation in combination with backpropagation to simultaneously learn optical design and deep network post-processing for applications such as hyperspectral [...]

  • VASC Seminar
    Nataniel Ruiz
    Research Scientist
    Google

    Unlocking Magic: Personalization of Diffusion Models for Novel Applications

    3305 Newell-Simon Hall

    Abstract: Since the recent advent of text-to-image diffusion models for high-quality realistic image generation, a plethora of creative applications have suddenly become within reach. I will present my work at Google where I have attempted to unlock magical applications by proposing simple techniques that act on these large text-to-image diffusion models. Particularly, a large class of [...]

    VASC Seminar
    Yingsi Qin
    PhD Candidate
    Carnegie Mellon University

    Instant Visual 3D Worlds Through Split-Lohmann Displays

    3305 Newell-Simon Hall

    Abstract: Split-Lohmann displays provide a novel approach to creating instant visual 3D worlds that support realistic eye accommodation. Unlike commercially available VR headsets that show content at a fixed depth, the proposed display can optically place each pixel region to a different depth, instantly creating eye-tracking-free 3D worlds without using time-multiplexing. This enables real-time streaming [...]

    VASC Seminar
    Edward Lu
    PhD student
    ECE Department at CMU

    Remote Rendering and 3D Streaming for Resource-Constrained XR Devices

    3305 Newell-Simon Hall

    Abstract: An overview of the motivation and challenges for remote rendering and real-time 3D video streaming on XR headsets. Bio: Edward is a third-year PhD student in the ECE department interested in computer systems for VR/AR devices. Homepage: https://users.ece.cmu.edu/~elu2/ Sponsored in part by: Meta Reality Labs Pittsburgh

    VASC Seminar
    Mosam Dabhi
    PhD Student
    Carnegie Mellon University

    Vectorizing Raster Signals for Spatial Intelligence

    3305 Newell-Simon Hall

    Abstract: This seminar will focus on how vectorized representations can be generated from raster signals to enhance spatial intelligence. I will discuss the core methodology behind this transformation, with a focus on applications in AR/VR and robotics. The seminar will also briefly cover follow-up work that explores rigging and re-animating objects from casual single videos [...]

    VASC Seminar
    Bailey Miller
    PhD Candidate
    Carnegie Mellon University

    Stochastic Graphics Primitives

    3305 Newell-Simon Hall

    Abstract: For decades computer graphics has successfully leveraged stochasticity to enable both expressive volumetric representations of participating media like clouds and efficient Monte Carlo rendering of large scale, complex scenes. In this talk, we’ll explore how these complementary forms of stochasticity (representational and algorithmic) may be applied more generally across computer graphics and vision. In [...]

  • VASC Seminar
    Noah Snavely
    Professor & Research Scientist
    Cornell Tech & Google DeepMind

    Reconstructing Everything

    3305 Newell-Simon Hall

    Abstract: The presentation will be about a long-running, perhaps quixotic effort to reconstruct all of the world's structures in 3D from Internet photos, why this is challenging, and why this effort might be useful in the era of generative AI. Bio: Noah Snavely is a Professor in the Computer Science Department at Cornell University [...]

    VASC Seminar
    Christian Richardt
    Research Scientist Lead
    Meta Reality Labs Research

    High-Fidelity Neural Radiance Fields

    3305 Newell-Simon Hall

    Abstract: I will present three recent projects that focus on high-fidelity neural radiance fields for walkable VR spaces: VR-NeRF (SIGGRAPH Asia 2023) is an end-to-end system for the high-fidelity capture, model reconstruction, and real-time rendering of walkable spaces in virtual reality using neural radiance fields. To this end, we designed and built a custom multi-camera rig to [...]

    VASC Seminar
    Saining Xie
    Assistant Professor
    Courant Institute of Mathematical Sciences, New York University

    Building Scalable Visual Intelligence: From Representation to Understanding and Generation

    3305 Newell-Simon Hall

    Abstract: In this talk, we will dive into our recent work on vision-centric generative AI, focusing on how it helps with understanding and creating visual content like images and videos. We'll cover the latest advances, including multimodal large language models for visual understanding and diffusion transformers for visual generation. We'll explore how these two areas [...]

    VASC Seminar
    Qitao Zhao
    Master's Student
    Computer Vision, Carnegie Mellon University

    Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis

    3305 Newell-Simon Hall

    Abstract: This talk will present our approach for reconstructing objects from sparse-view images captured in unconstrained environments. In the absence of ground-truth camera poses, we will demonstrate how to utilize estimates from off-the-shelf systems and address two key challenges: refining noisy camera poses in sparse views and effectively handling outlier poses. Bio: Qitao is a second-year [...]

    VASC Seminar
    Vimal Mollyn
    PhD Student
    Human Computer Interaction Institute, Carnegie Mellon University

    EgoTouch: On-Body Touch Input Using AR/VR Headset Cameras

    3305 Newell-Simon Hall

    Abstract: In augmented and virtual reality (AR/VR) experiences, a user’s arms and hands can provide a convenient and tactile surface for touch input. Prior work has shown on-body input to have significant speed, accuracy, and ergonomic benefits over in-air interfaces, which are common today. In this work, we demonstrate high accuracy, bare hands (i.e., no special [...]

    VASC Seminar
    Hyunsung Cho
    Ph.D. Student
    Human-Computer Interaction Institute (HCII), Carnegie Mellon University

    Auptimize: Optimal Placement of Spatial Audio Cues for Extended Reality

    3305 Newell-Simon Hall

    Abstract: Spatial audio in Extended Reality (XR) provides users with better awareness of where virtual elements are placed, and efficiently guides them to events such as notifications, system alerts from different windows, or approaching avatars. Humans, however, are inaccurate in localizing sound cues, especially with multiple sources due to limitations in human auditory perception such as [...]

  • VASC Seminar
    Srinath Sridhar
    Assistant Professor
    Computer Science, Brown University

    Generative Modelling for 3D Multimodal Understanding of Human Physical Interactions

    3305 Newell-Simon Hall

    Abstract: Generative modelling has been extremely successful in synthesizing text, images, and videos. Can the same machinery also help us better understand how to physically interact with the multimodal 3D world? In this talk, I will introduce some of my group's work in answering this question. I will first discuss how we can enable 2D [...]

    VASC Seminar
    Dr. Yin Yang
    Associate Professor
    Kahlert School of Computing, University of Utah

    High-resolution cloth simulation in milliseconds: Efficient GPU Cloth Simulation with Non-distance Barriers and Subspace Reuse Interactions

    3305 Newell-Simon Hall

    Abstract: We show how to push the performance of high-resolution cloth simulation, making the simulation interactive (in milliseconds) for models with one million degrees of freedom (DOFs) while keeping every triangle untangled. The guarantee of being penetration-free is inspired by the interior-point method, which converts the inequality constraints to barrier potentials. Nevertheless, we propose a [...]

  • VASC Seminar
    Jiaqi Ma
    Assistant Professor
    University of Illinois Urbana-Champaign

    Practical Challenges and Recent Advances in Data Attribution

    3305 Newell-Simon Hall

    Abstract: Data plays an increasingly crucial role in both the performance and the safety of AI models. Data attribution is an emerging family of techniques aimed at quantifying the impact of individual training data points on a model trained on them, which has found data-centric applications such as training data curation, instance-based explanation, and copyright [...]

  • VASC Seminar
    Jia-Bin Huang
    Capital One-endowed Associate Professor
    University of Maryland College Park

    Controllable Visual Imagination

    3305 Newell-Simon Hall

    Abstract: Generative models have empowered human creators to visualize their imaginations without artistic skills and labor. A prominent example is large-scale text-to-image generation models. However, these models often are difficult to control and do not respect 3D perspective geometry and temporal consistency of videos. In this talk, I will showcase several of our recent efforts to [...]

    VASC Seminar
    Niv Cohen
    Research Scientist
    New York University

    Discovering and Erasing Undesired Concepts

    3305 Newell-Simon Hall

    Abstract: The rapid growth of generative models allows an ever-increasing variety of capabilities. Yet, these models may also produce undesired content such as unsafe or misleading images, private information, or copyrighted material. In this talk, I will discuss practical methods to prevent undesired generations. First, I will show how the challenge of avoiding undesired generations [...]

  • VASC Seminar
    Dr. Rong Yan
    CTO
    HeyGen

    The New Era of Video Generation

    Newell-Simon Hall 4305

    Abstract: Traditional video production is slow, expensive, and requires specialized skills. Founded by CMU alumni, HeyGen is an AI-native video platform designed to revolutionize the video creation process by making visual storytelling accessible to all. We've successfully grown to more than 20M users and tens of millions in revenue in less than one year, with recognition [...]

    VASC Seminar
    Kaiming He
    Associate Professor
    Department of Electrical Engineering and Computer Science, MIT-Massachusetts Institute of Technology

    Autoregressive Models: Foundations and Open Questions

    Abstract: The success of Autoregressive (AR) models in language today is so tremendous that their scope has, in turn, been largely narrowed to specific instantiations. In this talk, we will revisit the foundations of classical AR models, discussing essential concepts that may have been overlooked in modern practice. We will then introduce our recent research [...]

  • VASC Seminar
    Hong-Xing “Koven” Yu
    PhD candidate
    Computer Science Department, Stanford University

    Generating a Physical World

    3305 Newell-Simon Hall

    Abstract: Generating an interactive, enlivened, and physical world enables a wide range of applications in entertainment, embodied AI, education, and creative designs. Recent image/video models have shown promise in producing realistic visuals, yet they operate purely at the pixel level and lack underlying physical grounding, leading to failures in physical fidelity and user interactivity. In [...]

  • VASC Seminar
    David Chu
    VP of Spatial Computing and XR
    NVIDIA

    When Spatial Computing meets Accelerated Computing

    3305 Newell-Simon Hall

    Abstract: NVIDIA has been pioneering Accelerated Computing for the past three decades, driving innovations that have transformed society. Among all personal computing mediums, Spatial Computing and Extended Reality (XR) stand out as some of the most promising beneficiaries of accelerated computing. In this talk, we will explore the latest developments and trends in the XR ecosystem, [...]