VASC Seminar
From Sparse to Dense, and Back to Sparse Again?
Abstract: In the 80s and 90s, computer vision architectures were built on a sparse sample of points. In the 2000s, dense models became popular for visual recognition because heuristically defined sparse models do not cover all the important parts of an image. However, with deep learning and end-to-end training approaches, this does [...]
Seeing Deep Inside Scattering Tissue Using Efficient, Noise-Robust Wavefront Shaping
Abstract: Scattering limits our ability to see inside biological tissue, as light penetration is severely distorted by tissue components with varying refractive indices. One promising method to overcome scattering aberration is wavefront shaping. This technique involves placing a spatial light modulator (SLM) in the microscope's optical path to correct the wavefront emitted from a point [...]
From Video Generation to Video World Models
Abstract: Video diffusion models have achieved remarkable success in content creation, yet they still fall short of simulating interactive worlds that respond to users in real time. This talk examines the fundamental challenges preventing these models from evolving into true world simulators. I will present a series of works — CausVid, Self-Forcing, MotionStream, and State-Space [...]
What Can We Learn from a Million Models?
Abstract: Machine learning has transformed many fields by learning from large collections of data. Yet, it is rarely applied to its own outputs: the models themselves. Today, with millions of publicly available models, a natural question arises: what can we do with so many models? In this talk, I will motivate two core applications that [...]
Should we skip attention?
Abstract: Transformers are ubiquitous. They influence nearly every aspect of modern AI. However, the mechanics of their training remain poorly understood. This poses a problem for the field due to the immense amounts of data, computational power, and energy being invested in the training of these networks. I highlight a recent intriguing empirical result from [...]
From Lab to Reality: Reliable 3D Vision in the Wild
Abstract (virtual seminar): While deep learning has revolutionized 3D computer vision, a significant gap remains between the performance achieved in controlled laboratory settings and that in complex, uncontrolled real-world environments. This talk addresses the critical challenges of robustness and generalization required to bridge this gap. In this presentation, I will first discuss our contributions to 3D [...]
Nano-optics for smart sensing and display
Abstract: Nano-optical devices provide a new way to control light at the subwavelength scale, enabling optical functionalities beyond conventional optics. By engineering the nanostructures, we can tailor the optical response as a function of space, polarization, wavelength, and angle of incidence, effectively turning the optical front end into a controllable, programmable physical layer. This [...]
Generative Re-Photography with Video Models
Abstract: I will introduce "generative re-photography" methods that use new generative video models to get more out of your photos—even the blurry ones. First, I will present a method for converting motion-blurred images to video. This method can even predict the "past" and "future" (right before and after the capture) of a motion-blurred image. I will [...]
Learning Through Fitting: Advancing Non-Pixel Representations for Visual Inference
Abstract: Gridded pixel and voxel representations form the backbone of visual computing, but they struggle to scale efficiently to large, high-dimensional data, such as volumetric medical scans and complex scientific simulations. Consequently, continuous, nongridded models such as implicit neural representations (INRs) and Gaussian splatting have gained significant research traction over the past five years. However, [...]
Quanta Perception as Probabilistic Events
Abstract: Autonomous systems ultimately rely on extracting information from light, yet remain brittle in extreme environments, from nighttime navigation to high-speed robotics. This limitation stems from a classical imaging abstraction: conventional sensors integrate photon flux over fixed exposure windows, imposing trade-offs between sensitivity, dynamic range, and temporal resolution that degrade perception when photons are scarce [...]