PhD Speaking Qualifier


Wen-Hsuan Chu, PhD Student, Robotics Institute,
Carnegie Mellon University
Tuesday, November 21
1:00 pm to 2:00 pm
NSH 3001
Tracking Any“Thing” in Videos

Being able to track anything is one of the fundamental steps in parsing and understanding a video. In this talk, I will present two pieces of work that tackle this problem at different spatial granularities. In the first half of the talk, I will discuss tracking any video pixel or particle through time in a class-agnostic fashion using scene transformers. This allows us to reason about the motion of multiple pixels and objects jointly. I will show empirically that this outperforms previous approaches that track pixels independently. In the latter half, I will talk about my ongoing work on tracking any object or part through time by appropriately prompting large pretrained detectors, segmentors, and optical flow models. We show that, without any finetuning, our approach gives good tracking performance on diverse existing video benchmarks.

Prof. Katerina Fragkiadaki, Chair
Prof. Christopher Atkeson
Prof. David Held
Jason Zhang