PhD Thesis Proposal

Mengtian Li
Robotics Institute, Carnegie Mellon University
Wednesday, July 7
12:00 pm to 1:00 pm
Resource-Constrained Learning and Inference for Visual Perception

Abstract:
Real-world applications usually require computer vision algorithms to meet certain resource constraints. In this talk, I will present evaluation methods and principled solutions for both training and testing. First, I will talk about a formal setting for studying training under the non-asymptotic, resource-constrained regime, i.e., budgeted training. We analyze the following problem: “given a dataset, algorithm, and fixed resource budget, what is the best achievable performance?” Such a setting could be essential for the democratization of deep learning. Second, I will talk about how vision algorithms should respond to resource constraints inherent in embodied perception, where an autonomous agent needs to perceive its environment and (re)act in time. We introduce a meta-benchmark that systematically converts any single-frame understanding task into a streaming understanding task. This streaming perception framework yields several surprising conclusions and solutions. Third, I will talk about an unconventional approach to streaming object detection. Image downsampling is a commonly adopted technique for ensuring that the latency constraint is met. However, this naive approach greatly restricts an object detector’s ability to identify small objects. Inspired by foveated human vision, we elastically magnify certain regions while maintaining a small input canvas. With attentional magnification, we set a new record for streaming AP on Argoverse-HD.

Thesis Committee Members:
Deva Ramanan, Chair
Martial Hebert
Mahadev Satyanarayanan
Raquel Urtasun, Waabi & University of Toronto
Ross Girshick, Facebook AI Research