VASC Seminar: Carl Doersch
Mid-Level Visual Element Discovery as Discriminative Mode Seeking
PhD Student, Machine Learning, Carnegie Mellon University
November 25, 2013, 3:00-4:30PM, NSH 1507
Recent work on mid-level visual representations aims to capture information at a level of complexity higher than typical “visual words”, but lower than full-blown semantic objects. Several approaches have been proposed to discover mid-level visual elements that are both 1) representative, i.e. frequently occurring within a visual dataset, and 2) visually discriminative. However, the current approaches are rather ad hoc and difficult to analyze and evaluate. In this work, we pose visual element discovery as discriminative mode seeking, drawing connections to the well-known and well-studied mean-shift algorithm. Given a weakly-labeled image collection, our method discovers visually-coherent patch clusters that are maximally discriminative with respect to the labels. One advantage of our formulation is that it requires only a single pass through the data. We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset. Finally, we evaluate our method on the task of scene classification, demonstrating state-of-the-art performance on the MIT Scene-67 dataset.
Host: Kris Kitani
Carl Doersch is a PhD student in the Machine Learning Department at CMU, advised by Alyosha Efros and Abhinav Gupta. He holds a bachelor's in computer science/cognitive science and a master's in machine learning, both from CMU. His research focuses on machine learning and computer vision, particularly the use of weak labels such as geolocation and scene categories to learn complex representations for images.