VASC Seminar: Larry Zitnick
Exploring the semantic understanding of abstract scenes
Senior Researcher, Microsoft Research
November 19, 2012
The semantic understanding of images depends on the presence of objects, their attributes, and their relations to other objects. Extracting this complex visual information from an image is, in general, a difficult and still unsolved problem. In this talk, I propose studying semantic information in abstract images created from collections of clip art. Abstract images provide several advantages. They allow high-level semantic information to be studied directly, since they remove the reliance on noisy low-level object, attribute, and relation detectors, and on the tedious hand-labeling of images. Importantly, the use of abstract images also makes it possible to generate sets of semantically similar scenes, which would be nearly impossible with real images. We create 1,002 sets of 10 semantically similar scenes with corresponding written descriptions. We thoroughly analyze this dataset to discover semantically important features, the relations of words to visual features, and measures of semantic similarity. Our results show that abstract images have significant potential for providing new insights into computer vision, NLP, and related fields.
Host: Abhinav Gupta
Appointments: Yong Jae Lee (firstname.lastname@example.org)
C. Lawrence Zitnick received the PhD degree in robotics from Carnegie Mellon University in 2003. His thesis focused on a maximum entropy approach to efficient inference. Previously, his work centered on stereo vision, including the development of a commercial portable 3D camera. Currently, he is a senior researcher in the Interactive Visual Media group at Microsoft Research, where he is exploring object recognition and computational photography.