VASC Seminar

Yong Jae Lee, Associate Professor, Department of Computer Sciences, University of Wisconsin-Madison
Monday, October 9
3:30 pm to 4:30 pm
Newell-Simon Hall 3305
Large Multimodal (Vision-Language) Models for Image Generation and Understanding
Abstract:
Large Language Models and Large Vision Models, also known as Foundation Models, have led to unprecedented advances in language understanding, visual understanding, and AI. In particular, many computer vision problems, including image classification, object detection, and image generation, have benefited from the capabilities of such models trained on internet-scale text and visual data. In this talk, I’ll focus on our recent work on Large Multimodal (Vision-Language) Models (LMMs) for controllable image generation (GLIGEN) and language-and-vision chatbot assistance (LLaVA). Since training foundation models from scratch can be prohibitively expensive, a key challenge is how to efficiently and effectively adapt and repurpose them for downstream tasks of interest. I’ll share key insights into how we achieve this, explain the models’ inner workings, and discuss their limitations and future directions.
Bio:
Yong Jae Lee is an Associate Professor in the Department of Computer Sciences at the University of Wisconsin-Madison. His research interests are in computer vision and machine learning, with a focus on creating robust visual recognition systems that can learn to understand the visual world with minimal human supervision. Before joining UW-Madison in 2021, he spent one year as an AI Visiting Faculty at Cruise, and before that, six years as an Assistant and then Associate Professor at UC Davis. He received his Ph.D. from the University of Texas at Austin in 2012, advised by Kristen Grauman, and was a postdoc at Carnegie Mellon University (2012-2013) and UC Berkeley (2013-2014), advised by Alyosha Efros. He is a recipient of the ARO Young Investigator Program Award (2017), UC Davis Hellman Fellowship (2017), NSF CAREER Award (2018), AWS Machine Learning Research Awards (2018, 2019), Adobe Data Science Research Awards (2019, 2022), UC Davis College of Engineering Outstanding Junior Faculty Award (2019), Sony Focused Research Awards (2020, 2023), and UW-Madison SACM Student Choice Professor of the Year Award (2022). He and his collaborators received the Most Innovative Award at the COCO Object Detection Challenge at ICCV 2019 and the Best Paper Award at BMVC 2020.
Homepage:
https://pages.cs.wisc.edu/~yongjaelee/
Sponsored in part by: Meta Reality Labs Pittsburgh