VIRTUAL SEMINAR
Abstract: While deep learning has revolutionized 3D computer vision, a significant gap remains between the performance achieved in controlled laboratory settings and that in complex, uncontrolled real-world environments. This talk addresses the critical challenges of robustness and generalization that must be met to bridge this gap. I will first discuss our contributions to 3D reconstruction, including robust multi-view reconstruction, physically grounded 3D shape generation, and 3D Gaussian Splatting under sparse-view conditions. Next, I will cover 3D interaction with a focus on generalizable object pose estimation, demonstrating how leveraging different types of reference information can facilitate pose estimation for previously unseen objects in uncontrolled environments. Finally, I will conclude by outlining future directions toward multi-modal 3D understanding, unified 3D representations, and the development of 3D foundation models.
Bio: Chen Zhao is a Postdoctoral Research Fellow at the Computer Vision Lab, EPFL, working with Dr. Mathieu Salzmann and Prof. Pascal Fua. Earlier, he was a PhD candidate at EPFL under the same supervisors. His research interests lie in 3D computer vision, with a specific focus on 3D reconstruction, 3D interaction, and 3D understanding.
Homepage: https://sailor-z.github.io/
Sponsor: The VASC seminar is generously sponsored by HeyGen, an all-in-one AI-powered video generation platform that leverages advances in computer vision, generative modeling, and multimodal learning to make high-quality video creation both scalable and accessible.
