PLAI Talk: Generative AI, Stable Diffusion, and the Revolution in Visual Synthesis - Björn Ommer, Professor, LMU
DATE: Fri, June 23, 2023 - 10:00 am
LOCATION: UBC Vancouver Campus, ICCS X836
Recently, deep generative modeling has become the most prominent paradigm for learning powerful representations of our (visual) world and for generating novel samples thereof. Consequently, it now serves as the main building block of numerous approaches and practical applications. This talk will contrast the most commonly used generative models to date, with a particular focus on denoising diffusion probabilistic models, the core of the currently leading approaches to visual synthesis. Despite their enormous potential, these models come with their own specific limitations. We will then discuss a solution, latent diffusion models, a.k.a. "Stable Diffusion", that significantly improves the efficiency of diffusion models. Now billions of training samples can be summarized in compact representations of just a few gigabytes, so that the approach runs on consumer hardware. Making high-quality visual synthesis accessible to everyone has revolutionized the way we create visual content and has spurred research and the development of numerous novel applications.
We will then discuss recent extensions that cast an interesting perspective on future generative modeling: rather than having powerful likelihood models memorize local image details, we focus their representational power on scene composition. Time permitting, the talk will also cover approaches to 3D novel-view synthesis from only a single source image and with no need for a geometric prior model.
Björn Ommer is a full professor at LMU, where he heads the Computer Vision & Learning Group (previously Computer Vision Group Heidelberg). Previously, he was a full professor at the Department of Mathematics and Computer Science of Heidelberg University and also served as one of the directors of the Interdisciplinary Center for Scientific Computing (IWR) and of the Heidelberg Collaboratory for Image Processing (HCI).
He studied computer science, with physics as a minor subject, at the University of Bonn, Germany. After that, he pursued his doctoral studies in computer science at ETH Zurich, where he received his Ph.D. degree for his dissertation “Learning the Compositional Nature of Objects for Visual Recognition”, which was awarded the ETH Medal. Thereafter, Björn held a post-doc position in the Computer Vision Group of Jitendra Malik at UC Berkeley.
He serves as an associate editor for the journal IEEE T-PAMI and previously for Pattern Recognition Letters. Björn is an ELLIS member and ELLIS unit faculty of the ELLIS unit Munich and a PI of the Munich Center for Machine Learning (MCML). He has served as Area Chair for multiple CVPR, ICCV, and ECCV conferences and as workshop and tutorial organizer at these venues.