This is the first of a series of three DALL-E 2 talks delivered at the (usually not recorded) weekly journal club "Deep Learning: Classics and Trends" (https://mlcollective.org/dlct).
Speaker: Aditya Ramesh
Title: Manipulating Images with DALL-E 2
Abstract: I'll begin by motivating the hierarchical modeling approach called unCLIP that underlies DALL-E 2. Then, I'll present a demo of three of the capabilities that it enables: inpainting, variations, and text diffs. These capabilities rely on the bipartite latent representation that unCLIP provides for any given image. I'll discuss how images are mapped to this representation, and how it can be used to implement the capabilities shown earlier. Finally, I'll show how inverting the representations of a discriminative model allows us to gain insight into its "stupid mistakes", as well as limitations of the unCLIP approach.
Speaker bio: Aditya is a researcher at OpenAI who works on generative models.
Paper link:
https://arxiv.org/abs/2204.06125