Originally published at: https://developer.nvidia.com/blog/fast-inversion-for-real-time-image-editing-with-text/
Text-to-image diffusion models can generate diverse, high-fidelity images from user-provided text prompts. They operate by mapping a random sample from a high-dimensional space, conditioned on the text prompt, through a series of denoising steps. This results in a representation of the corresponding image. These models can also be used for more complex…