
Diffusion model

From EverybodyWiki Bios & Wiki







Diffusion models[1] are a way of generating realistic images using artificial intelligence. They work by applying Bayesian inference to reverse the process of adding random Gaussian noise to an image. In the forward process, a long Markov chain of typically 1000 steps adds a little random noise at a time to a starting image, gradually degrading it. The reverse step-by-step process is called denoising: generation starts from an image of pure noise, which is then gradually denoised until the final image looks realistic.
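The forward (noising) chain described above can be sketched in a few lines of NumPy. This is a minimal illustration under simplifying assumptions, not the exact procedure of any published model: every step adds the same amount of variance, the image is not rescaled between steps, and the names `forward_noising`, `num_steps` and `total_variance` are chosen here for illustration only.

```python
import numpy as np

def forward_noising(image, num_steps=1000, total_variance=1.0, seed=0):
    """Degrade an image by adding a little Gaussian noise at each step.

    Assumption for this sketch: each step adds independent noise of
    variance total_variance / num_steps, so the variances accumulate
    linearly along the chain.
    """
    rng = np.random.default_rng(seed)
    step_std = np.sqrt(total_variance / num_steps)
    x = image.astype(np.float64)
    trajectory = []
    for _ in range(num_steps):
        # Markov property: the next state depends only on the current one.
        x = x + step_std * rng.standard_normal(x.shape)
        trajectory.append(x)
    return trajectory

# A flat gray "image" makes the accumulated noise easy to measure.
image = np.zeros((32, 32))
trajectory = forward_noising(image)

print("noise variance after 500 steps: ", trajectory[499].var())
print("noise variance after 1000 steps:", trajectory[-1].var())
```

Because independent Gaussian noises add in variance, the measured variance after half of the steps should be close to half of `total_variance`, and close to `total_variance` at the end of the chain.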

A neural network is trained directly on images with random noise added, and this trained network is then used for denoising.
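As a toy illustration of training on noised data, the sketch below replaces the neural network with a single learnable weight `w` and replaces images with scalar "pixels" drawn from a Gaussian. These are assumptions made purely to keep the example small; real diffusion models train a deep network on full images over many noise levels. The weight is fitted by gradient descent so that `w * noisy` approximates the clean value.

```python
import numpy as np

rng = np.random.default_rng(0)

data_var = 4.0    # variance of the clean toy data (assumed)
noise_var = 1.0   # variance of the added Gaussian noise (assumed)
n = 100_000

# Training pairs: clean samples and the same samples with noise added.
clean = rng.normal(0.0, np.sqrt(data_var), size=n)
noisy = clean + rng.normal(0.0, np.sqrt(noise_var), size=n)

# One-parameter "denoiser": prediction = w * noisy.
w = 0.0
learning_rate = 0.05
for _ in range(100):
    # Gradient of the mean squared error between prediction and clean data.
    grad = 2.0 * np.mean((w * noisy - clean) * noisy)
    w -= learning_rate * grad

# For Gaussian data the best linear denoiser has a closed form.
expected_w = data_var / (data_var + noise_var)
print("learned weight:", w, " theoretical optimum:", expected_w)
```

The learned weight should land near `data_var / (data_var + noise_var)`, which shows that the denoiser shrinks the noisy input toward the data mean by an amount that depends on the noise level.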

Denoising an image

DALL-E 2[2] and Imagen[3] are examples of diffusion models.

Technical details

Let $x$ represent the image and $y$ represent the text caption. Let $t$ represent the fraction of random noise added to the image, with $x_t$ being the image with a $t$ fraction of noise added. The variance of the noise is proportional to $t$. One nice property of Gaussian noise is that adding noise of variance $t$ and then adding independent noise of variance $\Delta t$ is equivalent to adding a single Gaussian noise of variance $t + \Delta t$.

The score function is defined as $\nabla_{x_t} \log p(x_t \mid y)$. This function is used as a parameter in denoising according to Bayes' theorem. A small step of denoising is approximately the same as subtracting a bit of Gaussian noise. A differentiable neural network is trained to predict the score function given the inputs $x_t$, $t$ and $y$.

Using Bayes' theorem, $p(x_t \mid y) = p(y \mid x_t)\,p(x_t)/p(y)$, and noting that $p(y)$ does not depend on $x_t$, we find that the score function is $\nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y \mid x_t)$. $p(y \mid x_t)$ is given by the CLIP neural network. The first term is the unconditioned term, which is caption independent. We can modify the score function to $\nabla_{x_t} \log p(x_t) + s\,\nabla_{x_t} \log p(y \mid x_t)$, where $s$ is the guidance parameter.
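The effect of the guidance parameter can be seen in a one-dimensional toy problem. In the sketch below, everything is an assumption made for illustration: the unconditioned density is a standard Gaussian, the caption likelihood is a Gaussian centred on the image value (standing in for a learned model such as CLIP), the noise level is ignored, the guidance parameter is called `s`, and denoising is replaced by simply following the score uphill. The function names `guided_score` and `follow_score` are hypothetical.

```python
def unconditioned_score(x):
    # Gradient of log N(x; 0, 1) with respect to x.
    return -x

def caption_score(x, y, caption_var=1.0):
    # Gradient of log N(y; x, caption_var) with respect to x.
    return (y - x) / caption_var

def guided_score(x, y, s, caption_var=1.0):
    # Unconditioned term plus the caption term scaled by the guidance parameter.
    return unconditioned_score(x) + s * caption_score(x, y, caption_var)

def follow_score(y, s, steps=200, step_size=0.1, x0=0.0):
    # Deterministic stand-in for denoising: repeatedly step along the score.
    x = x0
    for _ in range(steps):
        x = x + step_size * guided_score(x, y, s)
    return x

y = 2.0  # toy "caption" value
for s in (1.0, 3.0):
    # With caption_var = 1 the score vanishes at x = s * y / (1 + s).
    print("s =", s, "-> x settles near", follow_score(y, s))
```

With `s = 1` the result sits at the ordinary Bayesian posterior mode, halfway between the prior mean and the caption value. A larger `s` pulls the result closer to the caption value, which is the intended effect of increasing guidance.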

References

  1. Ho, Jonathan; Jain, Ajay; Abbeel, Pieter (2020-12-16). "Denoising Diffusion Probabilistic Models". arXiv:2006.11239 [cs.LG].
  2. "DALL·E 2". OpenAI. Retrieved 2022-05-25.
  3. "Imagen: Text-to-Image Diffusion Models". imagen.research.google. Retrieved 2022-05-25.


This article "Diffusion model" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:Diffusion model. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.
