Generating Images with AI Diffusion Models

Pope Francis wearing Balenciaga

What's behind the magic?

Harry Potter Magic Patronum Spell Xvga

How do diffusion models associate images with words?

How can diffusion models generate hybrids of different images?

Harry Potter Hagrid Balenciaga Meme Xvga

Associating images with words

Uploading images

Alt Tag In Wordpress Star Wars

Viewing images

Alt Tag In Html Star Wars

Captions captured in LAION/CLIP dataset

Ai Captions In Laion Network Diagram
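
The alt-text examples above suggest how image and caption pairs can be harvested from ordinary web pages. The following is a minimal sketch of that idea using only Python's standard library. It assumes captions come from the `alt` attribute of `<img>` tags; the class name `AltTextCollector`, the function `extract_caption_pairs`, and the sample HTML are hypothetical, and this is not the actual LAION pipeline.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects (src, alt) pairs from <img> tags that carry non-empty alt text."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src")
        # A missing or empty alt attribute gives us no caption, so skip it.
        alt = (attr.get("alt") or "").strip()
        if src and alt:
            self.pairs.append((src, alt))


def extract_caption_pairs(html):
    """Return a list of (image URL, caption) tuples found in an HTML string."""
    collector = AltTextCollector()
    collector.feed(html)
    return collector.pairs


# Hypothetical page fragment: only the first image has usable alt text.
sample_html = """
<p>A blog post</p>
<img src="https://example.com/xwing.jpg" alt="An X-wing fighter from Star Wars">
<img src="https://example.com/spacer.gif" alt="">
<img src="https://example.com/logo.png">
"""

pairs = extract_caption_pairs(sample_html)
for src, alt in pairs:
    print(src, "->", alt)
```

Images without alt text are dropped, which is one reason the words an author types when uploading an image matter for what a model can later learn.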

Generating hybrid images

If generative AI creates by "averaging," can we just average the relevant images in our dataset to create new hybrids?

martian_daisy_10
martian_daisy_20

AI can't generate hybrid images just by averaging pixels ☹️
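
To see why plain averaging disappoints, here is a small sketch with NumPy. It assumes two grayscale images stored as arrays with values in [0, 1]; the tiny arrays and the function name `average_images` are made up for illustration. Each image contains one bright rectangle, and the two rectangles partly overlap.

```python
import numpy as np


def average_images(a, b):
    """Pixel-wise mean of two same-shaped images with values in [0, 1]."""
    return (a + b) / 2.0


# Two toy 4x8 "photos", each with one bright rectangle.
a = np.zeros((4, 8))
a[:, 1:5] = 1.0  # rectangle covering columns 1-4
b = np.zeros((4, 8))
b[:, 3:7] = 1.0  # rectangle covering columns 3-6

blend = average_images(a, b)
print(blend[0])
```

The blend keeps both rectangles as half-brightness ghosts, with full brightness only where they overlap. The result is a double exposure of the two inputs, not a single new shape.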

A better approach to generating hybrid images

The solution turns out to be related to the problem of removing noise from a photo 🤔

The connection is that good de-noising algorithms effectively work by segmenting the image into shapes: they have to decide which pixels belong to real structure and which are just noise.

denoise_10
denoise_20
denoise_30
denoise_40
denoise_50
denoise_60
denoise_70

Finding noise means finding shapes

A de-noising algorithm can find shapes in noise 💪

💡 So instead of averaging the original photos, we could

  1. Add noise to both images
  2. Average the noisy results
  3. De-noise to find new shapes
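
The three steps above can be sketched with NumPy as a toy pipeline. Several things here are assumptions for illustration: the square-root noise mixing, the `noise_level` parameter, and all function names. Most importantly, `box_blur_denoise` is only a stand-in: a real diffusion model uses a trained neural network as its denoiser, and a simple blur will not invent new shapes the way that network can.

```python
import numpy as np


def add_noise(image, noise_level, rng):
    """Mix an image with Gaussian noise (0 keeps the image, 1 is pure noise)."""
    noise = rng.standard_normal(image.shape)
    return np.sqrt(1.0 - noise_level) * image + np.sqrt(noise_level) * noise


def box_blur_denoise(image, passes=1):
    """Stand-in denoiser: repeated 3x3 box blur. NOT a learned model."""
    out = image
    for _ in range(passes):
        padded = np.pad(out, 1, mode="edge")
        h, w = out.shape
        # Average each pixel with its eight neighbours.
        out = sum(
            padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)
        ) / 9.0
    return out


def noisy_hybrid(a, b, noise_level=0.5, seed=0):
    """Step 1: add noise. Step 2: average. Step 3: de-noise."""
    rng = np.random.default_rng(seed)
    noisy_a = add_noise(a, noise_level, rng)
    noisy_b = add_noise(b, noise_level, rng)
    mixed = (noisy_a + noisy_b) / 2.0
    return box_blur_denoise(mixed, passes=2)


# Toy grayscale inputs with values in [0, 1].
a = np.zeros((16, 16))
a[4:12, 2:8] = 1.0
b = np.zeros((16, 16))
b[4:12, 8:14] = 1.0

hybrid = noisy_hybrid(a, b)
print(hybrid.shape)
```

The structure is the point of the sketch: the averaging happens on the noisy versions, and the final image is whatever the denoiser decides is hiding in that noise.
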
martian_daisy_30
martian_daisy_40
martian_daisy_50

The transporter analogy

Star Trek Voyager, Tuvix

An infamous Star Trek episode about a transporter malfunction offers an analogy:

Creating a hybrid of two people is easier while their molecules are scrambled than in their original form.

tuvix_10
tuvix_20
tuvix_30
tuvix_40

Results of averaging noisy images

Stv Tuvix Smiling
Harry Potter Balenciaga Meme Xvga
