
New research from Google and Princeton University demonstrates that some AI image generators “memorize” and reproduce images from their training data.

In rare cases, this could lead to copyright infringement or sensitive information appearing in AI training sets.

The Stable Diffusion (SD) model had a memorization rate of only 0.03%, a figure that reflects the strength of its filtering and de-duplication pipeline.
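A memorization rate like the one above can be approximated with a nearest-neighbor check: count the fraction of generated samples that fall within a small distance of some training sample. The sketch below is a toy illustration of that idea on flattened pixel vectors, not the paper's exact metric; the function name and threshold are illustrative assumptions.

```python
import numpy as np

def memorization_rate(generated, training, threshold=0.1):
    """Toy proxy for memorization: the fraction of generated samples
    whose nearest training sample (Euclidean distance on flattened
    feature vectors) lies within `threshold`.

    generated, training: 2-D arrays, one flattened sample per row.
    """
    memorized = 0
    for g in generated:
        # Distance from this generated sample to every training sample.
        dists = np.linalg.norm(training - g, axis=1)
        if dists.min() < threshold:
            memorized += 1
    return memorized / len(generated)

# Toy example: one of two "generated" samples duplicates a training row.
training = np.array([[0.0, 0.0], [1.0, 1.0]])
generated = np.array([[0.0, 0.0], [5.0, 5.0]])
print(memorization_rate(generated, training, threshold=0.1))  # 0.5
```

In practice, research in this area tends to compare images in a learned embedding space rather than raw pixels, so near-duplicates that differ by small crops or color shifts are still caught.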

Stable Diffusion relies on a type of algorithm known as latent diffusion (more on that here). Latent diffusion has the benefit of producing detailed, realistic, high-quality images, but now appears prone to reproducing training data. Generative Adversarial Networks (GANs), another class of machine learning model, do not suffer from this problem, but they do not produce content as robust as SD's.

This study highlights the risk that generative AIs may unintentionally reveal private information, such as medical records, that appears in their training sets.
