What happens when an AI model is retrained on images it generated itself? Researchers from Stanford University and the University of California, Berkeley recently ran an experiment to find out, and the results were surprising.

The researchers found that when AI image generation models were retrained using images they themselves generated, these models produced highly distorted images. Worse still, this distortion was not limited to the text prompts used for retraining; once a model was "contaminated," it was difficult to fully recover, even if only real images were used for subsequent retraining.

The experiment was built around the open-source Stable Diffusion (SD) model. The researchers first selected 70,000 high-quality face images from the FFHQ face dataset and automatically categorized them. They then used Stable Diffusion to generate 900 images consistent with specific demographic characteristics.
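As a rough illustration of this generation step, here is a minimal sketch using the Hugging Face `diffusers` library. The checkpoint name, prompt, and sampling settings are assumptions for illustration; the paper does not prescribe this exact code.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load an off-the-shelf Stable Diffusion checkpoint (the model ID here is an
# assumption; the authors' exact checkpoint and settings may differ).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical prompt describing a specific demographic group, mirroring the
# "images consistent with specific demographic characteristics" step.
prompt = "a photograph of a middle-aged woman, plain background"
images = pipe(prompt, num_images_per_prompt=4).images  # list of PIL images

for i, img in enumerate(images):
    img.save(f"generated_{i}.png")
```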

Next, the researchers used these generated images for iterative retraining of the model. They found that regardless of the proportion of self-generated images in the retraining dataset, the model would eventually collapse, with a sharp decline in the quality of the generated images. Even when the retraining dataset contained only 3% self-generated images, the phenomenon of model collapse persisted.
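The retraining loop can be summarized with the sketch below. `finetune` and `generate` are hypothetical callables standing in for Stable Diffusion fine-tuning and sampling, which are not shown; only the data-mixing logic is concrete.

```python
import random

def iterative_retrain(real_images, finetune, generate,
                      iterations=5, synthetic_fraction=0.03):
    """Sketch of the self-consuming retraining loop described in the paper.

    `finetune` and `generate` are hypothetical placeholders for the actual
    Stable Diffusion fine-tuning and sampling steps.
    """
    dataset = list(real_images)
    model = None
    for _ in range(iterations):
        # Retrain the model on the current mix of real and self-generated data.
        model = finetune(dataset)
        # Sample fresh synthetic images from the newly retrained model.
        synthetic = generate(model, len(dataset))
        # Replace a fraction of the real data with self-generated images;
        # even a 3% fraction was enough to trigger collapse in the experiments.
        k = int(synthetic_fraction * len(dataset))
        dataset = (random.sample(list(real_images), len(dataset) - k)
                   + random.sample(synthetic, k))
    return model
```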


The experimental results showed that the baseline version of the Stable Diffusion model generated images that were consistent with text prompts and had high visual quality. However, after iterative retraining, the generated images began to exhibit semantic inconsistencies and visual distortions. The researchers also found that model collapse not only affected image quality but also resulted in a lack of diversity in the generated images.

To verify this, the researchers conducted control experiments, attempting to mitigate the impact of model collapse by adjusting the color histogram of the generated images and removing low-quality images. However, the results indicated that these measures were not effective in preventing model collapse.
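As a concrete example of one such mitigation, a color-histogram adjustment could look like the following sketch using scikit-image; the exact correction and quality-filtering procedures used by the authors are not reproduced here, and the file paths are placeholders.

```python
import numpy as np
from skimage import io
from skimage.exposure import match_histograms

def color_correct(generated_path, reference_path, out_path):
    """Match a generated image's color histogram to a real reference image.

    This mirrors the kind of color-histogram adjustment the researchers tried
    as a mitigation; the choice of reference image is an assumption.
    """
    generated = io.imread(generated_path)
    reference = io.imread(reference_path)
    # Map the generated image's per-channel histogram onto the reference's.
    matched = match_histograms(generated, reference, channel_axis=-1)
    io.imsave(out_path, matched.astype(np.uint8))
```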

The researchers also explored whether a model could recover after being "contaminated." They found that, although image quality improved in some cases after several further rounds of retraining on real images, signs of model collapse still persisted. This suggests that once a model is "contaminated," the effects can be long-lasting, and possibly irreversible.

This study highlights an important issue: today's popular diffusion-based text-to-image systems are highly sensitive to data "contamination." Such contamination can happen unintentionally, for example when training images are scraped indiscriminately from the web, or deliberately, as a targeted attack in which "contaminated" data is planted on websites.

Facing these challenges, the researchers proposed some possible defenses, such as using detectors of image authenticity to exclude AI-generated images from training data, or adding watermarks to generated images so they can be identified later. Neither method is perfect on its own, but combined they could significantly reduce the risk of data "contamination."
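One way to picture such a defense is as a filtering step applied before training; `is_ai_generated` below is a hypothetical detector (a real system might also check for embedded watermarks), not an implementation from the paper.

```python
def filter_training_pool(images, is_ai_generated, threshold=0.5):
    """Drop images flagged as likely AI-generated before they enter training.

    `is_ai_generated` is a hypothetical detector returning a score in [0, 1];
    a watermark check could play the same gatekeeping role.
    """
    return [img for img in images if is_ai_generated(img) < threshold]
```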

This study reminds us that the development of AI technology is not without risks. We need to handle AI-generated content more cautiously to ensure it does not have a long-term negative impact on our models and datasets. Future research needs to further explore how to make AI systems more resilient to this type of data "contamination" or develop technologies that can accelerate the "healing" of models.

Paper link: https://arxiv.org/pdf/2311.12202