Disney's research team has recently introduced an image compression technique built on the open-source Stable Diffusion V1.2 model. The method can produce more realistic images at lower bit rates than competing approaches. The new codec is significantly more complex than traditional codecs such as JPEG and AV1, but its results are impressive.

The study shows that the new method excels at restoring image detail while greatly reducing training costs. The researchers observed that quantization error (quantization being a core step in image compression) closely resembles noise (the core ingredient of diffusion models), so a conventionally quantized image can be treated as a noisy version of the original. The diffusion model's denoising process is then used to reconstruct the image at the target bit rate.
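
The analogy between quantization error and noise can be checked with a toy experiment. The sketch below is illustrative only and is not the paper's pipeline: it quantizes synthetic unit-variance "latent" values with a uniform step, compares the error variance with the textbook uniform-noise prediction of step²/12, and then looks up the timestep of an assumed linear noise schedule whose noise level covers that error. The function names, the schedule parameters, and the timestep lookup rule are all assumptions made for illustration; a real codec would choose these quantities differently.

```python
import random
import statistics


def quantize(values, step):
    """Uniform scalar quantization: snap each value to the nearest multiple of `step`."""
    return [step * round(v / step) for v in values]


def error_variance(values, step):
    """Empirical variance of the quantization error for a given step size."""
    errors = [q - v for q, v in zip(quantize(values, step), values)]
    return statistics.pvariance(errors)


def noise_to_signal_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Noise-to-signal ratio (1 - a_bar_t) / a_bar_t per timestep.

    Assumption: a generic linear beta schedule, not the schedule of any
    particular released model.
    """
    ratios = []
    alpha_bar = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        alpha_bar *= 1.0 - beta
        ratios.append((1.0 - alpha_bar) / alpha_bar)
    return ratios


def matching_timestep(noise_variance, ratios):
    """Hypothetical rule: first timestep whose noise level covers the error variance."""
    for t, ratio in enumerate(ratios, start=1):
        if ratio >= noise_variance:
            return t
    return len(ratios)


# Synthetic stand-in for image latents (unit-variance Gaussian values).
rng = random.Random(0)
latent = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
ratios = noise_to_signal_schedule()

for step in (0.1, 0.5):  # a larger step means coarser quantization and a lower bit rate
    var = error_variance(latent, step)
    print(
        f"step={step}: error variance={var:.5f}, "
        f"uniform-noise prediction={step ** 2 / 12:.5f}, "
        f"matching timestep={matching_timestep(var, ratios)}"
    )
```

In this toy setup, coarser quantization produces a larger error variance and therefore maps to a later (noisier) timestep, which means more denoising work is needed to reconstruct the image.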

In a series of tests, Disney's new method outperformed previous image compression technologies in accuracy and detail recovery. The researchers state that their method requires no additional fine-tuning of the diffusion model and can use existing foundation models as they are. The codec's main strength is the realism of its reconstructions, although in some cases it may "hallucinate," generating details that do not exist in the original image.

Hallucinated details matter little for artwork and ordinary photos, but the risk is more significant in applications that depend on exact detail, such as court evidence, facial recognition data, and optical character recognition (OCR) scans. The technology is still in its infancy, and challenges of this kind will become more visible as AI-enhanced image compression advances.

The goal of the work is more efficient image storage. The team trained on the Vimeo-90k dataset and tested on multiple datasets, showing that the method outperforms previous methods on several image quality metrics. A user study also confirmed that viewers preferred its reconstructions.

Paper: https://studios.disneyresearch.com/app/uploads/2024/09/Lossy-Image-Compression-with-Foundation-Diffusion-Models-Supplementary-1.pdf

Key Points:

1. 🖼️ Disney's new AI image compression technology can generate more realistic images at lower bit rates.

2. ⚙️ The method excels at detail recovery and keeps training costs low, requiring no additional fine-tuning of the diffusion model.

3. ⚠️ Despite its strong results, it may generate details not present in the original images, posing a "hallucination" risk.