Last night, Stability AI released Stable Diffusion 3.5, its most powerful model to date. It is not a single model but a suite of three versions aimed at a wide range of users, from researchers and hobbyists to startups and enterprises.

The three versions include Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Turbo, and the upcoming Stable Diffusion 3.5 Medium, set to be released on October 29th.

Stable Diffusion 3.5 Large is the base model, with 8 billion parameters. It offers the highest image quality and the closest prompt adherence of the three versions, and is aimed at professional use, generating images at 1 megapixel resolution.

Stable Diffusion 3.5 Large Turbo is a distilled version of Stable Diffusion 3.5 Large that produces high-quality images in just four steps, making it considerably faster than the base model.

Stable Diffusion 3.5 Medium, with 2.5 billion parameters, uses an improved MMDiT-X architecture and training methodology and is designed to run out of the box on consumer-grade hardware. It balances image quality with customizability and generates images at resolutions from 0.25 to 2 megapixels.
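
To make the megapixel figures concrete, here is a minimal sketch that turns a pixel budget and an aspect ratio into a width and height. The function name dims_for_megapixels is hypothetical, and two details are assumptions rather than anything stated in the release: that one megapixel means 1024 x 1024 pixels, and that image sides should be multiples of 16.

```python
import math

def dims_for_megapixels(megapixels, aspect_ratio=1.0, multiple=16):
    """Return (width, height) that fits within the requested pixel budget.

    Assumptions (not from the release notes): 1 megapixel = 1024 * 1024
    pixels, and each side is rounded down to a multiple of `multiple`.
    """
    target_pixels = megapixels * 1024 * 1024
    # Solve width * height = target_pixels with width / height = aspect_ratio.
    width = math.sqrt(target_pixels * aspect_ratio)
    height = width / aspect_ratio

    def snap_down(value):
        # The small epsilon guards against float error just below a multiple.
        return max(multiple, int(value / multiple + 1e-9) * multiple)

    return snap_down(width), snap_down(height)

if __name__ == "__main__":
    for mp in (0.25, 1.0, 2.0):
        print(mp, dims_for_megapixels(mp))
```

Under these assumptions a 1 megapixel square image comes out as 1024 x 1024, and the 0.25 to 2 megapixel range quoted for Medium spans roughly 512 x 512 to 1440 x 1440.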

Customizability was a priority in developing these models. Stability AI integrated Query-Key Normalization into the transformer blocks, which stabilizes training and simplifies further fine-tuning and development. To keep the models flexible for downstream tasks, the company also preserved a broad knowledge base and diverse styles, at the cost of less predictable output: the same prompt may produce more varied results from one run to the next.
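
The idea behind Query-Key Normalization can be sketched in a few lines of plain Python. This is an illustration only, not Stability AI's implementation: it assumes RMS normalization without a learned scale (production models typically add a learnable gain), and the names rms_norm and attention_logit are hypothetical. Normalizing the query and key vectors before their dot product keeps the attention logit bounded, which is why the technique tends to stabilize training.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is approximately 1."""
    mean_sq = sum(x * x for x in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [x * scale for x in vec]

def attention_logit(q, k, qk_norm=True):
    """Scaled dot-product logit for one query/key pair.

    With qk_norm=True both vectors are RMS-normalized first (an assumed
    variant of QK normalization), so |logit| cannot exceed sqrt(len(q)).
    """
    if qk_norm:
        q, k = rms_norm(q), rms_norm(k)
    dot = sum(a * b for a, b in zip(q, k))
    return dot / math.sqrt(len(q))

if __name__ == "__main__":
    # Deliberately large activations, as can occur during unstable training.
    q = [100.0, -250.0, 80.0, 300.0]
    k = [90.0, -200.0, 120.0, 310.0]
    print("without QK norm:", attention_logit(q, k, qk_norm=False))
    print("with QK norm:   ", attention_logit(q, k, qk_norm=True))
```

Without normalization the logit grows with the magnitude of the activations and can saturate the softmax; with normalization it stays within a fixed range regardless of how large the raw vectors become.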

According to Stability AI, the Stable Diffusion 3.5 models stand out in three areas: customizability, efficient performance, and diverse outputs. They can be fine-tuned to meet specific creative needs or built into applications with custom workflows. They are optimized to run on standard consumer-grade hardware and do not require high-end specifications. They can also produce images that reflect the diversity of the world without extensive prompting, and can generate a wide range of styles and aesthetics, including 3D, photography, painting, line art, and almost any other visual style.

Stability AI also emphasizes its commitment to safety, implementing reasonable measures to prevent the misuse of Stable Diffusion 3.5 and focusing on integrity from the early stages of development. Moreover, the Stability AI community license is very permissive, allowing individuals and organizations to use the model for free for non-commercial purposes, including scientific research. Startups, SMEs, and creators with annual revenues under $1 million can also use the model for free for commercial purposes, retaining ownership of generated media without restrictive licensing constraints.

The Stable Diffusion 3.5 model weights are available on Hugging Face for self-hosting, and the inference code is open source. The models can also be accessed through platforms such as the Stability AI API, Replicate, ComfyUI, and DeepInfra.
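
For self-hosting, one common route is the Hugging Face diffusers library and its StableDiffusion3Pipeline class. The sketch below rests on several assumptions: the repository IDs follow the naming of the demo Space linked in this article, the step counts and guidance scales for Large are typical values rather than figures from the release (only the four steps for Large Turbo comes from the announcement), and the machine has a CUDA GPU with enough memory. Access to the weights may also require accepting the license on Hugging Face and logging in first. The generate helper is hypothetical.

```python
# Assumed per-variant settings; only the 4-step figure for Large Turbo
# comes from the article. Repo IDs and the other values are assumptions.
VARIANTS = {
    "large": {
        "repo": "stabilityai/stable-diffusion-3.5-large",
        "steps": 28,
        "guidance": 3.5,
    },
    "large-turbo": {
        "repo": "stabilityai/stable-diffusion-3.5-large-turbo",
        "steps": 4,
        "guidance": 0.0,
    },
}

def generate(prompt, variant="large", device="cuda"):
    """Load the chosen variant and return one generated PIL image."""
    # Imported lazily so this module loads without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusion3Pipeline

    cfg = VARIANTS[variant]
    pipe = StableDiffusion3Pipeline.from_pretrained(
        cfg["repo"], torch_dtype=torch.bfloat16
    )
    pipe = pipe.to(device)
    result = pipe(
        prompt,
        num_inference_steps=cfg["steps"],
        guidance_scale=cfg["guidance"],
    )
    return result.images[0]

if __name__ == "__main__":
    image = generate("a lighthouse at dawn, line art", variant="large-turbo")
    image.save("lighthouse.png")
```

Switching the variant argument is enough to compare the base model with the four-step Turbo model on the same prompt.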

Try it online: https://huggingface.co/spaces/stabilityai/stable-diffusion-3.5-large