Stability AI has released Stable Diffusion 3.5, the latest version of its deep learning text-to-image model. The release comprises three improved open models aimed at different users, including researchers, enterprise customers, and hobbyists.
Stable Diffusion 3.5 Large is the most powerful model in the series, with 8.1 billion parameters. It offers the best image quality and prompt adherence of the three, is aimed at professional use, and generates images at resolutions of up to 1 megapixel.
Stable Diffusion 3.5 Large Turbo is a distilled version of Stable Diffusion 3.5 Large. It produces high-quality images in just four sampling steps, which makes it considerably faster than Stable Diffusion 3.5 Large and a good fit for users who need quick results.
The third model is Stable Diffusion 3.5 Medium, which has 2.5 billion parameters. It uses an improved MMDiT-X architecture and training methods, and is designed to run "out of the box" on consumer-grade hardware. It balances image quality against ease of customization and can generate images from 0.25 to 2 megapixels.
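Because the resolution range is given in megapixels rather than as fixed dimensions, it can help to translate a pixel budget into a width and height. The following is a minimal sketch of that arithmetic, not something taken from Stability AI's code. It assumes that 1 megapixel means 1024 × 1024 pixels and that dimensions should be rounded to a multiple of 64; the function name and both assumptions are illustrative.

```python
import math


def dims_for_megapixels(megapixels, aspect_w=1, aspect_h=1, multiple=64):
    """Return (width, height) close to a pixel budget for a given aspect ratio.

    Assumptions (not from the announcement): 1 megapixel = 1024 * 1024 pixels,
    and both sides are snapped to the nearest multiple of `multiple`.
    """
    target_pixels = megapixels * 1024 * 1024
    ratio = aspect_w / aspect_h
    # width * height = target_pixels and width / height = ratio
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print(dims_for_megapixels(0.25))        # square image at the low end
    print(dims_for_megapixels(1.0, 16, 9))  # widescreen image near 1 megapixel
```

Snapping to a multiple means the result only approximates the requested budget, so values near the upper limit may land slightly above it.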
The release follows Stable Diffusion 3 Medium, which came out in June and fell short of expectations. Rather than patch that model, Stability AI decided to ship a more substantial update. The company said it hopes the update will restore its competitiveness against rivals such as OpenAI's DALL-E and Midjourney.
A key technical change in the new models is Query-Key Normalization, which normalizes the attention queries and keys inside the transformer blocks. According to the company, this makes the models easier to customize and more responsive to prompts: specific prompts yield more consistent results, while broader prompts yield a wider range of interpretations.
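To make the idea concrete, here is a minimal single-head sketch of attention with Query-Key Normalization in plain Python. It is an illustration under stated assumptions, not Stability AI's implementation: it assumes RMS normalization is applied to each query and key vector, it omits the learnable scale that real layers usually include, and all function names are hypothetical.

```python
import math


def rms_normalize(vec, eps=1e-6):
    """Scale a vector so that its root-mean-square is roughly 1."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_logits(queries, keys, qk_norm=True):
    """Scaled dot-product logits, optionally normalizing queries and keys first."""
    if qk_norm:
        queries = [rms_normalize(q) for q in queries]
        keys = [rms_normalize(k) for k in keys]
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        for q in queries
    ]


def attention(queries, keys, values, qk_norm=True):
    """Single-head attention: softmax over the logits, then a weighted sum of values."""
    output = []
    for row in attention_logits(queries, keys, qk_norm):
        weights = softmax(row)
        output.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return output


if __name__ == "__main__":
    q = [[1000.0, -2000.0, 500.0, 0.0]]
    k = [[300.0, 100.0, -50.0, 20.0], [-1.0, 2.0, 0.5, 0.0]]
    print(attention_logits(q, k, qk_norm=False))  # grows with input magnitude
    print(attention_logits(q, k, qk_norm=True))   # stays within a fixed range
```

With normalization, each logit's magnitude is bounded by the square root of the vector dimension regardless of how large the raw activations are. Unnormalized logits have no such bound, and large values can saturate the softmax.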
The Stable Diffusion 3.5 models are released under Stability AI's community license, which permits free non-commercial use. Organizations with annual revenue below $1 million can also use them commercially at no cost, while those above that threshold need to obtain an enterprise license.
The models are available through Stability AI's API, and the weights needed for self-hosting are available on Hugging Face. Advanced image customization through ControlNets is expected to follow in the coming days.
Official site:
https://stability.ai/stable-image
Hugging Face pages for the three models:
https://huggingface.co/stabilityai/stable-diffusion-3.5-large
https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo
https://huggingface.co/stabilityai/stable-diffusion-3.5-medium
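For readers who want to self-host, the sketch below shows one way to load any of the three repositories above with the Hugging Face `diffusers` library and its `StableDiffusion3Pipeline` class. The step counts and guidance scales are values recalled from the model cards and should be treated as assumed starting points, apart from the four steps for Large Turbo, which the announcement states. The helper names `generation_kwargs` and `generate` are hypothetical. The heavy imports sit inside `generate` so the settings can be inspected without a GPU.

```python
# Sampler settings per repository. Assumed defaults; check each model card.
MODEL_SETTINGS = {
    "stabilityai/stable-diffusion-3.5-large": {
        "num_inference_steps": 28,
        "guidance_scale": 3.5,
    },
    "stabilityai/stable-diffusion-3.5-large-turbo": {
        "num_inference_steps": 4,   # four-step distilled model
        "guidance_scale": 0.0,
    },
    "stabilityai/stable-diffusion-3.5-medium": {
        "num_inference_steps": 40,
        "guidance_scale": 4.5,
    },
}


def generation_kwargs(repo_id, prompt):
    """Build the keyword arguments passed to the pipeline call."""
    return {"prompt": prompt, **MODEL_SETTINGS[repo_id]}


def generate(repo_id, prompt, device="cuda"):
    """Download the weights (if needed) and return one PIL image."""
    # Imported lazily: requires `torch` and a recent `diffusers` release.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        repo_id, torch_dtype=torch.bfloat16
    )
    pipe = pipe.to(device)
    return pipe(**generation_kwargs(repo_id, prompt)).images[0]


if __name__ == "__main__":
    image = generate(
        "stabilityai/stable-diffusion-3.5-large-turbo",
        "a lighthouse at dusk, watercolor",
    )
    image.save("lighthouse.png")
```

Downloading the weights may require accepting the license on the model page and authenticating with a Hugging Face access token first.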
Key Points:
🌟 The newly launched Stable Diffusion 3.5 offers three model versions to meet different user needs.
⚡ Stable Diffusion 3.5 Large Turbo features faster image generation speed, ideal for quick creation.
📈 The new model introduces Query-Key Normalization technology, improving customization and responsiveness.