Technical Report on Stable Diffusion 3 Reveals Sora-like Architecture Details

量子位

Published in AI News · 1 min read · Mar 6, 2024

The Stable Diffusion 3 (SD3) technical report describes in detail the multimodal diffusion Transformer architecture, MMDiT, that SD3 uses. MMDiT improves performance by using separate sets of weights for the image and text representations. The report also introduces a reweighted flow technique in SD3 and presents scaling studies that bear on future performance improvements. In addition, it discusses issues with the text encoder and offers recommendations. Overall, SD3's technical innovations and performance are impressive.
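To make the "separate sets of weights" idea concrete, here is a minimal NumPy sketch of a two-stream attention step. It is an illustration under stated assumptions, not SD3's actual implementation: the names `ModalityStream` and `joint_attention` are hypothetical, the block is single-head, and it omits normalization, conditioning, and the MLP. It also assumes that the two token sequences are concatenated for the attention operation so that text and image tokens can attend to each other, a detail this summary does not state.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ModalityStream:
    """Hypothetical container: one independent weight set per modality."""

    def __init__(self, dim, rng):
        scale = 1.0 / np.sqrt(dim)
        self.w_q = rng.normal(0.0, scale, (dim, dim))
        self.w_k = rng.normal(0.0, scale, (dim, dim))
        self.w_v = rng.normal(0.0, scale, (dim, dim))
        self.w_out = rng.normal(0.0, scale, (dim, dim))

    def qkv(self, tokens):
        # Project this modality's tokens with its own weights.
        return tokens @ self.w_q, tokens @ self.w_k, tokens @ self.w_v


def joint_attention(text_tokens, image_tokens, text_stream, image_stream):
    """Single-head attention over the concatenated text and image tokens."""
    n_text = text_tokens.shape[0]
    q_t, k_t, v_t = text_stream.qkv(text_tokens)
    q_i, k_i, v_i = image_stream.qkv(image_tokens)

    # Assumption: both sequences share one attention operation.
    q = np.concatenate([q_t, q_i], axis=0)
    k = np.concatenate([k_t, k_i], axis=0)
    v = np.concatenate([v_t, v_i], axis=0)

    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    mixed = attn @ v

    # Split back and apply each modality's own output projection,
    # with a residual connection.
    text_out = text_tokens + mixed[:n_text] @ text_stream.w_out
    image_out = image_tokens + mixed[n_text:] @ image_stream.w_out
    return text_out, image_out


rng = np.random.default_rng(0)
dim = 16
text_stream = ModalityStream(dim, rng)
image_stream = ModalityStream(dim, rng)
text_tokens = rng.normal(size=(4, dim))    # 4 text tokens
image_tokens = rng.normal(size=(9, dim))   # 9 image patch tokens

text_out, image_out = joint_attention(
    text_tokens, image_tokens, text_stream, image_stream
)
print(text_out.shape, image_out.shape)  # (4, 16) (9, 16)
```

In this sketch each modality keeps its own projections, so the text and image representations are transformed independently, while the shared attention step is the only place where information flows between them.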

Stable Diffusion 3 · MMDiT · Reweighted Flow Technology

This article is from AIbase Daily

Created by the AIbase Daily Team