OpenDiT
OpenDiT: A simple, fast, and efficient DiT training and inference system.
OpenDiT is an open-source project providing a high-performance implementation of Diffusion Transformer (DiT) based on Colossal-AI. It is designed to improve the training and inference efficiency of DiT applications, including text-to-video and text-to-image generation. OpenDiT improves performance and usability through the following:
* Up to 80% speedup and 50% memory reduction on GPU;
* Kernel optimizations including FlashAttention, fused AdaLN, and fused LayerNorm;
* Hybrid parallelism methods including ZeRO, Gemini, and DDP, plus sharding of the EMA model to further reduce memory cost;
* FastSeq: a novel sequence parallelism method well suited to DiT-like workloads, where activations are large but parameters are small. It saves up to 48% of the communication cost of intra-node sequence parallelism and overcomes the memory limit of a single GPU, reducing overall training and inference time;
* Ease of use: significant performance gains with only minimal code changes, and users do not need to understand how distributed training is implemented;
* Complete text-to-image and text-to-video generation pipelines, which researchers and engineers can use and adapt to real-world applications without modifying the parallelism code;
* Text-to-image training on ImageNet, with released checkpoints.
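
To make the sequence parallelism idea from the list above more concrete, here is a minimal single-process sketch in plain Python. It is not FastSeq and does not use OpenDiT or Colossal-AI APIs. It only illustrates the general principle: the token sequence is split across workers so each one holds a fraction of the activations, and attention still sees every token because keys and values are gathered. The function names (`split_sequence`, `sequence_parallel_attention`), the simulated "ranks", and the simplification that queries, keys, and values are the raw tokens are all assumptions made for illustration.

```python
import math


def attention(q, k, v):
    """Plain softmax attention over lists of token vectors."""
    dim = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(dim) for kj in k]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[d] for w, vj in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out


def split_sequence(tokens, world_size):
    """Split tokens into contiguous chunks, one per simulated GPU.

    Assumes len(tokens) is divisible by world_size.
    """
    chunk = len(tokens) // world_size
    return [tokens[r * chunk:(r + 1) * chunk] for r in range(world_size)]


def sequence_parallel_attention(tokens, world_size):
    """Hypothetical sketch: each rank keeps only its own slice of queries."""
    shards = split_sequence(tokens, world_size)
    # Simulated all-gather: every rank receives the keys/values of all ranks.
    gathered = [tok for shard in shards for tok in shard]
    # Each rank computes attention only for its local queries.
    local_outputs = [attention(shard, gathered, gathered) for shard in shards]
    # Concatenate the per-rank outputs back into sequence order.
    return [row for out in local_outputs for row in out]


# Toy input: 8 tokens of dimension 4 (q = k = v = tokens for simplicity).
tokens = [[0.1 * (i + d) for d in range(4)] for i in range(8)]
full = attention(tokens, tokens, tokens)
sharded = sequence_parallel_attention(tokens, world_size=4)
max_diff = max(abs(a - b) for x, y in zip(full, sharded) for a, b in zip(x, y))
```

In this sketch each simulated rank stores the outputs for only a quarter of the tokens, while the combined result matches the unsharded computation. A real system also has to manage the communication cost of the gather step, which is the part the FastSeq item above claims to reduce.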