Pipeline parallelism is a key component of large-scale distributed training, yet its efficiency suffers from pipeline bubbles. We introduce a scheduling strategy that achieves zero pipeline bubbles under synchronous training semantics. The core idea behind this improvement is to split the backward computation into two parts: one that computes the gradients with respect to the inputs, and one that computes the gradients with respect to the parameters. Based on this idea, we hand-design novel pipeline schedules that significantly outperform baseline methods. We further develop an algorithm that automatically finds an optimal schedule for a given model configuration and memory limit. In addition, to truly achieve zero bubbles, we introduce a novel technique that bypasses synchronizations during the optimizer step. Experimental evaluations show that our method achieves up to 23% higher throughput than the 1F1B schedule under a similar memory limit; this figure rises to 31% when the memory constraint is relaxed. We believe our results mark an important step toward realizing the full potential of pipeline parallelism.
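To make the backward split concrete, the following is a minimal sketch (not the paper's implementation) of how the two halves of the backward pass decouple for a single linear layer y = x W^T in PyTorch: the input-gradient part (B) is all that the previous pipeline stage needs to proceed, while the weight-gradient part (W) can be deferred to fill idle slots. The function names `backward_b` and `backward_w` are illustrative, not from the paper.

```python
import torch


def backward_b(weight: torch.Tensor, grad_output: torch.Tensor) -> torch.Tensor:
    """Input-gradient half of the split backward ('B'):
    dL/dx = dL/dy @ W. Only this result must be sent upstream,
    so it is scheduled as early as possible."""
    return grad_output @ weight


def backward_w(x: torch.Tensor, grad_output: torch.Tensor) -> torch.Tensor:
    """Parameter-gradient half of the split backward ('W'):
    dL/dW = (dL/dy)^T @ x. No other stage depends on it,
    so it can be delayed to fill what would otherwise be a bubble."""
    return grad_output.t() @ x


# Usage: the two halves together reproduce the ordinary backward pass.
x = torch.randn(4, 8)            # activations saved from the forward pass
W = torch.randn(16, 8)           # layer weight (out_features x in_features)
grad_y = torch.randn(4, 16)      # gradient arriving from the next stage

grad_x = backward_b(W, grad_y)   # run immediately; unblocks the previous stage
grad_W = backward_w(x, grad_y)   # run later, wherever the schedule has a gap
```

Splitting the backward this way changes nothing numerically; it only exposes finer-grained units that a pipeline schedule can reorder to eliminate bubbles.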