YaFSDP
An efficient distributed data parallelism framework designed for large language models.
YaFSDP is a distributed data parallelism framework designed to work well with transformer-like neural network architectures. It is 20% faster than traditional FSDP when pre-training large language models (LLMs) and performs better under high memory pressure. YaFSDP achieves this by reducing the overhead of communication and memory operations.