Russian tech giant Yandex has open-sourced YaFSDP, its in-house tool for optimizing large language model (LLM) training, to the global AI community, billing it as the most efficient method of its kind currently available. Compared with the FSDP technology widely used in the industry, YaFSDP can speed up LLM training by as much as 26%, potentially saving AI developers and businesses significant GPU resources.

YaFSDP (Yandex Fully Sharded Data Parallel) is Yandex's enhanced version of FSDP, focused on optimizing GPU communication efficiency and memory usage to eliminate bottlenecks in LLM training. It delivers its strongest gains in communication-heavy stages such as pre-training, alignment, and fine-tuning, particularly for models with 30 billion to 70 billion parameters.
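
The announcement does not show YaFSDP's actual API, but the underlying technique, fully sharded data parallelism, is the one PyTorch ships as FullyShardedDataParallel. The sketch below illustrates what FSDP-style sharded training looks like using PyTorch's built-in wrapper; `MyLlamaModel` and `dataloader` are hypothetical placeholders, and YaFSDP's own interface may differ from this baseline.

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Assumes launch via `torchrun --nproc_per_node=<num_gpus> train.py`,
# which sets the environment variables init_process_group reads.
dist.init_process_group(backend="nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = MyLlamaModel()      # hypothetical LLaMA-style model class
model = FSDP(model.cuda())  # shards parameters, gradients, and optimizer state

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for batch in dataloader:    # hypothetical dataloader yielding token batches
    loss = model(**batch).loss  # assumes a HuggingFace-style output object
    loss.backward()             # gradients are reduce-scattered across ranks
    optimizer.step()
    optimizer.zero_grad()

dist.destroy_process_group()
```

YaFSDP's reported gains come from layering communication and memory optimizations on top of this same sharded pattern, which is why the communication-heavy stages named above benefit most.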

Mikhail Khruschev, a senior development expert at Yandex and a member of the YaFSDP team, said: "YaFSDP is best suited for widely used open-source models based on the LLaMA architecture. We are continuously optimizing and expanding its versatility across different model architectures and parameter sizes to enhance training efficiency in broader scenarios."

Estimates suggest that training a 70-billion-parameter model with YaFSDP can free up roughly 150 GPUs, equivalent to saving $500,000 to $1.5 million in computing costs per month. Savings on that scale could put self-trained LLMs within reach of small and medium-sized enterprises and individual developers.
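
To see where a range like that can come from, here is a back-of-the-envelope check. The per-GPU-hour rates are assumptions chosen to bracket the article's figures, not numbers from the announcement.

```python
# Rough check of the quoted monthly savings for ~150 freed-up GPUs.
GPUS_SAVED = 150
HOURS_PER_MONTH = 730  # ~24 * 365 / 12

# Assumed cloud rental rates in $ per GPU-hour (not from the article).
for rate in (4.50, 13.70):
    monthly = GPUS_SAVED * HOURS_PER_MONTH * rate
    print(f"${rate:.2f}/GPU-hour -> ${monthly:,.0f}/month")
# Prints roughly $492,750 and $1,500,150, consistent with the
# article's $500,000 to $1.5 million estimate.
```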

At the same time, Yandex pledges to keep contributing to the global AI community, and open-sourcing YaFSDP is one expression of that commitment. The company has previously shared several well-regarded open-source AI tools, including CatBoost, a high-performance gradient boosting library; AQLM, an extreme compression algorithm for large models; and Petals, a library for running and fine-tuning large models in a distributed fashion.

Industry analysts note that as LLMs continue to grow in scale, training efficiency will become a decisive factor in AI development. Breakthroughs like YaFSDP should help the AI community advance large-model research more quickly and explore applications in fields such as natural language processing and computer vision.