EasyContext
EasyContext demonstrates how to combine existing technologies to train language models with 700K- and 1M-token context lengths.
Common Product, Programming, Language Models, Context Length
EasyContext is an open-source project for training language models with context lengths of up to 1 million tokens on a modest number of GPUs. It relies on sequence parallelism, DeepSpeed ZeRO-3 offloading, Flash Attention, and activation checkpointing. Rather than proposing new techniques, the project shows how to combine existing tools to reach this goal. It has been used to train two models, Llama-2-7B and Llama-2-13B, to 700K- and 1M-token context lengths on 8 and 16 A100 GPUs, respectively.
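To give a feel for why sequence parallelism matters at these lengths, here is a minimal sketch of how one long sequence can be divided among GPUs so that each device holds only a slice of the tokens. The function name `shard_sequence` and the contiguous split are illustrative assumptions, not EasyContext's actual code or sharding scheme, and the token counts treat "700K" and "1M" as round decimal numbers.

```python
from typing import List, Tuple


def shard_sequence(seq_len: int, world_size: int) -> List[Tuple[int, int]]:
    """Split token positions [0, seq_len) into contiguous per-rank ranges.

    Hypothetical helper for illustration only; real sequence-parallel
    implementations may use a different (e.g. load-balanced) layout.
    """
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    if world_size <= 0:
        raise ValueError("world_size must be positive")

    # Every rank gets `base` tokens; the first `extra` ranks get one more.
    base, extra = divmod(seq_len, world_size)
    ranges = []
    start = 0
    for rank in range(world_size):
        length = base + (1 if rank < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


if __name__ == "__main__":
    # Assumed decimal token counts for the two setups named in the text.
    setups = [("Llama-2-7B", 700_000, 8), ("Llama-2-13B", 1_000_000, 16)]
    for name, seq_len, gpus in setups:
        first_start, first_end = shard_sequence(seq_len, gpus)[0]
        print(f"{name}: {first_end - first_start} tokens per GPU on {gpus} GPUs")
```

Under these assumptions each GPU handles tens of thousands of tokens rather than the full sequence, which is the part of the memory problem that sequence parallelism addresses; offloading, Flash Attention, and activation checkpointing reduce the remaining per-device footprint.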