Megatron-LM
Ongoing research on training Transformer models at scale.
Common Product, Productivity, Transformer, Language Model
Megatron-LM is a framework for training large-scale Transformer models, developed by NVIDIA's Applied Deep Learning Research team as part of its ongoing research on training Transformer language models at scale. It combines mixed-precision training with efficient model parallelism and data parallelism to support multi-node pre-training of Transformer models such as GPT, BERT, and T5.
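To make the idea of model parallelism more concrete, here is a minimal sketch of column-wise tensor parallelism for a single linear layer, simulated in one process with plain Python lists. It is not Megatron-LM code: the function names (`matmul`, `split_columns`, `column_parallel_linear`) and the tiny matrices are hypothetical, and the "devices" are just list slices. The only point it illustrates is that splitting a weight matrix by columns, computing each block separately, and concatenating the partial outputs gives the same result as the unsplit multiplication.

```python
# Toy, single-process simulation of column-wise tensor (model) parallelism.
# Illustrative only: every name below is hypothetical, not a Megatron-LM API.


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(weight, num_shards):
    """Split a weight matrix into equal column blocks, one per simulated device."""
    cols = len(weight[0])
    if cols % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    width = cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in weight] for i in range(num_shards)]


def column_parallel_linear(x, weight, num_shards):
    """Compute x @ weight with each simulated device holding one column block."""
    # Each shard computes its slice of the output independently.
    partials = [matmul(x, shard) for shard in split_columns(weight, num_shards)]
    # Concatenating along columns stands in for the gather step across devices.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]

full = matmul(x, w)                                   # unsplit reference result
sharded = column_parallel_linear(x, w, num_shards=2)  # two simulated devices
print(full == sharded)
```

In a real multi-GPU setting each column block would live on a different device and the concatenation would be a collective communication step; this sketch leaves all of that out.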
Megatron-LM Visits Over Time
Monthly Visits: 494,758,773
Bounce Rate: 37.69%
Pages per Visit: 5.7
Visit Duration: 00:06:29