DCLM-7B

A 7 billion parameter language model demonstrating the effectiveness of systematic data curation.

Tags: Premium, New Product, Programming, Language Model, Transformer
DCLM-Baseline-7B is a 7 billion parameter language model developed by the DataComp for Language Models (DCLM) team, trained primarily on English text. The model is intended to show how systematic data curation can improve language model performance. Training was done with PyTorch and the OpenLM framework, using the AdamW optimizer with a learning rate of 2e-3 and a weight decay of 0.05. The batch size was 2048 sequences, the sequence length was 2048 tokens, and training covered a total of 2.5 trillion tokens on H100 GPUs.
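A minimal PyTorch sketch of the training hyperparameters listed above (AdamW, learning rate 2e-3, weight decay 0.05, 2048-token sequences in batches of 2048, 2.5 trillion total tokens). This is only an illustration of those reported values; the model object here is a placeholder, and the actual training run uses the OpenLM framework rather than this code.

```python
# Illustrative sketch of the reported DCLM-Baseline-7B training configuration.
# The model is a placeholder; the real run trains a 7B-parameter transformer with OpenLM.
import torch
from torch.optim import AdamW

SEQ_LEN = 2048            # tokens per sequence (reported)
BATCH_SIZE = 2048         # sequences per global batch (reported)
TOTAL_TOKENS = 2.5e12     # 2.5 trillion training tokens (reported)

# Number of optimizer steps implied by the stated token budget.
steps = int(TOTAL_TOKENS / (SEQ_LEN * BATCH_SIZE))  # roughly 596,000 steps

model = torch.nn.Linear(4096, 4096)  # stand-in for the actual transformer

optimizer = AdamW(
    model.parameters(),
    lr=2e-3,              # reported learning rate
    weight_decay=0.05,    # reported weight decay
)

print(f"Approximate optimizer steps for 2.5T tokens: {steps:,}")
```

Dividing the 2.5 trillion token budget by the tokens processed per step (2048 sequences x 2048 tokens) gives on the order of 600,000 optimizer steps, which is what the sketch computes.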

DCLM-7B Visit Over Time

Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32

Charts: DCLM-7B Visit Trend, Visit Geography, Traffic Sources, Alternatives