The Allen Institute for AI, in collaboration with multiple universities, has released OLMo, billed as the first fully (100%) open-source large language model: its weights, training code, datasets, and complete training process are all public. Evaluations indicate that OLMo-7B performs competitively, outperforming comparable open models on several tasks. The researchers have also released Dolma, the pre-training dataset, to promote open research on language model pre-training, and in support of data transparency they provide tools for data curation and analysis.