OLMo-2-1124-13B-DPO is a 13-billion-parameter large language model that has undergone supervised fine-tuning (SFT) followed by Direct Preference Optimization (DPO). It primarily targets English and aims for strong performance across tasks such as chat and mathematical reasoning, as measured on benchmarks including GSM8K and IFEval. The model is part of the OLMo series, which is designed to enable scientific research on language models. Training builds on the Dolma dataset, and the code, checkpoints, logs, and training details are publicly available.
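
Below is a minimal usage sketch with Hugging Face Transformers, assuming the checkpoint is published on the Hub under the identifier `allenai/OLMo-2-1124-13B-DPO` and that an installed transformers version supports the OLMo 2 architecture; adjust the model ID and generation settings to match the actual release.

```python
# Minimal sketch: load the DPO-trained checkpoint and run one chat turn.
# The Hub identifier below is an assumption; verify it against the official release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-13B-DPO"  # assumed Hub identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 13B parameters; bf16 keeps memory use manageable
    device_map="auto",           # spread layers across available devices
)

# Instruction-tuned checkpoints typically ship a chat template, so build the prompt with it.
messages = [{"role": "user", "content": "Briefly explain what DPO training does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```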