Eurus-2-7B-PRIME

A 7B-parameter language model trained with the PRIME methodology and designed to enhance reasoning capabilities.

Tags: Common Product, Programming, Reinforcement Learning, Reasoning Capability
PRIME-RL/Eurus-2-7B-PRIME is a 7-billion-parameter language model trained with the PRIME methodology, which aims to improve reasoning through online reinforcement learning. It starts from the Eurus-2-7B-SFT model and is fine-tuned on the Eurus-2-RL-Data dataset. PRIME uses an implicit reward mechanism, so training rewards the reasoning process behind a generated output rather than only the final result. Across several reasoning benchmarks, the model improves on its SFT version by an average of 16.7%. Its main advantages are stronger reasoning, lower data and resource requirements, and strong results on mathematical and programming tasks. It suits scenarios that require complex reasoning, such as programming and mathematical problem solving.
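
The implicit reward idea can be illustrated with a minimal sketch. It assumes the formulation commonly given for PRIME, in which each generated token is scored by a scaled log-probability ratio between a reward model and a frozen reference model; this page does not spell that formula out, so treat it as an assumption. The function names, the `beta` value, and the log-probabilities below are hypothetical and chosen only for illustration.

```python
from typing import List


def implicit_token_rewards(
    prm_logprobs: List[float],
    ref_logprobs: List[float],
    beta: float = 0.5,
) -> List[float]:
    """Score each token as beta * (log p_prm(token) - log p_ref(token)).

    Assumed formulation, not taken from this page: a token gets a positive
    reward when the reward model finds it more likely than the reference
    model does, and a negative reward in the opposite case.
    """
    if len(prm_logprobs) != len(ref_logprobs):
        raise ValueError("log-probability sequences must have the same length")
    return [beta * (p - r) for p, r in zip(prm_logprobs, ref_logprobs)]


def sequence_reward(token_rewards: List[float]) -> float:
    """Sum the per-token rewards into one score for the whole response."""
    return sum(token_rewards)


# Hypothetical per-token log-probabilities for a three-token response.
prm_lp = [-0.5, -1.0, -0.25]
ref_lp = [-1.0, -1.0, -0.75]

rewards = implicit_token_rewards(prm_lp, ref_lp, beta=0.5)
total = sequence_reward(rewards)
print(rewards, total)
```

Because every token receives its own score, intermediate reasoning steps contribute to the training signal instead of only the final answer, which is the property the description above attributes to PRIME.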

Eurus-2-7B-PRIME Visit Over Time

Monthly Visits: 20899836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57
