Skywork-Reward-Llama-3.1-8B
An advanced reward model for text classification and preference assessment.
Common Product, Programming, Machine Learning, Natural Language Processing
Skywork-Reward-Llama-3.1-8B is a reward model built on Meta-Llama-3.1-8B-Instruct. It was trained on the Skywork Reward Data Collection, which consists of 80,000 high-quality preference pairs. The model is strongest on challenging preference pairs, where choosing the better response is difficult, across multiple domains including mathematics, programming, and security. As of September 2024, the model ranks third on the RewardBench leaderboard.
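To make "preference pairs" concrete, here is a minimal sketch of how scores from a reward model are typically compared. It assumes the model assigns one scalar score per response and that preferences follow a Bradley-Terry (sigmoid of the score difference) formulation, which is common for reward models but is not stated in this listing. The function names and the example scores are hypothetical and are not outputs of Skywork-Reward-Llama-3.1-8B.

```python
import math


def preference_probability(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that response A is preferred over response B.

    Assumes each response has already been given a scalar reward score.
    Uses a numerically stable sigmoid of the score difference.
    """
    diff = score_a - score_b
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def pairwise_accuracy(pairs: list[tuple[float, float]]) -> float:
    """Fraction of (chosen_score, rejected_score) pairs ranked correctly.

    A pair counts as correct when the human-preferred ("chosen") response
    receives a strictly higher score than the rejected one.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return correct / len(pairs)


# Hypothetical scores for four preference pairs (chosen, rejected).
example_pairs = [(2.1, -0.7), (0.3, 1.2), (4.0, 3.5), (-1.0, -2.5)]

if __name__ == "__main__":
    print(pairwise_accuracy(example_pairs))
    print(preference_probability(2.1, -0.7))
```

Pairwise accuracy of this kind is the usual way to evaluate a reward model on held-out preference data: a higher value means the model agrees with the labeled preference more often.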
Skywork-Reward-Llama-3.1-8B Visits Over Time
Monthly Visits: 21,315,886
Bounce Rate: 45.50%
Pages per Visit: 5.2
Visit Duration: 00:05:02