Skywork-Reward-Gemma-2-27B is a reward model built on the Gemma-2-27B architecture and designed for preference modeling in complex scenarios. It was trained on 80K high-quality preference pairs drawn from several domains, including mathematics, programming, and safety. The model ranked first on the RewardBench leaderboard in September 2024.
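To make the idea of preference pairs concrete, here is a minimal sketch of how scalar reward scores for two candidate responses can be compared. It does not call the model itself: the scores are hypothetical placeholders, the function names are invented for illustration, and the Bradley-Terry formula is a common choice for pairwise preference modeling that is assumed here rather than confirmed by the description above.

```python
import math
from typing import Mapping


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry probability that response A is preferred over B.

    Assumption: preferences follow sigmoid(reward_a - reward_b).
    Written in a numerically stable form to avoid overflow.
    """
    diff = reward_a - reward_b
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    z = math.exp(diff)
    return z / (1.0 + z)


def pick_preferred(scores: Mapping[str, float]) -> str:
    """Return the key of the response with the highest reward score."""
    return max(scores, key=lambda name: scores[name])


# Hypothetical reward scores for two candidate responses to one prompt.
scores = {"response_a": 2.0, "response_b": -1.0}

best = pick_preferred(scores)
p_a_over_b = preference_probability(scores["response_a"], scores["response_b"])

print(f"preferred: {best}")
print(f"P(response_a preferred over response_b) = {p_a_over_b:.3f}")
```

In practice the scores would come from running the reward model on each prompt and response pair; only the difference between the two scores matters under this assumed formulation.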