Light-R1-14B-DS
An open-source 14B-parameter mathematical reasoning model trained with reinforcement learning, scoring 74.0 on AIME24 and 60.2 on AIME25.
Common Product, Productivity, Reinforcement Learning, Mathematical Model
Light-R1-14B-DS is an open-source mathematical model developed by Qihoo 360 Technology Co., Ltd. It was trained with reinforcement learning on top of DeepSeek-R1-Distill-Qwen-14B and scores 74.0 on AIME24 and 60.2 on AIME25, two mathematics competition benchmarks, surpassing many 32B-parameter models. The work shows that reinforcement learning can be applied successfully, under a lightweight budget, to a model that has already been fine-tuned for long-chain reasoning. Because the model is open source, researchers and developers can use it as both a research foundation and a practical tool, particularly for mathematical problem-solving in education.
Light-R1-14B-DS Visit Over Time
Monthly Visits: 29,742,941
Bounce Rate: 44.20%
Pages per Visit: 5.9
Visit Duration: 00:04:44