AIbase

GRPO

Public

Training a 1-billion-parameter LLaMa 3.2 model with the GRPO algorithm (to be deployed once training is complete!)

Created: 2025-03-07T18:44:06
Updated: 2025-04-09T04:14:29
Stars: 1
Stars Increase: 0
