
DPPO

Public

Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)

Created: 2023-10-08T21:41:43
Updated: 2025-01-06T10:47:38
Stars: 42
Stars increase: 0

Related projects