oat
OAT: A research-friendly framework for LLM online alignment, including preference learning and reinforcement learning.