LMRax
LMRax is a framework built on JAX for training transformer language models with reinforcement learning, as well as for training the reward models used in that process.