rlhf-trl

Public

Reinforcement Learning from Human Feedback with TRL

human-feedback reinforcment-learning rlhf

Created: 2023-06-10T23:16:02

Updated: 2025-03-23T22:12:20

Related projects

LLaMA Factory

agent

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

47621

1 month ago

+4 today

Open Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.

37324

1 month ago

+1 today

LLMSurvey

chain-of-thought

The official GitHub page for the survey paper "A Survey of Large Language Models".

11409

1 month ago

PaLM Rlhf Pytorch

artificial-intelligence

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

7791

1 month ago

Chinese LLaMA Alpaca 2

64k

Phase two of the Chinese LLaMA-2 & Alpaca-2 LLM project, with 64K long-context models

7158

1 month ago

-2 today

InternLM

chatbot

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

6881

1 month ago

+3 today

Alignment Handbook

llm

Robust recipes to align language models with human and AI preferences

5144

1 month ago

+1 today

Awesome RLHF

deep-learning

A curated list of reinforcement learning with human feedback resources (continually updated)

3902

1 month ago

ChatGLM Efficient Tuning

alpaca

Efficient fine-tuning of ChatGLM-6B with PEFT

3699

1 month ago

Align Anything

chameleon

Align Anything: Training All-modality Model with Feedback

3484

1 month ago
