MOSS-RLHF

Public

Secrets of RLHF in Large Language Models Part I: PPO

ai-safety alignment rlhf

Created: 2023-07-05T22:11:03

Updated: 2025-03-25T20:14:54

1.4K

Stars


Related projects

LLaMA Factory

Hot

agent

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

47521

4 weeks ago

+109 today

Pandas

alignment

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

45229

4 weeks ago

+16 today

Open Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

37319

4 weeks ago

+4 today

LLMSurvey

chain-of-thought

The official GitHub page for the survey paper "A Survey of Large Language Models".

11401

1 month ago

+8 today

Chinese LLaMA Alpaca 2

64k

Phase II of the Chinese LLaMA-2 & Alpaca-2 LLM project, plus 64K long-context models

7160

1 month ago

+1 today

InternLM

chatbot

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

6878

4 weeks ago

+3 today

Alignment Handbook

llm

Robust recipes to align language models with human and AI preferences

5143

4 weeks ago

+3 today

Awesome RLHF

deep-learning

A curated list of reinforcement learning with human feedback resources (continually updated)

3900

4 weeks ago

+4 today

ChatGLM Efficient Tuning

alpaca

Efficient fine-tuning of ChatGLM-6B with PEFT

3699

1 month ago

Align Anything

chameleon

Align Anything: Training All-modality Model with Feedback

3469

4 weeks ago

+14 today
