ReFT

ReFT enhances the reasoning ability of LLMs

ReFT is a simple yet effective method for enhancing the reasoning capabilities of large language models (LLMs). It first warms up the model with supervised fine-tuning (SFT), then fine-tunes it further with online reinforcement learning, specifically the PPO algorithm as applied in the ReFT paper. During this stage, ReFT automatically samples a large number of reasoning paths for each problem and derives rewards directly from the ground-truth answers, which lets it significantly outperform SFT. Its performance can be improved further by combining it with inference-time strategies such as majority voting and re-ranking. Notably, ReFT achieves these gains by learning from the same training questions as SFT, without relying on additional or augmented training questions, which indicates stronger generalization.
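The two ideas in the description, deriving a reward from the true answer and majority voting over sampled reasoning paths, can be illustrated with a minimal sketch. This is not the ReFT implementation: the helper names (`extract_answer`, `answer_reward`, `majority_vote`), the "last number in the text is the answer" convention, and the binary 0/1 reward are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import List, Optional


def extract_answer(reasoning_path: str) -> Optional[float]:
    """Return the last number in a sampled reasoning path, or None.

    Assumption: the final answer is the last number that appears in the text.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", reasoning_path)
    return float(numbers[-1]) if numbers else None


def answer_reward(reasoning_path: str, true_answer: float) -> float:
    """Binary reward (an assumption): 1.0 if the extracted answer matches, else 0.0."""
    predicted = extract_answer(reasoning_path)
    if predicted is None:
        return 0.0
    return 1.0 if predicted == true_answer else 0.0


def majority_vote(reasoning_paths: List[str]) -> Optional[float]:
    """Return the most common extracted answer across sampled paths."""
    answers = [a for a in map(extract_answer, reasoning_paths) if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hypothetical sampled reasoning paths for the question "What is 3 + 5?"
sampled_paths = ["3 + 5 = 8", "3 + 5 = 9", "5 + 3 = 8", "I am not sure"]
rewards = [answer_reward(path, 8.0) for path in sampled_paths]
voted_answer = majority_vote(sampled_paths)
```

In a reinforcement learning loop, rewards like these would be fed to the policy update (PPO, per the description), while majority voting would be applied only at inference time over several sampled outputs.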

ReFT Alternatives

Expert Specialized Fine-Tuning — A professional fine-tuning tool for customizing large language models.

Fine Tuner AI — No-code Fine-Tuning for Optimizing AI Performance

diffusion-e2e-ft — Fine-tuning tool for image-conditioned diffusion models

mistral-finetune — Lightweight codebase for efficient fine-tuning of the Mistral model.

llm-datasets — High-quality datasets, tools, and concepts for fine-tuning large language models.

Finetune — A platform for fine-tuning AI agents.

prompteasy.ai — AI model fine-tuning, personalized customization.

mwp_ReFT — A deep reinforcement learning-based model fine-tuning framework

Astraios — Parameter-efficient Fine-tuning for Large Language Models

SFR-Judge — An intelligent evaluation tool that accelerates model assessment and fine-tuning.

In-Context LoRA for Diffusion Transformers — A context-based LoRA fine-tuning technique for diffusion transformers

Tülu 3 — Open-source advanced language model fine-tuning framework

Orthogonal Finetuning (OFT) — OFT effectively stabilizes text-to-image diffusion models during fine-tuning

Trudo AI — A no-code platform for fine-tuning OpenAI GPT-3 models.

XiangJi Translate — AI Short Video Translation Launch, Multi-language Fine-tuning Tool

Bakery — An open-source platform for AI model fine-tuning and monetization, empowering AI startups, machine learning engineers, and researchers.

lmms-finetune — A unified codebase for fine-tuning large multimodal models.

RAG-FiT — RAG-FiT is a library designed to enhance LLMs' capability to utilize external information by fine-tuning models with specifically created RAG-enhanced datasets.

AIKit — A one-stop solution for hosting, deploying, building, and fine-tuning open-source large language models.

XTuner — A high-efficiency and flexible toolkit for large-scale model fine-tuning.

Cola — Large language models are visual reasoning coordinators.

MAmmoTH-VL — A Large-Scale Multimodal Reasoning and Instruction Tuning Platform

InternThinker — A strong reasoning AI model developed by the Shanghai Artificial Intelligence Laboratory.

Phi-3.5-mini-instruct — A lightweight, multilingual advanced text generation model

Physical Intelligence — Bringing General Artificial Intelligence to the Physical World

ARC-AGI — Artificial Intelligence General Reasoning Test Dataset

Bespoke Labs — Customized data services to facilitate precise model tuning.

bilibot — A local chatbot fine-tuned on Bilibili user comments

ASPIRE — A framework to enhance the selective prediction capability of large language models