Eureka

A human-level reward design algorithm powered by coding large language models.

Common Product, Programming, Reward Design, Reinforcement Learning

Eureka is a human-level reward design algorithm powered by coding large language models. It uses the zero-shot generation, code-writing, and in-context improvement capabilities of state-of-the-art language models (such as GPT-4) to perform evolutionary optimization over reward code. The resulting rewards can then be used to acquire complex skills through reinforcement learning. Across a suite of 29 open-source reinforcement learning environments covering 10 distinct robot morphologies, Eureka-generated reward functions outperform those written by human experts on 83% of the tasks. Eureka can also refine reward functions in context, without updating the model, to improve the quality and safety of the generated rewards. Finally, by combining Eureka rewards with curriculum learning, the authors demonstrate for the first time a simulated Shadow Hand performing pen-spinning tricks, adeptly rotating a pen in circles at high speed.
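The propose, evaluate, and keep-the-best loop described above can be sketched roughly as follows. This is a minimal illustration and not Eureka's actual implementation: `evolve_reward`, `make_toy_proposer`, `toy_evaluate`, and `load_reward` are hypothetical names, the proposer is a random stand-in for a language model call, and the evaluator is a toy score standing in for a full reinforcement learning training run.

```python
import random
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions, not Eureka's API):
#   propose(best_code_so_far, feedback) -> new reward source code
#   evaluate(reward_source_code)        -> task score (higher is better)
Propose = Callable[[Optional[str], str], str]
Evaluate = Callable[[str], float]


def evolve_reward(propose: Propose, evaluate: Evaluate,
                  iterations: int = 5, samples: int = 4) -> Tuple[str, float]:
    """Sample several reward candidates per round and keep the best one."""
    best_code: Optional[str] = None
    best_score = float("-inf")
    feedback = "no candidates evaluated yet"
    for _ in range(iterations):
        # Each round, candidates are proposed from the current best code
        # plus a textual summary of how it performed.
        candidates = [propose(best_code, feedback) for _ in range(samples)]
        for code in candidates:
            score = evaluate(code)
            if score > best_score:
                best_code, best_score = code, score
        feedback = f"best score so far: {best_score:.3f}"
    assert best_code is not None
    return best_code, best_score


def load_reward(code: str) -> Callable[[float], float]:
    """Execute reward source code and return its `reward` function."""
    namespace: dict = {}
    exec(code, namespace)
    return namespace["reward"]


def make_toy_proposer(seed: int = 0) -> Propose:
    """Random stand-in for the language model: perturbs one coefficient."""
    rng = random.Random(seed)

    def propose(best_code: Optional[str], feedback: str) -> str:
        if best_code is None:
            weight = rng.uniform(0.0, 4.0)
        else:
            # Recover the current coefficient, then mutate it slightly.
            weight = -load_reward(best_code)(1.0) + rng.gauss(0.0, 0.5)
        return f"def reward(dist):\n    return -({weight!r}) * dist\n"

    return propose


def toy_evaluate(code: str) -> float:
    """Toy stand-in for RL training: best score when the coefficient is 2.0."""
    weight = -load_reward(code)(1.0)
    return -abs(weight - 2.0)


if __name__ == "__main__":
    code, score = evolve_reward(make_toy_proposer(0), toy_evaluate)
    print(code)
    print("score:", score)
```

In a real setting, `propose` would prompt a language model with the environment description and the feedback string, and `evaluate` would train a policy with the candidate reward and report a task success metric. Both are far more expensive than the toy versions here.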

Eureka Visit Over Time

Monthly Visits: 3827
Bounce Rate: 59.62%
Pages per Visit: 1.3
Visit Duration: 00:00:11

Eureka Alternatives

Self-Rewarding Language Models — Language Model Self-Reward Training

Nemotron-4-340B-Reward — A multi-dimensional reward model to help build custom large language models.

PRIME-RL — PRIME enhances the reasoning abilities of language models through implicit reward-driven online reinforcement learning.

d1 — Improving the reasoning capabilities of diffusion large language models using reinforcement learning.

Language Learning Games — AI text adventure games for language learning

DiffusionRL — Large-scale Reinforcement Learning for Diffusion Models

WARM — Enhances the efficiency and reliability of large language models (LLMs) through weighted average reward modeling.

Search-R1 — A highly efficient reinforcement learning framework for training language models that perform reasoning and call search engines.

DIAMOND — A reinforcement learning agent trained in a diffusion world model

Language Atlas — Free language learning

Models Table — A comprehensive list and information about large language models

RL4VLM — An open-source project that fine-tunes large vision-language models via reinforcement learning to act as decision-making agents.

SWE-RL — Enhancing the reasoning capabilities of large language models in open-source software evolution through reinforcement learning.

Language REACTOR — A powerful language learning toolkit

SERL — SERL is an efficient robot reinforcement learning software suite

Phi Open Models — Phi Open Models are powerful, cost-effective, low-latency small language models.

Tülu 3 405B — Tülu 3 405B is a large-scale open-source language model enhanced through reinforcement learning.

Unitree RL GYM — Unitree robot platform for reinforcement learning

EurusPRM-Stage1 — EurusPRM-Stage1 is a reinforcement learning model based on implicit process rewards, aimed at enhancing the reasoning abilities of generative models.

agibot_x1_train — Modular humanoid robot for reinforcement learning training

RLVR-GSM-MATH-IF-Mixed-Constraints — A dataset of math and constrained instruction-following problems for reinforcement learning with verifiable rewards.

JaxMARL — A multi-agent reinforcement learning library

HelpSteer2 — An open-source dataset designed for training high-performance reward models.

EAGLE — Exploration of the design space for multimodal large language models

mwp_ReFT — A deep reinforcement learning-based model fine-tuning framework

VLM-R1 — VLM-R1 is a stable and versatile reinforcement learning-enhanced visual-language model focused on visual understanding tasks.

Large World Models — Large World Models: Understanding Video and Language

EurusPRM-Stage2 — EurusPRM-Stage2 is a reinforcement learning model based on implicit process rewards aimed at enhancing the reasoning capabilities of generative models.

DigiRL — Train in-the-wild device-control agents using autonomous reinforcement learning