Nemotron-4-340B-Reward

A multi-dimensional reward model to help build custom large language models.

Common Product, Programming, Large Language Model, Synthetic Data Generation
Nemotron-4-340B-Reward, developed by NVIDIA, is a multi-dimensional reward model used in synthetic data generation pipelines to help researchers and developers build their own LLMs. It consists of the Nemotron-4-340B-Base model and a linear layer that converts the final-layer representation of the end-of-response token into five scalar values, each corresponding to a HelpSteer2 attribute: helpfulness, correctness, coherence, complexity, and verbosity. It supports a maximum context length of 4,096 tokens and, in a multi-turn conversation, scores all five attributes for each assistant turn.
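As a rough illustration of the "base model plus linear layer" design described above, the sketch below maps the hidden vector at the final (end-of-response) position to five scalar scores. It is not NVIDIA's implementation: the hidden size, the random weights, the bias values, and the helper names `make_linear_head` and `score_response` are all assumptions made for illustration, and the toy `hidden_states` list stands in for real base-model output.

```python
import random

# The five HelpSteer2 attributes scored by the reward model.
ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]


def make_linear_head(hidden_size, seed=0):
    """Build toy parameters for a (hidden_size -> 5) linear layer.

    Random weights and constant biases stand in for trained parameters
    (an assumption for this sketch, not the model's real values).
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
               for _ in ATTRIBUTES]
    biases = [2.0 for _ in ATTRIBUTES]
    return weights, biases


def score_response(hidden_states, weights, biases):
    """Project the last position's hidden vector onto five scalar scores."""
    last = hidden_states[-1]  # representation at the end-of-response position
    scores = {}
    for name, row, bias in zip(ATTRIBUTES, weights, biases):
        scores[name] = sum(w * h for w, h in zip(row, last)) + bias
    return scores


# Toy stand-in for base-model output: 3 token positions, hidden size 8.
hidden_states = [[0.1 * (i + j) for j in range(8)] for i in range(3)]
weights, biases = make_linear_head(hidden_size=8)
scores = score_response(hidden_states, weights, biases)

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In a synthetic data pipeline, a dictionary like `scores` would typically be computed once per assistant turn and then used to filter or rank candidate responses; how the five values are combined is left to the pipeline designer.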

Nemotron-4-340B-Reward Visit Over Time

Monthly Visits: 18,200,568
Bounce Rate: 44.11%
Pages per Visit: 5.8
Visit Duration: 00:05:46
