Nemotron-4-340B-Reward

A multi-dimensional reward model to help build custom large language models.

Tags: Common Product, Programming, Large Language Model, Synthetic Data Generation

Nemotron-4-340B-Reward, developed by NVIDIA, is a multi-dimensional reward model for synthetic data generation pipelines, intended to help researchers and developers build their own LLMs. It consists of the Nemotron-4-340B-Base model and a linear layer that converts the final-layer representation of the end-of-response token into five scalar values, one per HelpSteer2 attribute (helpfulness, correctness, coherence, complexity, and verbosity). It supports a context of up to 4,096 tokens and scores all five attributes for each assistant turn.
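
To make the scoring step concrete, here is a minimal sketch of a linear reward head in plain Python. It is an illustration under stated assumptions, not NVIDIA's implementation: the function names `reward_head` and `score_turn`, the toy weights, and the order of the attributes in the output are all hypothetical, and the real model works on hidden states far larger than the two-element vectors used in the usage note.

```python
from typing import Dict, Sequence

# HelpSteer2 attribute names. The order used here is an assumption.
ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]
MAX_CONTEXT_TOKENS = 4096  # maximum context length stated in the description


def reward_head(hidden: Sequence[float],
                weights: Sequence[Sequence[float]],
                bias: Sequence[float]) -> Dict[str, float]:
    """Apply a linear layer (one weight row per attribute) to one hidden vector."""
    if len(weights) != len(ATTRIBUTES) or len(bias) != len(ATTRIBUTES):
        raise ValueError("expected one weight row and one bias per attribute")
    scores: Dict[str, float] = {}
    for name, row, b in zip(ATTRIBUTES, weights, bias):
        if len(row) != len(hidden):
            raise ValueError("weight row and hidden vector differ in size")
        # Dot product of the weight row with the hidden vector, plus the bias.
        scores[name] = sum(w * h for w, h in zip(row, hidden)) + b
    return scores


def score_turn(token_hidden_states: Sequence[Sequence[float]],
               weights: Sequence[Sequence[float]],
               bias: Sequence[float]) -> Dict[str, float]:
    """Score one assistant turn from the hidden state of its last token."""
    if not token_hidden_states:
        raise ValueError("no tokens to score")
    if len(token_hidden_states) > MAX_CONTEXT_TOKENS:
        raise ValueError("input exceeds the 4096-token context limit")
    # Only the final (end-of-response) position feeds the linear layer.
    return reward_head(token_hidden_states[-1], weights, bias)
```

For example, with a toy hidden size of 2, `score_turn([[9, 9], [2, 3]], [[1, 0], [0, 1], [1, 1], [0, 0], [2, -1]], [0, 0, 0, 0.5, 0])` ignores the first position and returns one score per attribute computed from `[2, 3]`.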

Nemotron-4-340B-Reward Visit Over Time

Monthly Visits: 17,788,201
Bounce Rate: 44.87%
Pages per Visit: 5.4
Visit Duration: 00:05:32
