Llama3-70B-SteerLM-RM is a 70-billion-parameter language model that serves as a property prediction model, specifically a multi-faceted reward model. Unlike traditional reward models, which produce a single score, it rates each model response along several dimensions. The model was trained on the HelpSteer2 dataset with NVIDIA NeMo-Aligner, a scalable toolkit for efficient model alignment.
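
To illustrate what a multi-dimensional reward looks like in practice, here is a minimal Python sketch of how per-attribute scores might be collapsed into one scalar when a downstream step needs it. It does not call the model. The five attribute names are an assumption taken from the public HelpSteer2 annotation schema rather than from this description, the `AttributeScores` class and `aggregate` helper are hypothetical, and the numeric values are made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AttributeScores:
    """Hypothetical container for one response's per-attribute ratings.

    Attribute names are assumed from the HelpSteer2 annotation schema.
    """
    helpfulness: float
    correctness: float
    coherence: float
    complexity: float
    verbosity: float


def aggregate(scores: AttributeScores, weights: dict[str, float]) -> float:
    """Weighted sum over attributes; attributes missing from `weights` count as 0."""
    return sum(
        weights.get(f.name, 0.0) * getattr(scores, f.name)
        for f in fields(scores)
    )


# Illustrative values only, not real model output.
scores = AttributeScores(
    helpfulness=3.5, correctness=3.0, coherence=4.0, complexity=1.5, verbosity=2.0
)

# A single-score view keeps just one dimension.
helpfulness_only = aggregate(scores, {"helpfulness": 1.0})

# A blended view weights several dimensions.
blended = aggregate(scores, {"helpfulness": 0.5, "correctness": 0.5})
```

Keeping the attributes separate until the last step means the weighting can change per use case without re-scoring any response.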