Google DeepMind's New Method GenRM Significantly Enhances AI Reasoning Capabilities and Boosts Accuracy

AIbase

Published in AI News · 5 min read · Sep 3, 2024

A research team at Google DeepMind, working with several universities, has proposed the Generative Reward Model (GenRM), a method aimed at improving the accuracy and reliability of generative AI on reasoning tasks.

Generative AI is widely used in fields such as natural language processing, where models produce coherent text mainly by predicting the next word in a sequence. However, these models sometimes state incorrect information with confidence, which is a serious problem in high-stakes fields such as education, finance, and healthcare.

Researchers have tried several approaches to this accuracy problem. Discriminative Reward Models (RMs) judge candidate answers by assigning each one a correctness score, but this approach does not fully use the generative capabilities of large language models (LLMs). Another common method, "LLM as a Judge," often performs worse than specialized verifiers on complex reasoning tasks.

The innovation of GenRM lies in redefining the verification process as a next-word prediction task. Unlike traditional discriminative reward models, GenRM integrates the text generation capabilities of LLMs into the verification process, allowing the model to both generate and evaluate potential solutions. Additionally, GenRM supports Chain of Thought (CoT), enabling the model to generate intermediate reasoning steps before reaching a final conclusion, making the verification process more comprehensive and systematic.
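As a rough illustration of what "verification as next-word prediction" can look like, the sketch below scores a candidate solution by the probability a model assigns to the word "Yes" after a correctness question. The prompt wording, the helper names (`build_verification_prompt`, `yes_probability`), and the toy logits are assumptions made for illustration; they are not taken from the GenRM work.

```python
import math


def softmax(logits):
    """Convert a dict of token -> logit into token -> probability."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: val / total for tok, val in exps.items()}


def build_verification_prompt(question, solution):
    """Hypothetical prompt: the verifier is asked to continue with Yes or No."""
    return (
        f"Question: {question}\n"
        f"Proposed solution: {solution}\n"
        "Is the solution correct (Yes/No)? "
    )


def yes_probability(next_token_logits):
    """Verifier score = probability placed on the 'Yes' token."""
    return softmax(next_token_logits)["Yes"]


# Made-up next-token logits standing in for a real LLM's output.
prompt = build_verification_prompt("What is 12 * 7?", "12 * 7 = 84")
toy_logits = {"Yes": 2.0, "No": 0.0, "Maybe": -1.0}
score = yes_probability(toy_logits)
print(prompt + f"-> score {score:.3f}")
```

In a real system the logits would come from the language model itself, so the same network that writes text also produces the verification score.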

GenRM trains generation and verification together under a unified strategy, so the model improves at both during training. In practical applications, the model generates intermediate reasoning steps and uses them to validate the final answer.
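One way to picture reasoning-based verification is the sketch below: each verification rationale ends with a Yes/No verdict, and the verdicts from several rationales are averaged into one score. Averaging over multiple rationales, the verdict format, and the hard-coded rationale strings are all assumptions for illustration; in practice the rationales would be sampled from the model.

```python
def verdict_from_rationale(rationale):
    """Return 1.0 if the rationale's last line ends in 'Yes', else 0.0."""
    last_line = rationale.strip().splitlines()[-1].lower().rstrip(" .")
    return 1.0 if last_line.endswith("yes") else 0.0


def cot_verification_score(rationales):
    """Average the Yes/No verdicts across several reasoning chains."""
    verdicts = [verdict_from_rationale(r) for r in rationales]
    return sum(verdicts) / len(verdicts)


# Hard-coded stand-ins for rationales a model might generate (hypothetical).
sampled_rationales = [
    "Step 1: 12 * 7 = 84, which matches the solution.\nIs the answer correct? Yes",
    "Step 1: 10 * 7 = 70 and 2 * 7 = 14, so 84.\nIs the answer correct? Yes",
    "Step 1: 12 * 7 = 74, so the solution is off.\nIs the answer correct? No",
]
cot_score = cot_verification_score(sampled_rationales)
print(f"share of rationales that accept the solution: {cot_score:.2f}")
```

Using several rationales instead of one means a single faulty chain of reasoning, like the third one here, lowers the score without deciding the outcome alone.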

Researchers found that GenRM performs well in rigorous tests, including grade-school mathematics and algorithmic problem-solving tasks. Compared with discriminative reward models and LLM as a Judge methods, GenRM raises the problem-solving success rate by 16% to 64%.

For instance, in verifying the output of the Gemini 1.0 Pro model, GenRM increased the problem-solving success rate from 73% to 92.8%.
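A success rate of this kind can be measured by letting a verifier pick one answer from several candidates per problem and counting how often the pick is right. The sketch below assumes such a Best-of-N setup; the problem data, the scores, and the function names are invented for illustration and do not reproduce the reported results.

```python
def best_of_n(candidates, score_fn):
    """Return the candidate that the verifier scores highest."""
    return max(candidates, key=score_fn)


def success_rate(problems, score_fn):
    """Fraction of problems where the top-scored candidate is correct."""
    solved = sum(
        1
        for problem in problems
        if best_of_n(problem["candidates"], score_fn)["answer"] == problem["reference"]
    )
    return solved / len(problems)


# Toy problems with hard-coded verifier scores (hypothetical numbers).
toy_problems = [
    {
        "reference": "84",
        "candidates": [
            {"answer": "74", "score": 0.20},
            {"answer": "84", "score": 0.91},
        ],
    },
    {
        "reference": "15",
        "candidates": [
            {"answer": "15", "score": 0.40},
            {"answer": "51", "score": 0.65},
        ],
    },
]


def stored_score(candidate):
    """Read the precomputed verifier score attached to a candidate."""
    return candidate["score"]


rate = success_rate(toy_problems, stored_score)
print(f"success rate with verifier-based selection: {rate:.0%}")
```

The second toy problem shows why verifier quality matters: the correct answer is among the candidates, but a misjudged score causes the wrong one to be selected.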

GenRM marks a significant advance in generative AI: by unifying solution generation and verification in a single process, it improves the accuracy and trustworthiness of AI-generated solutions.

Key Points:
1. 🌟 GenRM enhances generative AI's reasoning capabilities by redefining the verification process as a next-word prediction task.
2. 📈 GenRM outperforms traditional methods in multiple tests, raising problem-solving success rates by 16% to 64%.
3. 🧠 The method integrates generation and verification, enhancing the potential of AI applications in high-risk fields.

This article is from AIbase Daily



