In the continuing advance of artificial intelligence, diffusion models are demonstrating increasingly strong reasoning capabilities, no longer merely trailing behind autoregressive models. Researchers from UCLA and Meta have introduced d1, a framework that combines supervised fine-tuning (SFT) with reinforcement learning (RL) to substantially improve the reasoning abilities of diffusion models, spanning both mathematical and logical reasoning.


The d1 framework uses a two-stage post-training strategy to improve the performance of masked diffusion large language models (dLLMs). In the first stage, the model is supervised fine-tuned on high-quality reasoning trajectories, acquiring foundational knowledge and logical reasoning skills. In the second stage, the researchers apply diffu-GRPO, a novel policy gradient method tailored to masked dLLMs, to further sharpen the model's reasoning.
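diffu-GRPO adapts the GRPO family of policy gradient methods to masked dLLMs. A core ingredient of GRPO-style training is a critic-free advantage: several completions are sampled per prompt, and each completion's reward is normalized against its own group. A minimal sketch of that computation (the function name and epsilon value are illustrative, not from the paper):

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each sampled completion's reward
    by the mean and standard deviation of its own group, so no learned
    value function (critic) is needed."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)  # epsilon avoids divide-by-zero

# Example: four completions for one prompt, rewarded 1 for a correct
# final answer and 0 otherwise.
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

Completions scoring above the group mean receive positive advantages and are reinforced; those below are suppressed. The actual diffu-GRPO objective adds further machinery specific to masked diffusion decoding.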

Compared with prior work, d1 targets the challenges of reinforcement learning post-training for diffusion models. Autoregressive models can optimize directly on the log probability of a generated sequence, but for dLLMs that quantity is expensive to compute because generation proceeds through many iterative denoising steps. To overcome this, the research team developed an efficient log-probability estimator that scores each token independently, drastically cutting computation time and improving training efficiency.

In experiments, the researchers used LLaDA-8B-Instruct as the base model and compared d1-LLaDA against variants trained with SFT alone or diffu-GRPO alone. Across a range of mathematical and logical reasoning benchmarks, d1-LLaDA significantly outperformed both the base model and the single-method variants, indicating that the two training stages are complementary rather than redundant.

With the introduction of the d1 framework, the performance of diffusion models in reasoning tasks is poised for a significant leap, opening up vast avenues for future research. The researchers believe this innovative framework will propel the further development of language models, facilitating more complex reasoning and logical tasks.

Project Address: https://dllm-reasoning.github.io/