Reinforcement learning has achieved many successes in recent years, but its low sample efficiency limits its application in the real world. World models, generative models of an environment's dynamics, offer a promising way to address this: they can serve as simulated environments in which reinforcement learning agents are trained with far fewer real interactions.
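To make the idea concrete, here is a minimal sketch of "training in imagination," where the learned world model stands in for the real environment. The `WorldModel`/`Agent` interfaces are hypothetical placeholders, not any particular system's API:

```python
# Minimal sketch: the learned world model replaces the real environment
# during policy updates. All names (world_model, agent, their methods)
# are hypothetical placeholders.

def imagination_rollout(world_model, agent, start_obs, horizon=15):
    """Roll out a trajectory entirely inside the world model."""
    obs, trajectory = start_obs, []
    for _ in range(horizon):
        action = agent.act(obs)                            # policy picks an action
        obs, reward, done = world_model.step(obs, action)  # model predicts the outcome
        trajectory.append((obs, action, reward, done))
        if done:
            break
    return trajectory
```

The agent's policy is then updated on these imagined trajectories, so each real environment step can be reused for many updates, which is where the sample-efficiency gain comes from.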

Currently, most world models simulate environmental dynamics over sequences of discrete latent variables. However, compressing observations into a compact discrete representation can discard visual details that are crucial for reinforcement learning.
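As an illustration of what this compression looks like, here is a minimal sketch of a categorical latent bottleneck in the style of the Dreamer line of work (32 variables with 32 classes each is Dreamer's configuration; the encoder producing the logits is omitted):

```python
import torch
import torch.nn.functional as F

# Sketch of a discrete latent bottleneck: continuous encoder features are
# quantized into one-hot categorical codes with a straight-through gradient.

def to_categorical_latent(encoder_logits):
    # encoder_logits: (batch, num_vars, num_classes), e.g., 32 x 32
    probs = F.softmax(encoder_logits, dim=-1)
    one_hot = F.one_hot(probs.argmax(-1), probs.shape[-1]).float()
    # Straight-through estimator: hard codes on the forward pass,
    # soft-probability gradients on the backward pass.
    return one_hot + probs - probs.detach()
```

A 32x32 categorical latent carries at most 32 * log2(32) = 160 bits per frame, so fine visual details (small sprites, projectiles) can be lost in the compression.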

Meanwhile, diffusion models have become the dominant approach in image generation, challenging traditional discrete latent variable modeling. Inspired by this, researchers proposed DIAMOND (DIffusion As a Model Of eNvironment Dreams), a reinforcement learning agent trained inside a diffusion world model. DIAMOND makes careful design choices to keep the diffusion model efficient and stable over long time horizons.
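Conceptually, such a world model generates each next frame by iterating a denoiser that is conditioned on recent frames and actions. The sketch below shows a few Euler denoising steps under an illustrative noise schedule; `denoiser` and its signature are assumptions for illustration, not the paper's exact interface:

```python
import torch

# Sketch of next-frame prediction with a conditional diffusion model:
# start from pure noise and take a few denoising steps, each conditioned
# on the recent frames and actions. The sigma schedule is illustrative.

@torch.no_grad()
def sample_next_frame(denoiser, past_frames, past_actions,
                      sigmas=(5.0, 2.0, 0.5, 0.0), shape=(1, 3, 64, 64)):
    x = torch.randn(shape) * sigmas[0]            # start from pure noise
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = denoiser(x, sigma, past_frames, past_actions)
        d = (x - denoised) / sigma                # Euler step direction
        x = x + d * (sigma_next - sigma)          # step toward lower noise
    return x                                      # predicted next frame
```

The conditioning on past frames and actions is what turns a plain image generator into a model of environment dynamics.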


DIAMOND achieves a mean human-normalized score of 1.46 on the well-established Atari 100k benchmark, the best result to date for agents trained entirely within a world model. Because the model operates directly in image space, the diffusion world model can stand in for the environment as-is, making it easier to inspect and understand the behavior of both the world model and the agent. The researchers found that the performance gains on certain games stem from better modeling of critical visual details.
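For reference, the human-normalized score rescales raw game scores so that 0 corresponds to random play and 1 to a human reference player. The numbers in the example below are made up for illustration, not the paper's per-game results:

```python
# Human-normalized score (HNS) as used on the Atari 100k benchmark.

def human_normalized_score(agent, random, human):
    return (agent - random) / (human - random)

# e.g., an agent scoring 900 where random play gets 100 and a human gets 600:
# (900 - 100) / (600 - 100) = 1.6, i.e., above human level on that game.
```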

The success of DIAMOND is largely attributed to its choice of the EDM framework (from "Elucidating the Design Space of Diffusion-Based Generative Models", Karras et al., 2022). Compared to the traditional DDPM (Denoising Diffusion Probabilistic Models) formulation, EDM remains stable with far fewer denoising steps, avoiding the severe compounding errors that would otherwise accumulate over long autoregressive rollouts.
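The key ingredient is EDM's network preconditioning: the denoiser is a noise-dependent mix of a skip connection and the raw network output. A sketch following the coefficients from Karras et al. (2022), with sigma_data = 0.5 as in that paper; `raw_net` is a hypothetical placeholder:

```python
import math

# EDM preconditioning (Karras et al., 2022, Table 1).
# sigma_data is the assumed standard deviation of the training data.

def edm_precondition(raw_net, x, sigma, sigma_data=0.5, **cond):
    c_skip = sigma_data**2 / (sigma**2 + sigma_data**2)
    c_out = sigma * sigma_data / math.sqrt(sigma**2 + sigma_data**2)
    c_in = 1.0 / math.sqrt(sigma**2 + sigma_data**2)
    c_noise = math.log(sigma) / 4.0
    # The network predicts a residual; the skip connection carries the input.
    return c_skip * x + c_out * raw_net(c_in * x, c_noise, **cond)
```

Because c_skip approaches 1 and c_out approaches 0 as sigma goes to 0, the denoiser degrades gracefully toward the identity at low noise, which is why a handful of denoising steps suffice without errors compounding across a long rollout.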

Furthermore, DIAMOND's diffusion world model can act as an interactive neural game engine: trained on 87 hours of static Counter-Strike: Global Offensive gameplay, it yields a playable neural version of the Dust II map.
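At inference time, such a neural game engine is essentially a loop: read the player's input, sample the next frame from the diffusion model, display it. A hedged sketch with hypothetical interfaces throughout:

```python
# Sketch of the "neural game engine" loop. Every name here
# (world_model, get_player_input, render) is a hypothetical placeholder.

def run_neural_game(world_model, get_player_input, render, steps=1000):
    frames, actions = world_model.initial_context()   # seed with a few real frames
    for _ in range(steps):
        action = get_player_input()                   # live keyboard/mouse input
        next_frame = world_model.sample_next_frame(frames, actions, action)
        render(next_frame)                            # display to the player
        # slide the conditioning window forward by one step
        frames, actions = frames[1:] + [next_frame], actions[1:] + [action]
```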

Looking ahead, DIAMOND could be further improved by integrating more advanced memory mechanisms, such as autoregressive Transformers. Incorporating reward and termination prediction directly into the diffusion model is another promising direction to explore.

Paper link: https://arxiv.org/pdf/2405.12399