Apple Unveils MDM, a Large Image Generation Model Supporting High-Resolution Output

机器之心

Published in AI News · 2 min read · Oct 31, 2023

Apple researchers have recently introduced MDM, a Matryoshka-style diffusion model that generates high-quality 1024x1024 images end to end. MDM's key innovation is a multi-resolution diffusion process: a nested UNet architecture is trained with a multi-resolution loss, which significantly speeds up convergence when denoising high-resolution inputs. MDM also uses progressive training, starting at low resolution and gradually adding higher-resolution inputs and outputs, which greatly improves training efficiency. Despite a relatively small training dataset, MDM generates high-quality, high-resolution images and videos. Compared with cascaded or latent diffusion methods, MDM offers simpler and more efficient training and inference.
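To make the two training ideas concrete, here is a minimal sketch in plain Python. It is not Apple's implementation: the function names, the average-pooling downsampler, the unweighted sum of per-resolution errors, and the step-based schedule are all assumptions chosen for illustration. It shows a loss summed over several resolutions and a schedule that switches on higher resolutions as training progresses.

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[float]]  # square 2D image as nested lists (illustrative only)


def downsample(img: Image, factor: int) -> Image:
    """Average-pool a square image by an integer factor (assumed downsampler)."""
    size = len(img) // factor
    return [
        [
            sum(
                img[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / (factor * factor)
            for c in range(size)
        ]
        for r in range(size)
    ]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of the same size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def active_resolutions(step: int, schedule: Sequence[Tuple[int, int]]) -> List[int]:
    """Progressive training: each (start_step, resolution) pair switches on
    one more resolution once `step` reaches `start_step` (hypothetical schedule)."""
    return [res for start, res in schedule if step >= start]


def multi_resolution_loss(
    preds: Dict[int, Image], target: Image, resolutions: Sequence[int]
) -> float:
    """Sum the error at every active resolution, comparing each prediction
    with the target downsampled to that resolution."""
    full = len(target)
    return sum(mse(preds[res], downsample(target, full // res)) for res in resolutions)


# Toy usage: a 4x4 target, a perfect 2x2 prediction, an all-zero 4x4 prediction.
target = [[1, 1, 3, 3], [1, 1, 3, 3], [5, 5, 7, 7], [5, 5, 7, 7]]
preds = {2: [[1.0, 3.0], [5.0, 7.0]], 4: [[0.0] * 4 for _ in range(4)]}
schedule = [(0, 2), (100, 4)]  # low resolution first, full resolution from step 100

early = multi_resolution_loss(preds, target, active_resolutions(0, schedule))
late = multi_resolution_loss(preds, target, active_resolutions(100, schedule))
```

In this toy setup only the 2x2 term is active early in training, and the 4x4 term joins once the schedule reaches step 100, so the coarse output is already trained by the time the expensive high-resolution term is added.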

Image Generation · Diffusion Model · High Resolution

This article is from AIbase Daily

