ByteDance Releases Infinity: A New Breakthrough in Auto-Regressive Text-to-Image Generation, Outperforming Diffusion Models

AIbase基地

Published in AI News · 4 min read · Jan 3, 2025


ByteDance's commercialization technology team has released Infinity, a model positioned as the new leader in autoregressive text-to-image generation. The newly open-sourced model not only surpasses Stable Diffusion 3 in image generation quality but also shows a significant advantage in inference speed.

The core innovation of Infinity is its Bitwise Token autoregressive framework. The model predicts each next, higher resolution as fine-grained "Bitwise Tokens," each composed of +1 or -1 values. This improves its ability to capture high-frequency signals and produces images with richer detail. Infinity also expands the vocabulary to infinity, which greatly enlarges the representation space of the image tokenizer and raises the performance ceiling of autoregressive text-to-image generation.
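To make the idea of tokens "composed of +1 or -1" concrete, here is a minimal sketch. It assumes a simple sign-based quantizer; the function names are hypothetical and this is not taken from the released Infinity code. It also shows why the implied vocabulary grows so quickly: a token with d bits can take 2^d distinct values.

```python
def to_bitwise_tokens(features):
    """Quantize each real-valued channel to +1 or -1 by its sign (assumed scheme)."""
    return [1 if x >= 0 else -1 for x in features]


def bits_to_index(bits):
    """Pack a +1/-1 sequence into the integer index it would have in a flat vocabulary."""
    index = 0
    for b in bits:
        index = (index << 1) | (1 if b == 1 else 0)
    return index


def implicit_vocab_size(num_bits):
    """Number of distinct tokens that num_bits binary channels can represent."""
    return 2 ** num_bits


# Hypothetical 4-channel feature vector, for illustration only.
features = [0.7, -1.2, 0.05, -0.3]
bits = to_bitwise_tokens(features)
print(bits)                      # the +1/-1 token
print(bits_to_index(bits))       # its position in a 2**4-entry vocabulary
print(implicit_vocab_size(32))   # vocabulary size implied by 32 bits
```

Predicting each bit separately means the model never has to score every entry of such a vocabulary at once, which is one plausible reading of how the vocabulary can be made effectively unbounded.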

In performance comparisons, Infinity stands out among autoregressive methods, significantly outperforming HART, LlamaGen, and Emu3, and beating HART with a win rate of nearly 90% in human evaluations. Against SOTA diffusion models PixArt-Sigma, SD-XL, and SD3-Medium, Infinity achieved win rates of 75%, 80%, and 65%, respectively, demonstrating its advantage among models of the same size.

Another major feature of Infinity is its scaling behavior. As model size increases and more training resources are invested, validation loss steadily decreases while validation accuracy consistently improves. Infinity also introduces a bitwise self-correction technique that alleviates the cumulative error problem in autoregressive inference.
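The article does not describe how bitwise self-correction works. One plausible reading, stated here as an assumption rather than as the team's method, is that some bits of the training input are randomly flipped while the target stays clean, so the model learns to recover from its own earlier mistakes. A minimal sketch with hypothetical names:

```python
import random


def flip_bits(bits, flip_prob, rng):
    """Negate each +1/-1 bit independently with probability flip_prob."""
    return [-b if rng.random() < flip_prob else b for b in bits]


def make_training_pair(clean_bits, flip_prob, rng):
    """Return (corrupted_input, clean_target) for one assumed training step."""
    return flip_bits(clean_bits, flip_prob, rng), list(clean_bits)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [1, -1, 1, 1, -1, -1, 1, -1]
corrupted, target = make_training_pair(clean, flip_prob=0.25, rng=rng)
print(corrupted)  # input with some bits flipped
print(target)     # unchanged target the model should still predict
```

The flip probability of 0.25 is an arbitrary illustrative value, not a setting reported in the article.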

In terms of inference speed, Infinity inherits the speed advantage of VAR. The 2B model generates a 1024x1024 image in just 0.8 seconds, three times faster than the similarly sized SD3-Medium and fourteen times faster than the 12B Flux Dev. The 8B model is seven times faster than the similarly sized SD3.5, and the 20B model generates a 1024x1024 image in 3 seconds, nearly four times faster than the 12B Flux Dev.
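The speedup figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the article. The baseline latencies it derives are implied values, not measurements reported by the source, and it assumes "N times faster" means the baseline takes N times as long.

```python
# Figures quoted in the article.
infinity_2b_seconds = 0.8
infinity_20b_seconds = 3.0
speedup_vs_sd3_medium = 3
speedup_vs_flux_dev = 14

# Implied baseline latencies (derived, not reported).
sd3_medium_seconds = infinity_2b_seconds * speedup_vs_sd3_medium
flux_dev_seconds = infinity_2b_seconds * speedup_vs_flux_dev

# Speedup of the 20B model over Flux Dev implied by the figures above.
speedup_20b_vs_flux = flux_dev_seconds / infinity_20b_seconds

print(f"Implied SD3-Medium latency: {sd3_medium_seconds:.1f} s")
print(f"Implied Flux Dev latency:   {flux_dev_seconds:.1f} s")
print(f"Implied 20B speedup:        {speedup_20b_vs_flux:.1f}x")
```

The implied 20B speedup lands a little under four, which agrees with the article's "nearly four times faster."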

The training and inference code, demo, and model weights for Infinity are now available on GitHub, and a hosted website lets users try the model and evaluate its performance.

Project page: https://foundationvision.github.io/infinity.project/


This article is from AIbase Daily
