CogView4, the latest open-source text-to-image model from Zhipu AI, has officially launched. Boasting 6 billion parameters, CogView4 fully supports Chinese input and the generation of images from Chinese text, earning it the title of "the first open-source model capable of rendering Chinese characters in images."


A core highlight of CogView4 is its support for bilingual (Chinese and English) prompts. It excels at understanding and following complex Chinese instructions, making it a boon for Chinese content creators. As the first open-source text-to-image model capable of generating Chinese characters within images, it fills a significant gap in the open-source landscape. Furthermore, the model supports generating images of arbitrary width and height and can handle prompts of any length, demonstrating exceptional flexibility.

CogView4's bilingual capabilities stem from a comprehensive upgrade of its technical architecture. Its text encoder has been upgraded to GLM-4, which accepts both Chinese and English input, overcoming the English-only limitation of earlier open-source text-to-image models. Reportedly, the model was trained on bilingual (Chinese-English) image-text pairs to ensure high-quality generation in Chinese contexts.
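For readers who want to try bilingual prompting, here is a minimal usage sketch based on the Hugging Face diffusers integration referenced in the project repository; the class name (CogView4Pipeline), model id, and settings shown are taken to be those in the README and may differ across versions.

```python
import torch
from diffusers import CogView4Pipeline

# Load the released checkpoint (model id as listed in the project repository).
pipe = CogView4Pipeline.from_pretrained(
    "THUDM/CogView4-6B", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# The GLM-4 text encoder accepts Chinese or English prompts directly.
prompt = "一只戴着红色围巾的柴犬坐在雪地里"  # "A Shiba Inu wearing a red scarf, sitting in the snow"
image = pipe(prompt, width=1024, height=1024, num_inference_steps=50).images[0]
image.save("shiba.png")
```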

In text processing, CogView4 abandons the traditional fixed-length design in favor of a dynamic text-length scheme. Because descriptive captions average 200-300 tokens, this cuts token redundancy by roughly 50% compared with the traditional fixed 512-token scheme and improves training efficiency by 5%-30%. The change not only saves compute but also lets the model handle prompts of widely varying lengths more efficiently.
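The redundancy figure follows from simple padding arithmetic. The sketch below (illustrative numbers only) compares the token slots consumed by fixed 512-token padding against padding to the per-batch maximum, one plausible reading of the dynamic scheme:

```python
# Back-of-envelope check of the padding-redundancy claim.
FIXED_LEN = 512        # traditional fixed-length text window
avg_caption = 250      # article cites an average of 200-300 tokens

waste = 1 - avg_caption / FIXED_LEN
print(f"padding redundancy at fixed length: {waste:.0%}")  # ~51%

def slots_used(lengths: list[int]) -> int:
    """Token slots consumed when padding only to the per-batch maximum."""
    return max(lengths) * len(lengths)

batch = [180, 240, 260, 300]                              # hypothetical caption lengths
print(f"fixed scheme:   {FIXED_LEN * len(batch)} slots")  # 2048
print(f"dynamic scheme: {slots_used(batch)} slots")       # 1200
```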

CogView4's ability to generate images at arbitrary resolution rests on several technical breakthroughs. The model employs mixed-resolution training, combined with two-dimensional rotary position embeddings (2D RoPE) and interpolated position representations, to adapt to different image sizes. In addition, a flow-matching diffusion formulation with parameterized linear dynamic noise scheduling further improves the quality and diversity of generated images.
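To make the flow-matching idea concrete, here is a generic training-step sketch, not CogView4's actual code: sample a point on the linear path between Gaussian noise and a clean latent, then regress the constant velocity along that path. The model signature is a placeholder assumption.

```python
import torch
import torch.nn.functional as F

def flow_matching_loss(model, x0, cond):
    """Generic flow-matching step: interpolate linearly between noise and
    data, then regress the velocity field. Illustrative sketch only."""
    noise = torch.randn_like(x0)
    t = torch.rand(x0.shape[0], device=x0.device)     # uniform timesteps in [0, 1)
    t_ = t.view(-1, *([1] * (x0.dim() - 1)))          # broadcast over C, H, W
    xt = (1 - t_) * noise + t_ * x0                   # point on the linear path
    velocity = x0 - noise                             # d(xt)/dt is constant on this path
    pred = model(xt, t, cond)                         # placeholder signature
    return F.mse_loss(pred, velocity)
```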


The CogView4 training process is divided into several stages: base-resolution training first, then adaptation to arbitrary resolutions, followed by fine-tuning on high-quality data, and finally output optimization through human preference alignment. Throughout, the model retains the Share-param DiT architecture while introducing independent adaptive layer normalization for each modality, keeping the model stable and consistent across tasks.
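The "independent adaptive layer normalization for different modalities" plausibly means that text and image tokens share transformer weights but receive separate conditioning-driven scale and shift parameters. Below is a hedged sketch of that idea in a DiT-style block; the class, names, and shapes are assumptions for illustration, not CogView4's implementation.

```python
import torch
import torch.nn as nn

class PerModalityAdaLN(nn.Module):
    """Sketch of modality-specific adaptive LayerNorm: text and image tokens
    share the transformer weights (Share-param DiT), but each modality gets
    its own scale/shift from the conditioning vector. Illustrative only."""
    def __init__(self, dim: int, cond_dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        # Separate modulation heads per modality.
        self.text_mod = nn.Linear(cond_dim, 2 * dim)
        self.image_mod = nn.Linear(cond_dim, 2 * dim)

    def forward(self, text_tokens, image_tokens, cond):
        # cond: (B, cond_dim); tokens: (B, N, dim)
        t_scale, t_shift = self.text_mod(cond).chunk(2, dim=-1)
        i_scale, i_shift = self.image_mod(cond).chunk(2, dim=-1)
        text_out = self.norm(text_tokens) * (1 + t_scale.unsqueeze(1)) + t_shift.unsqueeze(1)
        image_out = self.norm(image_tokens) * (1 + i_scale.unsqueeze(1)) + i_shift.unsqueeze(1)
        return text_out, image_out
```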

Project: https://github.com/THUDM/CogView4