Text-to-Image Generation is Here! Zhipu's Open Source Image Generation Model CogView3 is 10 Times Faster than SDXL!

AIbase基地

Published in AI News · 4 min read · Sep 30, 2024

Zhipu AI recently open-sourced CogView3 and its upgraded version, CogView-3Plus-3B, bringing new momentum to the text-to-image generation field.

CogView3 is presented as the first model to apply relay diffusion to text-to-image generation. It uses a cascaded diffusion approach: the model first generates a low-resolution image, then produces the final output through relay-based super-resolution. This improves the quality of the generated images while significantly reducing training and inference costs.
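The two-stage data flow can be illustrated with a toy sketch. It only shows the order of operations in a cascade (low-resolution generation, upsampling, then refinement that starts from the upsampled image rather than from pure noise). Every function here is a hypothetical stand-in written for illustration: the random base stage, the noise level, and the box-blur "refiner" are assumptions and are not CogView3's models or code.

```python
import random


def base_stage(prompt, size=4, seed=0):
    """Stand-in for the low-resolution generator: returns a size x size grid."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def upsample_nearest(image, factor=2):
    """Nearest-neighbour upsampling: repeat each pixel factor x factor times."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def add_noise(image, strength, seed=1):
    """Perturb the upsampled image so the second stage has something to refine."""
    rng = random.Random(seed)
    return [[px + strength * rng.gauss(0.0, 1.0) for px in row] for row in image]


def refine_stage(image, steps=2):
    """Stand-in for the super-resolution denoiser: repeated 3x3 box blur."""
    h, w = len(image), len(image[0])
    for _ in range(steps):
        image = [
            [
                sum(
                    image[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))
                )
                / ((min(h, y + 2) - max(0, y - 1)) * (min(w, x + 2) - max(0, x - 1)))
                for x in range(w)
            ]
            for y in range(h)
        ]
    return image


def cascade_generate(prompt, factor=2, relay_noise=0.1):
    """Low-res generation, then refinement starting from the upsampled result."""
    low = base_stage(prompt)
    start = add_noise(upsample_nearest(low, factor), relay_noise)
    return refine_stage(start)


image = cascade_generate("a red fox in the snow")
print(len(image), "x", len(image[0]))
```

The point of the structure is that the second stage begins from an image that already contains the overall composition, so it has less work to do than a generator that starts from scratch at full resolution.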

CogView3's performance is its most notable feature. In human evaluations, CogView3 outperforms SDXL, the current leading open-source text-to-image model, with a win rate of 77.0%, and it does so in about half of SDXL's inference time. The streamlined version of CogView3 maintains comparable performance while taking only about one-tenth of SDXL's inference time. This opens up new possibilities for efficient, high-quality image generation.

Zhipu AI has also introduced CogView-3Plus-3B, an image model based on the DiT (Diffusion Transformer) architecture. Specific test results have not yet been released, but the industry is watching its potential closely. CogView-3Plus-3B builds on CogView3 and adopts Zero-SNR diffusion noise scheduling and a joint text-image attention mechanism. These improvements reduce training and inference costs while maintaining strong image generation capability.
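To make "Zero-SNR" concrete: a noise schedule has zero terminal SNR when the cumulative signal coefficient reaches exactly zero at the last timestep, so the final training input is pure noise. One published way to get there is to shift and rescale the square root of the cumulative alphas of an existing schedule. The sketch below applies that rescaling to a linear beta schedule. The article does not say how CogView-3Plus-3B implements its schedule, so the starting schedule, the step count, and all function names are assumptions.

```python
import math


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """A plain linear beta schedule (assumed starting point)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def rescale_zero_terminal_snr(betas):
    """Rescale a schedule so that alpha_bar at the last step is exactly 0."""
    sqrt_ab = [math.sqrt(a) for a in cumulative_alphas(betas)]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the last value becomes 0, then scale so the first value is kept.
    scale = first / (first - last)
    sqrt_ab = [(s - last) * scale for s in sqrt_ab]
    new_ab = [s * s for s in sqrt_ab]
    # Recover per-step alphas from the cumulative products, then betas.
    new_alphas = [new_ab[0]] + [
        new_ab[i] / new_ab[i - 1] for i in range(1, len(new_ab))
    ]
    return [1.0 - a for a in new_alphas]


original = linear_betas(1000)
rescaled = rescale_zero_terminal_snr(original)

# SNR_t = alpha_bar_t / (1 - alpha_bar_t), so alpha_bar_T = 0 means SNR_T = 0.
print("terminal alpha_bar before:", cumulative_alphas(original)[-1])
print("terminal alpha_bar after: ", cumulative_alphas(rescaled)[-1])
```

With the unmodified linear schedule the last cumulative alpha is small but nonzero, which means a little signal leaks into the final training timestep; after rescaling it is zero.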

CogView-3Plus-3B supports image resolutions from 512x512 to 2048x2048, which makes it flexible enough for both everyday use and professional work.
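A caller might guard against out-of-range requests before invoking the model. The helper below is a hypothetical sketch, not part of the CogView3 project: it checks only the 512 to 2048 range stated above, and the optional `multiple` parameter is an assumption added for models that require side lengths divisible by some block size.

```python
MIN_SIDE = 512   # smallest supported side length, from the range above
MAX_SIDE = 2048  # largest supported side length, from the range above


def check_resolution(width, height, multiple=1):
    """Return True if both sides are within range and divisible by `multiple`.

    `multiple` defaults to 1 (no divisibility constraint); it is an assumed
    extra knob, not a requirement stated in the article.
    """
    for side in (width, height):
        if not MIN_SIDE <= side <= MAX_SIDE:
            return False
        if side % multiple != 0:
            return False
    return True


for size in [(512, 512), (1024, 768), (256, 512), (4096, 2048)]:
    print(size, check_resolution(*size))
```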

To help users get the most out of these models, Zhipu AI also provides practical guidance and tools. It recommends refining prompts with a large language model (LLM), which can significantly improve the quality of generated images. Zhipu AI also provides example scripts, lowering the barrier to entry.

Project link: https://github.com/THUDM/CogView3

This article is from AIbase Daily
