
Make-An-Audio 2

Text-to-audio generation technology based on diffusion models

Common Product • Others • Text-to-audio • Diffusion models

Make-An-Audio 2 is a text-to-audio generation technology based on diffusion models, developed jointly by researchers from Zhejiang University, ByteDance, and the Chinese University of Hong Kong. It uses pre-trained large language models (LLMs) to parse the input text, which improves semantic alignment and temporal consistency and, in turn, the quality of the generated audio. Its diffusion denoiser is built on a feed-forward Transformer, which improves variable-length audio generation and strengthens the extraction of temporal information. In addition, LLMs are used to convert the abundant audio label data into audio-text datasets, which addresses the scarcity of temporal data.
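The idea of turning audio label data into text that carries timing information can be illustrated with a small sketch. This is not the Make-An-Audio 2 implementation: the description above says an LLM performs the conversion, whereas the code below swaps in a simple rule-based stand-in. The `LabeledEvent` type, the position tags (`start`, `mid`, `end`, `all`), the 80% and one-third thresholds, and the `<label @ position>` caption format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledEvent:
    """One labeled sound event in a clip (hypothetical data format)."""
    label: str     # e.g. "dog barking"
    onset: float   # start time in seconds
    offset: float  # end time in seconds


def position(event: LabeledEvent, clip_len: float) -> str:
    """Map an event's time span to a coarse position tag.

    The tags and thresholds are illustrative assumptions, not values
    taken from Make-An-Audio 2.
    """
    # An event covering most of the clip is tagged as spanning all of it.
    if event.offset - event.onset >= 0.8 * clip_len:
        return "all"
    midpoint = (event.onset + event.offset) / 2
    if midpoint < clip_len / 3:
        return "start"
    if midpoint < 2 * clip_len / 3:
        return "mid"
    return "end"


def structured_caption(events: List[LabeledEvent], clip_len: float) -> str:
    """Build a caption that lists events in onset order with position tags."""
    ordered = sorted(events, key=lambda e: e.onset)
    return " ".join(f"<{e.label} @ {position(e, clip_len)}>" for e in ordered)


if __name__ == "__main__":
    clip = [
        LabeledEvent("door slam", 8.5, 9.5),
        LabeledEvent("rain", 0.0, 10.0),
        LabeledEvent("dog barking", 1.0, 2.0),
    ]
    print(structured_caption(clip, clip_len=10.0))
```

In a pipeline like the one described, a caption of this kind would be paired with the audio clip as training text. An LLM could then rewrite it into natural language, which this sketch does not attempt.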

Make-An-Audio 2 Visit Over Time

Monthly Visits: 504

Bounce Rate: 41.84%

Pages per Visit: 1.0

Visit Duration: 00:00:00

Make-An-Audio 2 Alternatives

Make-An-Audio 2 — Text-to-audio generation technology based on diffusion models

Others

•Text-to-audio•Diffusion models

300

InfiniteYou — Achieve flexible and high-fidelity image generation while preserving identity characteristics.

Productivity

•Image Generation•Identity Preservation

828

On-device Sora — On-device Sora is a mobile device text-to-video generation project based on diffusion models.

Video

•Video Generation•Mobile Devices

252

DiffSplat — DiffSplat is a generative framework that produces 3D Gaussian point clouds from text prompts and single-view images.

Image

•3D Generation•Gaussian Point Clouds

270

Go with the Flow — An efficient method for controlling motion patterns in video diffusion models, supporting customization and transfer of motion modes.

Video

•Video Generation•Motion Control

450

Flux-Midjourney-Mix2-LoRA — A text-to-image generation model based on the Midjourney style, focusing on high-resolution and realistic image creation.

Image

•Text-to-Image•Deep Learning

564

TokenVerse — TokenVerse is a novel multi-concept personalization method based on a pre-trained text-to-image diffusion model.

Image

•Image Generation•Personalization

504

Hunyuan3D 2.0 — Hunyuan3D 2.0 is a high-resolution 3D asset generation system launched by Tencent, based on large-scale diffusion models.

Chinese Selection

•3D•Texture Generation

2520

PaSa — PaSa is an advanced academic paper search agent driven by large language models, capable of autonomous decision-making and obtaining accurate results.

Education

•Academic Search•Large Language Models

762

self-adaptive-llms — A real-time adaptive framework for unseen tasks using large language models.

Programming

•Artificial Intelligence•Large Language Models

258

SeedVR — SeedVR: A diffusion transformer model designed for general video restoration

Video

•Video Restoration•Diffusion Models

282

DiffSensei — Customized comic generation model, connecting multimodal LLMs and diffusion models.

Image

•Comic Generation•Multimodal

1272

FlagEval — Model Evaluation Platform

Others

•Model Evaluation•Artificial Intelligence

234

InvSR — Multi-step image super-resolution model based on diffusion inversion.

Image

•Image Super-resolution•Diffusion Models

522

CosyVoice 2 — Scalable streaming voice synthesis technology powered by large language models.

Productivity

•Voice Synthesis•Streaming

1146

Leffa — Controllable character image generation model

Image

•Image Generation•Virtual Fitting

930

Command R7B — Fast and Efficient Generative AI Model

Productivity

•Machine Learning•Large Language Models

258

ComfyUI_HelloMeme — A tool for image and video generation based on diffusion models.

Image

•Image Generation•Video Generation

660

MLPerf Client — Personal Computer AI Performance Benchmarking

Productivity

•AI Performance Testing•Benchmarking

198

InternVL2_5-38B — Advanced Multimodal Large Language Model Series

Image

•Multimodal•Large Language Models

432

Color-diffusion — Using diffusion models for colorizing black and white images.

Image

•Image Coloring•Diffusion Models

276

Sandbox Fusion — A multifunctional code sandbox suitable for large language models.

Programming

•Code Sandbox•Multilanguage Support

270

text-to-pose — A model for generating poses from text and further generating images.

Image

•Text-to-image•Pose estimation

318

Diffusion Self-Distillation — A diffusion self-distillation technique for zero-shot custom image generation.

Image

•Image Generation•Zero-shot Learning

1422

Star-Attention — Efficient Inference Technology for Long Sequence Large Language Models

Programming

•NVIDIA•Large Language Models

228

CAT4D — 4D scene creation tool utilizing multi-view video diffusion models.

Image

•4D Scenes•Multi-view Video

480

Model Context Protocol Servers — A collection of reference implementations and community-contributed servers for the Model Context Protocol.

Programming

•Model Context Protocol•Large Language Models

534

Sonus-1 — Sonus-1: A New Era of Large Language Models (LLMs)

VMix — A tool for enhancing aesthetic quality in text-to-image diffusion models

TangoFlux — An efficient text-to-audio generation model
