Alibaba Tongyi Laboratory Voice Generation Model CosyVoice Upgraded to Version 2.0

AIbase基地

Published in AI News · 3 min read · Dec 16, 2024

The speech team at Alibaba's Tongyi Lab has announced that its open-source voice generation model, CosyVoice, has been upgraded to version 2.0. The upgrade improves the accuracy, stability, and naturalness of generated speech. CosyVoice 2.0 uses a unified modeling approach for offline and streaming voice generation, which enables bidirectional streaming voice synthesis with an initial synthesis delay as low as 150 ms and makes synthesis markedly more responsive.
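To make the initial-delay figure concrete, the following is a minimal Python sketch of how first-chunk latency could be measured for any streaming synthesizer. It is an illustration under stated assumptions: `fake_streaming_tts` is a hypothetical stand-in that only sleeps and yields dummy bytes, it is not the CosyVoice API, and the delay value is a placeholder rather than a measured result.

```python
import time
from typing import Callable, Iterator, Tuple


def fake_streaming_tts(text: str, chunk_delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming synthesizer (NOT the CosyVoice API).

    Yields one dummy "audio" chunk per word, sleeping to imitate compute time.
    """
    for word in text.split():
        time.sleep(chunk_delay_s)
        yield word.encode("utf-8")


def measure_latency(
    stream_fn: Callable[[str], Iterator[bytes]], text: str
) -> Tuple[float, float, int]:
    """Return (first_chunk_latency_s, total_time_s, chunk_count)."""
    start = time.perf_counter()
    first_chunk_latency = None
    chunks = 0
    for _ in stream_fn(text):
        if first_chunk_latency is None:
            # Time until the first chunk arrives: the "initial synthesis delay".
            first_chunk_latency = time.perf_counter() - start
        chunks += 1
    total = time.perf_counter() - start
    if first_chunk_latency is None:
        raise ValueError("stream produced no chunks")
    return first_chunk_latency, total, chunks


if __name__ == "__main__":
    first, total, n = measure_latency(fake_streaming_tts, "hello streaming speech synthesis")
    print(f"first chunk after {first * 1000:.1f} ms, {n} chunks in {total * 1000:.1f} ms")
```

With a streaming model, playback can begin once the first chunk arrives, so the first-chunk latency, not the total synthesis time, determines how responsive the system feels.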

In terms of pronunciation accuracy, CosyVoice 2.0 reduces the error rate by 30% to 50% compared with the previous version and achieves the lowest character error rate on the hard subset of the Seed-TTS test set. It performs particularly well on tongue twisters, homophones, and rare characters. Version 2.0 also maintains timbre consistency in zero-shot voice generation and cross-language voice synthesis, with markedly better cross-language performance than version 1.0.
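The article does not say how the error rate is computed. As a rough illustration, character error rate (CER) is commonly defined as the character-level edit distance between a reference transcript and the recognized transcript of the synthesized audio, divided by the reference length. The sketch below assumes that definition; the function names are hypothetical and the sample numbers are illustrative, not results from CosyVoice.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fractional drop from old_rate to new_rate (0.5 means a 50% reduction)."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate


if __name__ == "__main__":
    # Illustrative values only: one wrong character out of four.
    print(cer("abcd", "abxd"))
    # A hypothetical drop from 4% to 2% CER is a 50% relative reduction.
    print(relative_reduction(0.04, 0.02))
```

Under this reading, a "30% to 50%" reduction is relative to the previous version's error rate rather than an absolute drop in percentage points.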

CosyVoice 2.0 also improves the prosody, sound quality, and emotional matching of synthesized audio: the reported Mean Opinion Score (MOS) rose from 5.4 to 5.53, close to that of an unnamed commercial voice synthesis model. In addition, version 2.0 supports finer-grained control over emotion and over dialect and accent, covering Cantonese, Sichuanese, Zhengzhou dialect, Tianjin dialect, and Changsha dialect. It also offers role-playing features, such as imitating a robot or speaking in the style of Peppa Pig.

The CosyVoice 2.0 upgrade improves both the technology and the user experience of voice synthesis, and it supports the open-source community by encouraging more developers to build and apply voice processing technology.

GitHub Repository: https://github.com/FunAudioLLM/CosyVoice (latest updates on CosyVoice 2)
Online Experience DEMO: https://www.modelscope.cn/studios/iic/CosyVoice2-0.5B
Open Source Code: https://github.com/FunAudioLLM/CosyVoice
Open Source Model: https://www.modelscope.cn/models/iic/CosyVoice2-0.5B

This article is from AIbase Daily
