Today, ByteDance announced that its Doubao large model now supports real-time voice calls.

Reportedly, Volcano Engine's conversational AI real-time interaction solution combines the Volcano Ark (Fangzhou) large model service platform with Doubao's speech recognition and speech synthesis models, streamlining the speech-to-text and text-to-speech steps. The solution handles the capture, processing, and transmission of voice data and provides intelligent dialogue and natural language processing capabilities.
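The flow described above (speech recognition, then the large model, then speech synthesis) can be sketched as a single conversational turn. This is only an illustrative outline: `VoiceTurnPipeline`, `fake_asr`, `fake_llm`, and `fake_tts` are hypothetical placeholders invented for this example, not Volcano Engine or Doubao APIs, and a real deployment would stream audio rather than process whole turns.

```python
from dataclasses import dataclass
from typing import Callable

# All names here are hypothetical placeholders, not Volcano Engine or Doubao APIs.


@dataclass
class VoiceTurnPipeline:
    recognize: Callable[[bytes], str]    # speech recognition: audio -> text
    generate: Callable[[str], str]       # large model: user text -> reply text
    synthesize: Callable[[str], bytes]   # speech synthesis: reply text -> audio

    def handle_turn(self, audio_in: bytes) -> bytes:
        """Run one user utterance through the three stages in order."""
        user_text = self.recognize(audio_in)
        reply_text = self.generate(user_text)
        return self.synthesize(reply_text)


# Stand-in stages so the sketch runs without any external service.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoiceTurnPipeline(fake_asr, fake_llm, fake_tts)
reply_audio = pipeline.handle_turn(b"hello")
```

Keeping each stage behind a plain callable means any one of them could be swapped for a real service client without changing the turn-handling logic.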


Volcano Engine RTC uses audio 3A processing (acoustic echo cancellation, automatic noise suppression, and automatic gain control) to handle "double talk," where both ends of the call produce sound at the same time, which helps keep speech recognition accurate and responsive. Its WebRTC-based transmission network provides ultra-low-latency, stable, and reliable real-time audio and video transmission worldwide.
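To make "double talk" concrete, here is a minimal sketch of the classic Geigel detector, a textbook technique from echo cancellation. It is not Volcano Engine's method, which the announcement does not describe; the function name, window length, and threshold below are assumptions chosen for illustration. The detector flags a sample when the microphone signal is louder than a fraction of the recent peak of the far-end (playback) signal, which suggests the local speaker is talking over the playback.

```python
from collections import deque


def geigel_double_talk(far_end, near_end, window=128, threshold=0.5):
    """Return one boolean per sample: True where near-end speech is suspected.

    far_end:  samples sent to the loudspeaker (e.g. the AI's synthesized voice)
    near_end: samples captured by the microphone
    window and threshold are illustrative defaults, not tuned values.
    """
    recent_far = deque(maxlen=window)  # sliding window of far-end magnitudes
    flags = []
    for x, d in zip(far_end, near_end):
        recent_far.append(abs(x))
        # Pure echo is normally quieter than the recent far-end peak, so a
        # louder microphone sample points to local speech on top of playback.
        flags.append(abs(d) > threshold * max(recent_far))
    return flags


far = [1.0, 1.0, 1.0, 1.0]
near = [0.2, 0.2, 0.9, 0.2]   # the third sample is much louder than an echo
flags = geigel_double_talk(far, near)
```

In an echo canceller, such a flag is typically used to pause adaptation of the echo filter so that local speech does not corrupt the echo estimate.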

Volcano Engine also offers flexible access options, including self-integration and a transmission network solution based on the standard WebRTC protocol, to meet the specific needs of different enterprises.

Additionally, Volcano Engine's large model multimodal real-time interaction service already provides AI real-time voice capabilities for some leading domestic AI virtual character chat applications, bringing a new interactive experience. Volcano Engine will continue to provide high-quality audio, video, and AI capabilities to help enterprises innovate in AI real-time audio and video.