In China's "AI + social" arena, Soul App is injecting new vitality with AI!

Recently, Soul officially announced an upgrade to its voice large model, launching its self-developed end-to-end full-duplex voice call large model.

The most impressive result of this upgrade is that users can now have voice calls with virtual characters as naturally and smoothly as chatting with a real person!

How realistic is it? You can get a feel by watching the video below:

Official demo video: real-time AI call examples

So, what makes Soul's self-developed end-to-end voice call large model special? According to official descriptions, its key highlights include:

  • Ultra-low interaction latency

  • Quick automatic interruption

  • Super-realistic voice expression

  • Emotion perception and understanding capabilities

Ultra-low interaction latency means the AI responds the moment you finish speaking, with no perceptible delay. There is no waiting for the model to "think"; the conversation flows just as it would with a real person.

Soul's voice large model also supports quick automatic interruption. When you want to interject mid-sentence, the model recognizes your intent, stops speaking, and hands the turn back to you, just as a person would in natural conversation.
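Soul has not published how its interruption handling works, but in full-duplex systems this behavior (often called "barge-in") is typically built by monitoring the user's microphone stream while the AI is speaking and cancelling playback once sustained speech is detected. The sketch below is a hypothetical illustration of that control flow only: the energy-based detector, the `BargeInDetector` name, and all thresholds are assumptions, not Soul's implementation.

```python
# Hypothetical sketch of barge-in handling in a full-duplex voice loop.
# Not Soul's actual implementation; the energy-based detection and all
# thresholds here are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class BargeInDetector:
    energy_threshold: float = 0.02  # assumed RMS level that counts as speech
    min_speech_frames: int = 3      # require sustained speech, not a cough
    _streak: int = 0

    def feed(self, frame: list[float]) -> bool:
        """Return True once the user has clearly started speaking."""
        rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
        self._streak = self._streak + 1 if rms > self.energy_threshold else 0
        return self._streak >= self.min_speech_frames


def duplex_playback(ai_frames, mic_frames, detector):
    """Play AI audio frame by frame; stop as soon as the user barges in.

    Returns the number of AI frames actually played.
    """
    played = 0
    for ai_frame, mic_frame in zip(ai_frames, mic_frames):
        if detector.feed(mic_frame):
            break          # user interrupted: cancel the AI's speech
        played += 1        # otherwise keep playing this frame
    return played
```

In a production system the microphone and speaker streams run concurrently and the detector would be a trained voice-activity or intent model rather than an energy threshold, but the core control flow, listening while speaking and cancelling on detection, is the same.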

Finally, with super-realistic voice expression plus emotion perception and understanding, the AI not only understands what you say but also senses how you feel and responds accordingly.

Judging from the official demo video, once this feature fully launches, many Soul users may find it hard to tell real people from AI virtual characters.

Soul stated that the end-to-end voice call large model is already live in the "Echoes of Another World" real-time call scenario (currently in beta) and will be extended to more AI companionship and interaction scenarios, such as "AI Gou Dan".


It is understood that Soul began AIGC research as early as 2020, focusing on key technologies such as intelligent dialogue, voice, and virtual characters, and deeply integrating these AI capabilities into its social scenarios.

In the process of upgrading social interactions with AI, Soul particularly emphasizes achieving anthropomorphic and natural emotional companionship experiences.

To provide users with better emotional feedback and companionship, Soul's technical team has focused on emotional understanding and latency. It has released self-developed large models for voice generation, speech recognition, voice dialogue, and music generation, supporting realistic voice synthesis, voice DIY, multilingual switching, and multi-emotion real-time conversation with humans. These capabilities are already deployed across multiple Soul scenarios, including "AI Gou Dan", real-time AI voice interaction in "Werewolf Shadow", and "Echoes of Another World".

The launch of Soul's self-developed end-to-end voice call large model means that users can enjoy a more natural human-computer interaction experience. In the future, Soul also plans to further promote the construction of multi-modal end-to-end large model capabilities, making human-AI interactions more interesting and immersive.