SpeechGPT 2.0-preview

The first human-level real-time interactive system focused on contextual intelligence, supporting multi-emotional and multi-style voice interactions.

Common Product, Chatting, Voice Interaction, Artificial Intelligence
SpeechGPT 2.0-preview is a voice interaction model developed by the Natural Language Processing Laboratory at Fudan University. Trained on large amounts of speech data, it delivers low-latency, natural-sounding spoken interaction. The model can simulate a range of emotions, speaking styles, and roles, and it supports tool invocation, online search, and access to external knowledge bases. Its key advantages are strong generalization across voice styles, multi-role simulation, and low-latency interaction. The model currently supports Chinese voice interaction, with more languages planned.

SpeechGPT 2.0-preview Alternatives