2024-07-25 14:24:32 | AIbase
Comparable to GPT-4o! Fudan Introduces SpeechGPT2, a Voice Model that Can Understand Your Emotions
Fudan University has introduced SpeechGPT2, a large language model designed to understand and generate both speech and text. It discretizes speech signals into units compatible with the text modality, which lets it perceive emotion and generate speech in multiple styles according to context and instructions. The training strategy includes modality-adaptation pre-training, cross-modal instruction fine-tuning, and chain-of-modality instruction fine-tuning to en....
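
The idea of discretizing speech so that it can sit alongside text can be illustrated with a minimal sketch. The summary does not say how the model's tokenizer works, so everything here is an assumption made for illustration: the tiny 2-D `CODEBOOK`, the nearest-centroid rule, the merging of repeated units, and the `<unit_N>` token format are all hypothetical and are not taken from the model itself.

```python
from math import dist

# Hypothetical 2-D codebook (assumption). A real system would learn many
# more centroids over high-dimensional features from a speech encoder.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def quantize(frames, codebook=CODEBOOK):
    """Map each frame vector to the index of its nearest centroid."""
    return [
        min(range(len(codebook)), key=lambda k: dist(frame, codebook[k]))
        for frame in frames
    ]


def collapse_repeats(units):
    """Merge runs of identical consecutive units into a single unit."""
    collapsed = []
    for unit in units:
        if not collapsed or collapsed[-1] != unit:
            collapsed.append(unit)
    return collapsed


def to_tokens(units):
    """Render unit IDs as text-like tokens (hypothetical format)."""
    return [f"<unit_{unit}>" for unit in units]


def speech_to_tokens(frames):
    """Full toy pipeline: continuous frames -> discrete unit tokens."""
    return to_tokens(collapse_repeats(quantize(frames)))


if __name__ == "__main__":
    # Six made-up feature frames standing in for encoded audio.
    frames = [(0.1, 0.0), (0.2, 0.1), (0.9, 0.1),
              (0.1, 0.8), (0.0, 0.9), (1.0, 1.0)]
    print(speech_to_tokens(frames))
```

Once speech is a sequence of tokens like these, it could in principle be added to a text vocabulary and processed by the same language model that handles words, which is the compatibility the summary describes.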