SpeechGPT

Multimodal Language Model

Common Product, Programming, Speech, Multimodal
SpeechGPT is a multimodal large language model with intrinsic cross-modal conversational abilities: it can perceive and generate multimodal content and follow multimodal human instructions. SpeechGPT-Gen is a speech generation model that scales up Chain-of-Information generation. SpeechAgents is a multimodal multi-agent system for simulating human communication. SpeechTokenizer is a unified speech tokenizer for speech language models. Release dates and further details for these models and datasets are available on the official website.
SpeechGPT Visit Over Time

Monthly Visits: 488,643,166
Bounce Rate: 37.28%
Pages per Visit: 5.7
Visit Duration: 00:06:37

SpeechGPT Alternatives