SpeechGPT is a multimodal language model with intrinsic cross-modal conversational abilities: it can perceive and generate multimodal content and follow multimodal human instructions. SpeechGPT-Gen is a scaled-up Chain-of-Information speech generation model. SpeechAgents is a multimodal multi-agent system for simulating human communication. SpeechTokenizer is a unified speech tokenizer designed for speech language models. Release dates and further details for these models and their datasets are available on the official website.