ChatTTS is a voice generation model designed for conversational scenarios. It is particularly well suited to dialogue tasks for large language model assistants and to conversational audio and video introductions. It supports both English and Chinese and, having been trained on approximately 100,000 hours of English and Chinese data, produces high-quality, natural-sounding speech.