OuteTTS-0.2-500M
High-performance text-to-speech synthesis model
CommonProductMusicText-to-SpeechSpeech Synthesis
OuteTTS-0.2-500M is a text-to-speech synthesis model built on Qwen-2.5-0.5B. It has been trained on a larger dataset, achieving significant improvements in accuracy, naturalness, vocabulary range, voice cloning capability, and multilingual support. Special thanks to Hugging Face for the GPU funding that supported this model's training.
OuteTTS-0.2-500M Visit Over Time
Monthly Visits
20899836
Bounce Rate
46.04%
Page per Visit
5.2
Visit Duration
00:04:57