OuteTTS-0.2-500M is a text-to-speech synthesis model built on Qwen-2.5-0.5B. Compared with the previous release, it was trained on a larger dataset, which significantly improved accuracy, naturalness, vocabulary range, voice cloning capability, and multilingual support. Special thanks to Hugging Face for the GPU funding that supported this model's training.