BASE TTS
Amazon's Large-scale Voice Synthesis Model
Common Product, Others, Voice Synthesis, Natural Language Processing
BASE TTS is a large-scale text-to-speech (TTS) model developed by Amazon. An autoregressive Transformer with about 1 billion parameters converts raw text into discrete speech codes, and a convolutional decoder then turns those codes into speech waveforms. The model is trained on roughly 100,000 hours of public-domain speech data, and its authors report a new state of the art in speech naturalness. The speech codes come from a speech tokenization technique that disentangles speaker identity from the codes and compresses the code sequence with byte-pair encoding. As model size and training data grow, BASE TTS renders textually complex sentences with increasingly natural prosody.
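The compression of speech codes mentioned above can be illustrated with a toy byte-pair-encoding pass over a sequence of integer codes. This is a minimal sketch for intuition only, not Amazon's implementation: the function names (`bpe_compress`, `bpe_expand`), the integer code values, and the choice of `first_new_id` are all hypothetical, and the use of byte-pair encoding here reflects the published description of the model as far as we understand it.

```python
from collections import Counter


def most_frequent_pair(seq):
    """Return the most common adjacent pair, or None if no pair occurs twice."""
    pairs = Counter(zip(seq, seq[1:]))
    if not pairs:
        return None
    pair, count = pairs.most_common(1)[0]
    return pair if count > 1 else None


def merge_pair(seq, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def bpe_compress(codes, num_merges, first_new_id):
    """Greedily merge frequent adjacent code pairs (hypothetical helper).

    `first_new_id` must be larger than any code already in `codes`.
    Returns the shortened sequence and the merge table needed to undo it.
    """
    seq = list(codes)
    merges = {}
    next_id = first_new_id
    for _ in range(num_merges):
        pair = most_frequent_pair(seq)
        if pair is None:
            break  # nothing repeats, so further merges would not shorten anything
        seq = merge_pair(seq, pair, next_id)
        merges[next_id] = pair
        next_id += 1
    return seq, merges


def bpe_expand(seq, merges):
    """Invert bpe_compress by recursively expanding merged ids."""
    out = []
    stack = list(reversed(seq))
    while stack:
        tok = stack.pop()
        if tok in merges:
            left, right = merges[tok]
            stack.append(right)
            stack.append(left)
        else:
            out.append(tok)
    return out


if __name__ == "__main__":
    # Made-up speech codes; a real tokenizer would emit these from audio.
    codes = [7, 7, 3, 7, 7, 3, 7, 7, 5]
    compressed, merges = bpe_compress(codes, num_merges=10, first_new_id=256)
    print(len(codes), "->", len(compressed))
    assert bpe_expand(compressed, merges) == codes
```

The point of the sketch is that a shorter code sequence means fewer steps for an autoregressive model to predict, while the merge table keeps the compression lossless.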
BASE TTS Visits Over Time
Monthly Visits: 358,158
Bounce Rate: 58.94%
Pages per Visit: 2.0
Visit Duration: 00:00:53