BASE TTS

Amazon's Large-scale Voice Synthesis Model

BASE TTS is a large-scale text-to-speech (TTS) model developed by Amazon. A 1-billion-parameter autoregressive Transformer converts raw text into discrete speech codes, and a convolutional decoder then turns those codes into a speech waveform. The model was trained on about 100,000 hours of public-domain speech data, and its developers report state-of-the-art naturalness. Its speech codes come from a tokenization technique that combines speaker-identity disentanglement with byte-pair-encoding compression. As model size and training data grow, BASE TTS becomes increasingly able to render textually complex sentences with natural prosody.

BASE TTS Visit Over Time

Monthly Visits: 358,158
Bounce Rate: 58.94%
Pages per Visit: 2.0
Visit Duration: 00:00:53

BASE TTS Alternatives