VoiceCraft
Zero-shot voice editing and text-to-speech technology
Common Product, Productivity, Voice Editing, Text-to-Speech
VoiceCraft is a token-infilling neural codec language model that achieves leading performance in speech editing and zero-shot text-to-speech (TTS). To clone or edit an unseen voice, VoiceCraft needs only a few seconds of reference audio. The model is designed for in-the-wild data such as audiobooks, online videos, and podcasts.
VoiceCraft Visits Over Time
Monthly Visits: 8234
Bounce Rate: 56.04%
Pages per Visit: 1.0
Visit Duration: 00:00:03