VoiceCraft

Zero-shot voice editing and text-to-speech technology

Common Product, Productivity, Voice Editing, Text-to-Speech
VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance in speech editing and zero-shot text-to-speech (TTS). To clone or edit an unseen voice, it needs only a few seconds of reference audio. The model works on in-the-wild data such as audiobooks, internet videos, and podcasts.

VoiceCraft Visit Over Time

Monthly Visits: 8234
Bounce Rate: 56.04%
Pages per Visit: 1.0
Visit Duration: 00:00:03

VoiceCraft Alternatives