audio2photoreal
Transform audio into photo-realistic human avatars
Common Product, Image, Artificial Intelligence, Voice Synthesis
audio2photoreal is an open-source project that generates photorealistic human avatars from audio. Its PyTorch implementation synthesizes the face and body motion of a person in conversation, driven by the audio of the dialogue. The project provides training code, testing code, pre-trained motion models, and access to the dataset. Four models are included: a face diffusion model, a body diffusion model, a body VQ-VAE, and a body guide transformer. Researchers and developers can use the project to train their own models and create high-quality, realistic avatars driven by speech audio.
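To make the VQ-VAE component mentioned above more concrete, here is a minimal sketch of the vector-quantization step that any VQ-VAE relies on: each continuous latent vector is replaced by its nearest entry in a codebook, which yields a sequence of discrete indices. This is not code from the audio2photoreal repository. The function names (`squared_distance`, `quantize`) and the tiny two-dimensional codebook are hypothetical and chosen only for illustration, and plain Python is used instead of PyTorch so the example runs without extra dependencies.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents: List[Vector], codebook: List[Vector]) -> Tuple[List[int], List[Vector]]:
    """Map each latent vector to its nearest codebook entry.

    Returns the chosen codebook indices and the quantized vectors.
    """
    indices = []
    for z in latents:
        # Pick the index of the codebook entry closest to this latent.
        nearest = min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        indices.append(nearest)
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Hypothetical toy data: three codebook entries and three latent vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]
latents = [(0.9, 1.2), (-0.8, 0.4), (0.1, -0.1)]

indices, quantized = quantize(latents, codebook)
print(indices)  # [1, 2, 0]
```

In a trained VQ-VAE the codebook is learned together with an encoder and decoder, and the latents would come from the encoder rather than being hand-written as they are here.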
audio2photoreal Visit Over Time
Monthly Visits: 488,643,166
Bounce Rate: 37.28%
Pages per Visit: 5.7
Visit Duration: 00:06:37