Emotion-TTS-Emebddings

Public

This project explores zero-shot emotional speech synthesis using EMOD, a novel approach that combines emotion and content embeddings for multilingual and cross-lingual emotion transfer. Built on a VITS-based TTS model, it preserves speaker identity while enhancing expressiveness, and it enables efficient emotion transfer across languages and genders.

emotional-speech-synthesis end-to-end few-shot few-shot-learning low-resource-languages speech-synthesis text-to-speech zero-shot-learning

Created: 2024-09-12T21:00:16

Updated: 2025-03-15T22:11:49

https://nn-project-2.github.io/Emotion-TTS-web/

Related projects

Wav2letter

cpp

Facebook AI Research's Automatic Speech Recognition Toolkit

6422

4 months ago

Learn2learn

few-shot

A PyTorch Library for Meta-learning Research

2760

1 month ago

+2 today

Deepvoice3_pytorch

end-to-end

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

1977

1 month ago

Custom Diffusion

computer-vision

Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)

1936

1 month ago

+1 today

FSL Mate

deep-learning

FSL-Mate: A collection of resources for few-shot learning (FSL).

1742

1 month ago

+1 today

Awesome Diffusion Categorized

continual-learning

Collection of diffusion model papers categorized by subarea

1678

1 month ago

+4 today

MapTR

autonomous-driving

[ICLR'23 Spotlight & IJCV'24] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction

1264

1 month ago

+2 today

Prototypical Networks

deep-learning

Code for the NeurIPS 2017 Paper "Prototypical Networks for Few-shot Learning"

1163

1 month ago

+1 today

Espresso

asr

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

943

1 month ago

VideoChat

asr

Real-time voice-interactive digital human, supporting an end-to-end voice solution (GLM-4-Voice - THG) and a cascaded solution (ASR-LLM-TTS-THG). Appearance and voice are customizable without training, voice cloning is supported, and first-packet latency is as low as 3 s.

894

1 month ago

+2 today
