Tencent EMMA

Multimodal Text-to-Image Generation Model

EMMA is an image generation model built on the state-of-the-art text-to-image (T2I) diffusion model ELLA. It accepts multimodal prompts: a multimodal feature connector integrates the text prompt with information from supplementary modalities. Because all parameters of the original T2I diffusion model are frozen and only some additional layers are trained, EMMA shows that a pre-trained T2I diffusion model can accept multimodal prompts without being retrained. EMMA is easy to adapt to different existing frameworks, making it a flexible and effective tool for generating personalized, context-aware images and even videos.
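The freeze-the-base, train-only-the-added-layers idea can be illustrated with a toy PyTorch sketch. This is not EMMA's code: `ToyBase`, `ToyConnector`, the tensor sizes, and the mean-squared-error loss are all hypothetical stand-ins chosen only to show the training pattern, and the cross-attention used here is one assumed way of letting text features attend to features from another modality.

```python
import torch
from torch import nn

torch.manual_seed(0)

class ToyBase(nn.Module):
    """Hypothetical stand-in for a pre-trained T2I model (kept frozen)."""
    def __init__(self, dim=16):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, cond):
        return self.proj(cond)

class ToyConnector(nn.Module):
    """Hypothetical added layers: text tokens attend to extra-modality tokens."""
    def __init__(self, dim=16, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, text_tokens, extra_tokens):
        # Query = text, key/value = supplementary modality (e.g. image features).
        fused, _ = self.attn(text_tokens, extra_tokens, extra_tokens)
        return text_tokens + fused  # residual keeps the text signal intact

base = ToyBase()
connector = ToyConnector()

# Freeze every parameter of the base model.
for p in base.parameters():
    p.requires_grad_(False)

# Only the connector's parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(connector.parameters(), lr=1e-3)

# Random placeholder data: (batch, tokens, dim).
text_tokens = torch.randn(2, 8, 16)
image_tokens = torch.randn(2, 4, 16)
target = torch.randn(2, 8, 16)

base_before = base.proj.weight.detach().clone()
conn_before = connector.attn.in_proj_weight.detach().clone()

# One training step: gradients flow through the frozen base into the connector.
out = base(connector(text_tokens, image_tokens))
loss = nn.functional.mse_loss(out, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

frozen = sum(p.numel() for p in base.parameters())
trainable = sum(p.numel() for p in connector.parameters())
print(f"frozen params: {frozen}, trainable params: {trainable}")
```

After the step, the base weights are unchanged while the connector weights have moved, which is the property the description above relies on.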

Tencent EMMA Visit Over Time

Monthly Visits: 62
Bounce Rate: 40.66%
Pages per Visit: 1.0
Visit Duration: 00:00:00
