Sana_1600M_1024px_MultiLing

A high-resolution text-to-image generation model with multilingual prompt support.

Common Product, Image, Text-to-Image, High Resolution
Sana is a text-to-image framework developed by NVIDIA that efficiently generates images at resolutions up to 4096×4096. It synthesizes high-resolution, high-quality images quickly while maintaining strong text-image alignment, and it is lightweight enough to be deployed on a laptop GPU. The model is built on a linear diffusion transformer, combined with a pre-trained text encoder and a spatially compressed latent feature encoder. It accepts prompts in English and Chinese, with emoji, and prompts that mix them.
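To illustrate why a spatially compressed latent encoder matters at high resolution, here is a rough sketch of the token-count arithmetic. The compression factors used (32× downsampling with patch size 1 for the compressed encoder, 8× downsampling with patch size 2 for a conventional latent diffusion setup) are assumptions for illustration and are not stated in the description above. The helper `latent_tokens` is a hypothetical name, not part of any Sana API.

```python
def latent_tokens(image_px: int, downsample: int, patch: int = 1) -> int:
    """Number of transformer tokens for a square image.

    image_px   -- image side length in pixels
    downsample -- spatial downsampling factor of the latent encoder (assumed)
    patch      -- patch size applied to the latent grid (assumed)
    """
    side = image_px // (downsample * patch)
    return side * side


if __name__ == "__main__":
    for px in (1024, 4096):
        # Assumed conventional setup: 8x encoder, 2x2 patches.
        conventional = latent_tokens(px, downsample=8, patch=2)
        # Assumed compressed setup: 32x encoder, no extra patching.
        compressed = latent_tokens(px, downsample=32, patch=1)
        print(f"{px}px: conventional={conventional} tokens, "
              f"compressed={compressed} tokens")
```

Under these assumed factors the compressed encoder yields a quarter as many tokens at the same resolution. Because standard attention cost grows quadratically with token count while linear attention grows linearly, fewer tokens and linear attention together are what make large resolutions tractable on modest GPUs.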

Sana_1600M_1024px_MultiLing Visit Over Time

Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57

Sana_1600M_1024px_MultiLing Alternatives