Sana_1600M_512px_MultiLing
High-resolution, multilingual text-to-image generation model
Common Product | Image | Text-to-image | High-resolution
Sana is a text-to-image framework developed by NVIDIA that efficiently generates images at resolutions up to 4096×4096. It synthesizes high-resolution, high-quality images quickly, with strong text-image alignment, and can be deployed on laptop GPUs. The model is built on a linear diffusion transformer and uses a frozen pre-trained text encoder together with a spatially compressed latent feature encoder. It supports mixed prompts in English, Chinese, and emoji. As its name indicates, this variant has 1600M parameters and targets 512px output. Sana's key advantages are efficiency, high-resolution generation, and multilingual support.
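To make the efficiency claim concrete, the following is a minimal sketch of how spatial compression of the latent and linear attention reduce the work a diffusion transformer has to do. The numbers are assumptions for illustration and are not stated in this listing: a 32x compression factor with patch size 1 for the compressed latent, and an 8x compression factor with patch size 2 as a conventional baseline. The function names are hypothetical, and the cost model ignores constant factors.

```python
def latent_tokens(height, width, compression, patch_size=1):
    """Number of transformer tokens for an image of the given pixel size.

    `compression` is the autoencoder's spatial downsampling factor and
    `patch_size` is the transformer's patch size (both assumed values).
    """
    stride = compression * patch_size
    if height % stride or width % stride:
        raise ValueError("image size must be divisible by compression * patch_size")
    return (height // stride) * (width // stride)


def attention_cost(tokens, linear):
    """Relative attention cost: grows as N for linear attention, N^2 for softmax attention."""
    return tokens if linear else tokens * tokens


if __name__ == "__main__":
    for side in (512, 1024, 4096):
        baseline = latent_tokens(side, side, compression=8, patch_size=2)
        compressed = latent_tokens(side, side, compression=32, patch_size=1)
        print(
            f"{side}x{side}: {baseline} tokens (8x, patch 2) "
            f"vs {compressed} tokens (32x, patch 1)"
        )
```

Under these assumptions the compressed latent yields a quarter as many tokens at every resolution, and because linear attention scales with the token count rather than its square, the gap widens as resolution grows.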
Sana_1600M_512px_MultiLing Visits Over Time
Monthly Visits: 21,315,886
Bounce Rate: 45.50%
Pages per Visit: 5.2
Visit Duration: 00:05:02