Sana_1600M_1024px_MultiLing
A high-resolution text-to-image generation model with multilingual prompt support.
Common Product, Image, Text-to-Image, High Resolution
Sana is a text-to-image framework developed by NVIDIA that can efficiently generate images at resolutions up to 4096×4096. It synthesizes high-resolution, high-quality images quickly while maintaining strong text-image alignment, and it is efficient enough to deploy on a laptop GPU. The model is built on a linear diffusion transformer and uses a pre-trained text encoder together with a spatially compressed latent feature encoder. It accepts prompts in English and Chinese, prompts containing emoji, and prompts that mix these.
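To give a rough sense of why a spatially compressed latent and linear attention matter at high resolution, the following sketch counts latent tokens and compares approximate attention costs. It is illustrative arithmetic only, not Sana's implementation: the 32x downsampling factor, the patch size of 1, the per-head dimension of 32, and the function names latent_tokens and attention_cost are all assumptions introduced here and are not stated in the description above.

```python
# Illustrative arithmetic only; every constant below is an assumption.

def latent_tokens(height, width, downsample=32, patch=1):
    """Tokens the transformer sees after spatial compression.

    downsample: assumed spatial compression factor of the latent encoder.
    patch: assumed patch size applied on top of the latent grid.
    """
    step = downsample * patch
    if height % step or width % step:
        raise ValueError("image size must be a multiple of downsample * patch")
    return (height // step) * (width // step)


def attention_cost(tokens, dim, linear):
    """Approximate multiply-accumulate count for one single-head attention layer."""
    if linear:
        # Linear attention: build a dim x dim summary from keys and values,
        # then apply it to every query. Cost grows linearly with token count.
        return 2 * tokens * dim * dim
    # Softmax attention: a tokens x tokens score matrix, then a weighted sum
    # of values. Cost grows quadratically with token count.
    return 2 * tokens * tokens * dim


if __name__ == "__main__":
    head_dim = 32  # assumed per-head dimension
    for side in (1024, 2048, 4096):
        n = latent_tokens(side, side)
        quadratic = attention_cost(n, head_dim, linear=False)
        linear = attention_cost(n, head_dim, linear=True)
        print(f"{side}x{side}: {n} tokens, quadratic/linear cost ratio {quadratic / linear:.0f}")
```

Under these assumptions the cost ratio equals the token count divided by the head dimension, so the advantage of linear attention widens as resolution increases.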
Sana_1600M_1024px_MultiLing Visits Over Time
Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57