Show-1
Show-1 combines pixel and latent diffusion models to achieve efficient, high-quality text-to-video generation.
Common Product, Video, Text-to-Video, Video Generation
Show-1 is an efficient text-to-video generation model that combines pixel-based and latent-based diffusion models. It generates videos that closely match the text prompt, and it produces high-quality output with lower computational resource demands. The model first generates a low-resolution video with a pixel-based model, then upsamples it to high resolution with a latent-based model, so it draws on the strengths of both approaches. Compared with purely latent-based models, Show-1's videos follow the text more accurately; compared with purely pixel-based models, it has lower computational cost.
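The two-stage flow described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not Show-1's actual code or API: the function names, the frame count, the resolutions, and the scale factor are all hypothetical, random values stand in for the pixel-based diffusion stage, and nearest-neighbour upsampling stands in for the latent-based upsampling stage.

```python
import random


def generate_low_res_video(prompt, num_frames=8, height=4, width=6):
    """Stand-in for the pixel-based stage (hypothetical, not a real diffusion model).

    Returns a list of frames; each frame is a height x width grid of floats.
    The prompt only seeds the random generator so the output is repeatable.
    """
    rng = random.Random(sum(ord(c) for c in prompt))
    return [
        [[rng.random() for _ in range(width)] for _ in range(height)]
        for _ in range(num_frames)
    ]


def upsample_video(frames, scale=4):
    """Stand-in for the latent-based stage: nearest-neighbour upsampling.

    Each pixel is repeated `scale` times horizontally and vertically.
    """
    upsampled = []
    for frame in frames:
        new_frame = []
        for row in frame:
            wide_row = [value for value in row for _ in range(scale)]
            new_frame.extend(list(wide_row) for _ in range(scale))
        upsampled.append(new_frame)
    return upsampled


def text_to_video(prompt, scale=4):
    """Run the cheap low-resolution stage first, then upsample its output."""
    low_res = generate_low_res_video(prompt)
    return upsample_video(low_res, scale=scale)


video = text_to_video("a panda eating bamboo")
# frames, height, width
print(len(video), len(video[0]), len(video[0][0]))  # prints "8 16 24"
```

The point of the split is that the stage responsible for matching the prompt runs on small frames, while the second stage only has to add resolution.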
Show-1 Visit Over Time
Monthly Visits: 20087
Bounce Rate: 46.55%
Pages per Visit: 1.5
Visit Duration: 00:00:03