UniVG
Unified Multi-Modal Video Generation System
Common Product, Image, Video Generation, Multi-Modal
UniVG is a unified multi-modal video generation system that handles a range of video generation tasks from text and image inputs. By introducing multi-condition cross-attention and biased Gaussian noise, it supports both high-freedom and low-freedom video generation. On the public academic benchmark MSR-VTT, it achieves the lowest Fréchet Video Distance (FVD), surpasses current open-source methods in human evaluation, and is comparable to the closed-source method Gen2.
UniVG Visit Over Time
Monthly Visits: 19075321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32