VideoVAEPlus
A video autoencoder for high-fidelity encoding of videos with large motion.
Common Product, Video, Video Encoding, Variational Autoencoder
VideoVAEPlus is a video variational autoencoder (VAE) designed to reduce video redundancy and make video generation more efficient. Extending an image VAE directly to a 3D VAE was found to cause motion blur and detail distortion, so the model introduces time-aware spatial compression to better encode and decode spatial information. A lightweight motion compression model then provides further temporal compression. The model also uses the textual information already present in text-to-video datasets as text guidance, which significantly improves reconstruction quality, particularly detail retention and temporal stability. Joint training on images and videos makes the model more versatile: it improves reconstruction quality and lets the same model auto-encode both images and videos. Extensive evaluations indicate that this approach outperforms recent strong baselines.
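The two-stage compression described above (spatial first, then temporal) can be pictured as simple shape bookkeeping. The following is a minimal sketch, not the model's implementation: the names `VideoShape`, `spatial_stage`, and `temporal_stage` are hypothetical, and the factors (8x spatial, 4x temporal, 16 latent channels) are illustrative assumptions rather than values stated in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    """Shape of a video tensor as (channels, frames, height, width)."""
    channels: int
    frames: int
    height: int
    width: int

    def numel(self) -> int:
        # Total number of values stored for this shape.
        return self.channels * self.frames * self.height * self.width


def spatial_stage(x: VideoShape, factor: int = 8, latent_channels: int = 16) -> VideoShape:
    """Stage 1 (assumed factors): shrink height and width, keep every frame."""
    if x.height % factor or x.width % factor:
        raise ValueError("height and width must be divisible by the spatial factor")
    return VideoShape(latent_channels, x.frames, x.height // factor, x.width // factor)


def temporal_stage(x: VideoShape, factor: int = 4) -> VideoShape:
    """Stage 2 (assumed factor): shrink the frame axis, keep the spatial size."""
    if x.frames % factor:
        raise ValueError("frame count must be divisible by the temporal factor")
    return VideoShape(x.channels, x.frames // factor, x.height, x.width)


# Hypothetical input: a 16-frame RGB clip at 256x256.
video = VideoShape(channels=3, frames=16, height=256, width=256)
after_spatial = spatial_stage(video)
latent = temporal_stage(after_spatial)
ratio = video.numel() / latent.numel()

print(after_spatial)  # frames unchanged, spatial size reduced
print(latent)         # frames reduced as well
print(ratio)          # overall compression ratio under these assumed factors
```

Keeping the stages separate mirrors the description: the spatial stage never merges frames, and only the lightweight second stage compresses along time.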
VideoVAEPlus Visits Over Time
Monthly Visits: 323
Bounce Rate: 42.41%
Pages per Visit: 1.0
Visit Duration: 00:00:00