SparseCtrl
Adds sparse control to text-to-video diffusion models
Common Product, Image, Text-to-Video, Sparse Control
SparseCtrl improves the controllability of text-to-video generation. It enables flexible structural control from sparse signals, requiring only one or a few inputs rather than a condition for every frame. An additional condition encoder processes these sparse signals while leaving the pre-trained text-to-video model unchanged. The method works with several input types, including sketches, depth maps, and RGB images, which makes control more practical and supports applications such as storyboarding, depth rendering, keyframe animation, and interpolation. Extensive experiments demonstrate that SparseCtrl generalizes to both original and personalized text-to-video generators.
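To make "sparse signals with only one or a few inputs" concrete, here is a minimal sketch of how such an input could be laid out before it reaches a condition encoder. It assumes (this listing does not say so) that unconditioned frames are filled with zeros and that a per-frame binary mask marks which frames carry a real condition. The function name `build_sparse_condition` and its parameters are hypothetical and are not taken from SparseCtrl's code.

```python
from typing import Dict, List, Tuple

# One single-channel condition map (for example a depth map), stored as H x W.
Frame = List[List[float]]


def build_sparse_condition(
    num_frames: int,
    height: int,
    width: int,
    keyframes: Dict[int, Frame],
) -> Tuple[List[Frame], List[int]]:
    """Expand a few keyframe conditions into a full-length sequence.

    Returns (conditions, mask): conditions[t] is the supplied map for a
    keyframe and all zeros otherwise; mask[t] is 1 for keyframes, else 0.
    Zero filling and the mask are assumptions made for this illustration.
    """
    for t in keyframes:
        if not 0 <= t < num_frames:
            raise ValueError(f"keyframe index {t} is outside 0..{num_frames - 1}")

    conditions: List[Frame] = []
    mask: List[int] = []
    for t in range(num_frames):
        if t in keyframes:
            frame = keyframes[t]
            if len(frame) != height or any(len(row) != width for row in frame):
                raise ValueError(f"keyframe {t} is not {height}x{width}")
            conditions.append([row[:] for row in frame])  # copy the given map
            mask.append(1)
        else:
            conditions.append([[0.0] * width for _ in range(height)])  # placeholder
            mask.append(0)
    return conditions, mask


# Example: a 4-frame clip where only the first frame has a (tiny) depth map.
first_depth = [[0.5, 0.5], [0.2, 0.2]]
conditions, mask = build_sparse_condition(4, 2, 2, {0: first_depth})
```

In this layout, the mask lets a downstream encoder tell a genuinely empty condition apart from a frame that simply was not specified, which is the distinction that one-or-few-input control relies on.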
SparseCtrl Visits Over Time
Monthly Visits: 48
Bounce Rate: 43.57%
Pages per Visit: 1.0
Visit Duration: 00:00:00