VideoDrafter
Video Generation across Multiple Scenes with Consistent Content
Tags: Video, Video Generation, Content Consistency
VideoDrafter is a video generation framework that produces multi-scene videos with consistent content. It uses a large language model (LLM) to expand an input prompt into a comprehensive script describing the events, foreground/background entities, and camera movements of each scene. VideoDrafter then identifies the entities shared across scenes and prompts the LLM for a detailed description of each one. These descriptions are fed into a text-to-image model to produce a reference image per entity. Finally, a diffusion process generates each scene from its event description and camera movement, conditioned on the relevant reference images; this conditioning aligns the appearance of entities across scenes and keeps the generated content consistent.
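To make the data flow concrete, here is a minimal sketch of such a pipeline. All model calls (call_llm, text_to_image, video_diffusion) and the Script/Scene structures are hypothetical placeholders introduced for illustration, not VideoDrafter's actual API; they are stubbed so the example runs and shows only how shared entities become reference images that condition each scene.

```python
"""Sketch of a VideoDrafter-style multi-scene pipeline (assumed structure)."""
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Scene:
    event: str            # what happens in this scene
    entities: List[str]   # foreground/background entities that appear
    camera: str           # camera movement, e.g. "slow pan left"


@dataclass
class Script:
    scenes: List[Scene] = field(default_factory=list)


# --- placeholder model calls (assumptions, not real APIs) -------------------
def call_llm(prompt: str) -> str:
    """Stand-in for an LLM call that returns text."""
    return f"[LLM description for: {prompt}]"


def text_to_image(description: str) -> dict:
    """Stand-in for a text-to-image model; returns a reference-image handle."""
    return {"ref_image_for": description}


def video_diffusion(event: str, camera: str, ref_images: List[dict]) -> dict:
    """Stand-in for the reference-conditioned video diffusion model."""
    return {"clip": event, "camera": camera, "conditioned_on": ref_images}


# --- pipeline ---------------------------------------------------------------
def generate_multiscene_video(script: Script) -> List[dict]:
    # 1. (Upstream, not shown) the LLM expands the user prompt into `script`.
    # 2. Describe each shared entity once and render one reference image,
    #    so every scene that mentions it reuses the same reference.
    shared: Dict[str, dict] = {}
    for scene in script.scenes:
        for entity in scene.entities:
            if entity not in shared:
                detail = call_llm(f"Describe the appearance of: {entity}")
                shared[entity] = text_to_image(detail)

    # 3. Generate each scene, conditioned on the reference images of the
    #    entities appearing in it, plus the event and camera movement.
    clips = []
    for scene in script.scenes:
        refs = [shared[e] for e in scene.entities]
        clips.append(video_diffusion(scene.event, scene.camera, refs))
    return clips


if __name__ == "__main__":
    demo = Script(scenes=[
        Scene("a fox walks through a snowy forest", ["fox", "forest"], "tracking shot"),
        Scene("the fox rests by a frozen lake", ["fox", "lake"], "static wide shot"),
    ])
    print(generate_multiscene_video(demo))
```

The key design point the sketch mirrors is that reference images are generated once per shared entity rather than once per scene, which is what lets the diffusion stage keep that entity's appearance consistent across the whole video.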
VideoDrafter Website Traffic
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Average Visit Duration: 00:05:32