VideoDrafter
Video Generation across Multiple Scenes with Consistent Content
Common Product · Video · Video Generation · Content Consistency
VideoDrafter is a video generation framework that produces multi-scene videos with consistent content. It leverages a large language model (LLM) to transform an input prompt into a comprehensive script containing descriptions of events, foreground/background entities, and camera movements. VideoDrafter then identifies the entities shared across scenes and prompts the LLM to provide a detailed description of each. These entity descriptions are fed into a text-to-image model to generate a reference image for every entity. Finally, given the reference images, event descriptions, and camera movements, a diffusion process generates the multi-scene video, using the reference images as conditioning signals to align and keep the content consistent across scenes.
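The description above amounts to a four-stage pipeline: script generation, shared-entity description, reference-image synthesis, and reference-conditioned scene generation. The sketch below illustrates that flow in Python; every function in it (generate_script, describe_entity, text_to_image, generate_scene_clip) is a hypothetical placeholder standing in for the corresponding LLM, text-to-image, or diffusion component, not VideoDrafter's actual API.

```python
# Minimal sketch of a VideoDrafter-style multi-scene pipeline.
# All functions are hypothetical placeholders for the real models.

from dataclasses import dataclass


@dataclass
class Scene:
    event: str            # what happens in this scene
    entities: list[str]   # foreground/background entities it references
    camera: str           # camera movement, e.g. "slow zoom in"


@dataclass
class Script:
    scenes: list[Scene]


def generate_script(prompt: str) -> Script:
    """Stage 1 (LLM): expand the user prompt into a multi-scene script."""
    return Script(scenes=[
        Scene(event=f"{prompt}: establishing shot",
              entities=["hero", "street"], camera="static"),
        Scene(event=f"{prompt}: the action continues",
              entities=["hero"], camera="slow zoom in"),
    ])


def describe_entity(name: str) -> str:
    """Stage 2 (LLM): produce a detailed description of a shared entity."""
    return f"detailed visual description of {name}"


def text_to_image(description: str) -> str:
    """Stage 3 (text-to-image): render a reference image for an entity."""
    return f"<reference image for: {description}>"


def generate_scene_clip(event: str, camera: str, references: dict[str, str]) -> str:
    """Stage 4 (diffusion): synthesize one scene conditioned on the references."""
    return f"<clip | event='{event}' | camera='{camera}' | refs={sorted(references)}>"


def multi_scene_pipeline(prompt: str) -> list[str]:
    script = generate_script(prompt)

    # Build one reference image per shared entity so that every scene
    # is conditioned on the same appearance of that entity.
    shared = {e for scene in script.scenes for e in scene.entities}
    references = {e: text_to_image(describe_entity(e)) for e in shared}

    # Generate each scene clip with its event, camera movement, and
    # the reference images of the entities it contains.
    return [
        generate_scene_clip(s.event, s.camera,
                            {e: references[e] for e in s.entities})
        for s in script.scenes
    ]


if __name__ == "__main__":
    for clip in multi_scene_pipeline("a detective walks through a rainy city"):
        print(clip)
```

The point the sketch mirrors is that all scenes draw on the same per-entity reference images, which is what keeps a recurring entity looking the same from scene to scene.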
VideoDrafter Visit Over Time
Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57