ActAnywhere
ActAnywhere is a subject-aware video background generation model.
Common Product | Video | Video Processing | Video Generation
ActAnywhere is a generative model that automatically generates video backgrounds matched to the motion and appearance of a foreground subject. The task is to synthesize a background that stays consistent with the subject's movement and appearance while also following the artist's creative intent. ActAnywhere builds on large-scale video diffusion models, adapted specifically for this task. It takes a sequence of foreground subject segmentations as input, along with a single image that serves as a conditioning frame describing the desired scene, and generates a coherent video that matches the conditioning frame and shows realistic foreground-background interaction. The model is trained on a large-scale human-object interaction video dataset. Extensive evaluations show that it outperforms baselines and generalizes to diverse out-of-distribution samples, including non-human subjects.
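To make the input and output described above concrete, here is a minimal Python sketch of the data involved: a sequence of segmented foreground frames with their masks, plus one conditioning image. This listing does not document ActAnywhere's actual interface, so every name here (`BackgroundGenInputs`, `naive_composite`, the grayscale nested-list frame format) is a hypothetical chosen for illustration. The `naive_composite` function is not the model; it is a trivial static-background baseline that pastes each foreground frame over the unchanged conditioning image, which is the kind of result a learned model would be expected to improve on by adapting the background to the subject's motion.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical frame format: H x W grayscale image with values in [0, 1].
Frame = List[List[float]]


@dataclass
class BackgroundGenInputs:
    """Illustrative container for the inputs described in the text (assumed names)."""
    foreground: List[Frame]  # T frames containing the segmented subject
    masks: List[Frame]       # T masks, 1.0 where the subject is, 0.0 elsewhere
    condition: Frame         # single image describing the desired scene

    def validate(self) -> None:
        """Check that frame counts and spatial sizes agree."""
        if len(self.foreground) != len(self.masks):
            raise ValueError("foreground and masks must have the same number of frames")
        height, width = len(self.condition), len(self.condition[0])
        for frame in self.foreground + self.masks:
            if len(frame) != height or any(len(row) != width for row in frame):
                raise ValueError("every frame must match the conditioning image size")


def naive_composite(inputs: BackgroundGenInputs) -> List[Frame]:
    """Static-background baseline (NOT ActAnywhere): alpha-paste each
    foreground frame over the same conditioning image."""
    inputs.validate()
    height, width = len(inputs.condition), len(inputs.condition[0])
    video = []
    for fg, mask in zip(inputs.foreground, inputs.masks):
        frame = [
            [
                mask[y][x] * fg[y][x] + (1.0 - mask[y][x]) * inputs.condition[y][x]
                for x in range(width)
            ]
            for y in range(height)
        ]
        video.append(frame)
    return video
```

With this baseline the background never changes between frames, so shadows, camera motion, and contact with the scene are all missing. Those are the foreground-background interactions the description says the model is meant to produce.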