Shanghai Jieyue Xingchen Intelligent Technology Co., Ltd. (StepFun) has open-sourced its latest image-to-video model, Step-Video-TI2V. Trained on top of the 30B-parameter Step-Video-T2V, the model generates 102-frame, 5-second videos at 540P resolution. Its core features are controllable motion amplitude and controllable camera movement, and it performs particularly well on anime-style video generation. Compared with existing open-source image-to-video models, Step-Video-TI2V has a larger parameter count and offers finer control over motion amplitude, letting creators trade off dynamism against stability in the generated video.

Two key optimizations were made during Step-Video-TI2V's development. First, image conditioning was introduced to keep the generated video consistent with the original image. Instead of the traditional cross-attention approach, the model concatenates the vector representation of the input image with the vector representation of the first frame in the DiT along the channel dimension. Second, a video motion score is injected through the AdaLN module, so users can specify the motion level at generation time and control the video's dynamic amplitude, balancing dynamism, stability, and consistency. In addition, the team produced dedicated, fine-grained annotations of subject motion and camera motion in the training data, which further improved the model's handling of subject dynamics and camera work.
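
The two conditioning mechanisms described above can be illustrated with a minimal NumPy sketch. This is not the released implementation. The function names, tensor shapes, the zero-padding of non-first frames, the sinusoidal embedding of the motion score, and the `(1 + scale)` modulation form are all assumptions made for illustration; the article only states that the image representation is concatenated with the first frame along the channel dimension and that the motion score enters through AdaLN.

```python
import numpy as np

def concat_image_condition(video_latent, image_latent):
    """Channel-wise image conditioning (illustrative).

    video_latent: (T, C, H, W) noisy video latent.
    image_latent: (C, H, W) latent of the input image.
    Returns (T, 2C, H, W): the image latent is paired with frame 0;
    the remaining frames get zeros (an assumption, not from the article).
    """
    cond = np.zeros_like(video_latent)
    cond[0] = image_latent
    return np.concatenate([video_latent, cond], axis=1)

def score_embedding(score, dim):
    """Sinusoidal embedding of a scalar motion score (dim must be even)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = score * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def adaln_modulate(x, cond_emb, w, b, eps=1e-6):
    """Adaptive LayerNorm: normalize tokens, then scale and shift them
    with values predicted from the conditioning embedding.

    x: (N, D) tokens, cond_emb: (E,), w: (E, 2D), b: (2D,).
    """
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_norm = (x - mean) / np.sqrt(var + eps)
    shift, scale = np.split(cond_emb @ w + b, 2)  # each of shape (D,)
    return x_norm * (1.0 + scale) + shift

# Hypothetical usage with small random tensors.
rng = np.random.default_rng(0)
video = rng.standard_normal((4, 8, 6, 6))    # 4 latent frames, 8 channels
image = rng.standard_normal((8, 6, 6))
dit_input = concat_image_condition(video, image)   # shape (4, 16, 6, 6)

tokens = rng.standard_normal((10, 32))
emb = score_embedding(5.0, 16)               # motion score of 5 (made-up scale)
w = rng.standard_normal((16, 64)) * 0.02     # stand-in for learned weights
b = np.zeros(64)
modulated = adaln_modulate(tokens, emb, w, b)      # shape (10, 32)
```

In a design like this, the channel concatenation gives every layer direct access to the input image at the first frame, while the motion score changes only the per-channel scale and shift, so a different score shifts the overall amount of motion without altering the image condition.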

In practice, Step-Video-TI2V offers controllable motion amplitude, a range of camera movement controls, strong anime-style results, and support for multiple output sizes. Users can switch between dynamic and stable footage to suit their creative needs, and can generate anything from basic pan, tilt, zoom, and dolly shots to complex cinematic camera movements. The model is especially strong on anime-related tasks, which makes it well suited to animation and short-video production. It also supports landscape, portrait, and square aspect ratios to meet the requirements of different platforms.

Try it online:

https://yuewen.cn/videos

GitHub:

https://github.com/stepfun-ai/Step-Video-TI2V

GitHub (ComfyUI):

https://github.com/stepfun-ai/ComfyUI-StepVideo