Alibaba recently open-sourced its latest first-last frame video generation model, Wan2.1-FLF2V-14B, capable of generating 5-second, 720p high-definition videos. The model has drawn significant attention for its innovative first-last frame control technology, which opens new possibilities in AI video generation. According to AIbase, the model was launched on GitHub and Hugging Face in February 2025 and is freely available to developers, researchers, and commercial organizations worldwide, marking another milestone in Alibaba's open-source AI ecosystem.
Core Function: First-Last Frame Driven, Generating Smooth High-Definition Videos
Wan2.1-FLF2V-14B uses the first and last frames as control conditions: users provide just two images, and the model automatically generates a 5-second, 720p video between them. AIbase observed that the generated videos are smooth, with seamless transitions between the first and last frames; image details closely match the reference frames, and overall content consistency is markedly improved. Compared with traditional video generation models, this precise conditional control addresses the image jitter and content drift that commonly afflict long-sequence video generation, offering an efficient path to high-quality video creation.
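For readers who prefer a Python API over the CLI shown later in this article, here is a hedged sketch of what first-last frame generation might look like through Hugging Face diffusers. The checkpoint id Wan-AI/Wan2.1-FLF2V-14B-720P-diffusers, the WanImageToVideoPipeline class, and its last_image argument are assumptions about the diffusers integration at the time of writing; verify them against your installed version before relying on this.

```python
# Hedged sketch of first-last frame generation via Hugging Face diffusers.
# Assumptions to verify: the repo id below and the `last_image` argument,
# both of which may differ across diffusers versions.
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

model_id = "Wan-AI/Wan2.1-FLF2V-14B-720P-diffusers"  # assumed repo id
pipe = WanImageToVideoPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe.to("cuda")

first_frame = load_image("examples/first.jpg")  # placeholder input paths
last_frame = load_image("examples/last.jpg")

video = pipe(
    image=first_frame,
    last_image=last_frame,  # assumed parameter enabling first-last frame control
    prompt="A smooth transition from a sunny beach to a starry night",
    height=720,
    width=1280,
    num_frames=81,          # roughly 5 seconds at 16 fps
    guidance_scale=5.5,
).frames[0]

export_to_video(video, "output.mp4", fps=16)
```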
Technical Highlights: CLIP and DiT Integration Enhance Generation Stability
According to AIbase analysis, Wan2.1-FLF2V-14B employs advanced first-last frame conditional control technology, primarily based on the following innovations:
CLIP Semantic Feature Extraction: The CLIP model extracts semantic information from the first and last frames to ensure that the generated video is highly consistent with the input images in terms of visual content.
Cross-Attention Mechanism: The first and last frame features are injected into the Diffusion Transformer (DiT) generation process via cross-attention, enhancing image stability and temporal coherence.
Data-Driven Training: The model is trained on a massive dataset of 150 million videos and 1 billion images, enabling it to generate dynamic content that conforms to real-world physical laws.
The combination of these technologies enables Wan2.1-FLF2V-14B to excel in generating complex motion scenes, making it particularly suitable for creative applications requiring high-fidelity transitions.
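To make this conditioning pattern concrete, the following is a minimal, self-contained PyTorch sketch, not Wan2.1's actual architecture: a CLIP vision encoder (from Hugging Face transformers) extracts tokens from the first and last frames, and a DiT-style block injects them into the generation stream via cross-attention. All module names, dimensions, and the file paths first.jpg/last.jpg are illustrative assumptions.

```python
# Minimal sketch of CLIP-conditioned cross-attention, in the spirit of the
# mechanism described above. Illustrative only; not Wan2.1's real architecture.
import torch
import torch.nn as nn
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModel

class FrameConditionedBlock(nn.Module):
    """One transformer block whose cross-attention reads first/last-frame tokens."""
    def __init__(self, dim: int = 1024, heads: int = 16, clip_dim: int = 1024):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(clip_dim, dim)  # map CLIP features into model width
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.norm3 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x: torch.Tensor, frame_ctx: torch.Tensor) -> torch.Tensor:
        # x: (B, N, dim) video latent tokens; frame_ctx: (B, M, clip_dim) CLIP tokens
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h)[0]
        ctx = self.proj(frame_ctx)
        x = x + self.cross_attn(self.norm2(x), ctx, ctx)[0]  # inject frame semantics
        return x + self.mlp(self.norm3(x))

# Encode the two control frames with CLIP, then run one block over dummy latents.
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")
clip = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14").eval()

frames = [Image.open("first.jpg"), Image.open("last.jpg")]  # placeholder inputs
pixels = processor(images=frames, return_tensors="pt").pixel_values  # (2, 3, 224, 224)
with torch.no_grad():
    tokens = clip(pixel_values=pixels).last_hidden_state  # (2, 257, 1024)
frame_ctx = tokens.reshape(1, -1, 1024)  # concatenate first + last frame tokens

block = FrameConditionedBlock()
latents = torch.randn(1, 512, 1024)  # stand-in for DiT video tokens
print(block(latents, frame_ctx).shape)  # torch.Size([1, 512, 1024])
```

In a real diffusion transformer, a stack of such blocks runs at every denoising step, so the frame semantics constrain the whole generated trajectory rather than only its endpoints.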
Wide Applications: Empowering Content Creation and Research
The open-sourcing of Wan2.1-FLF2V-14B offers vast application prospects across multiple fields. AIbase has summarized its main application scenarios:
Film and Advertising: Quickly generate high-quality transition videos, reducing post-production costs.
Game Development: Generate dynamic cutscenes for game environments, improving development efficiency.
Education and Research: Support researchers in exploring video generation technology and developing new AI applications.
Personalized Creation: Ordinary users can generate personalized short videos through simple input, enriching social media content.
Notably, the model supports Chinese prompts and performs especially well in Chinese-language scenarios, demonstrating its adaptability to multilingual environments.
Ease of Use: Adaptable to Consumer-Grade Hardware
Wan2.1-FLF2V-14B demonstrates strong accessibility on the hardware side. AIbase understands that although the FLF2V model itself has 14 billion parameters, the Wan2.1 family is optimized for consumer-grade GPUs such as the RTX 4090: the lightweight 1.3B variant runs in as little as 8.19 GB of VRAM and generates a 5-second 480p video in approximately 4 minutes, while 720p generation with the 14B model remains within a reasonable time on suitably equipped hardware. Furthermore, the model ships with detailed deployment instructions; users can get started with the following command:
```bash
python generate.py --task flf2v-14B --size 1280*720 \
    --ckpt_dir ./Wan2.1-FLF2V-14B \
    --first_frame examples/first.jpg \
    --last_frame examples/last.jpg \
    --prompt "A smooth transition from a sunny beach to a starry night"
```
The open-source community also provides a Gradio-based web UI, further reducing the barrier to entry for non-technical users.
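As a sense of how little glue such a UI needs, here is a minimal Gradio sketch that wraps the CLI above. It is illustrative only: the --save_file output flag is an assumption about generate.py's interface and should be checked against the repository (for example via python generate.py --help).

```python
# Minimal Gradio front end wrapping the generate.py CLI shown above.
# Illustrative wiring; verify generate.py's actual output flag.
import subprocess
import gradio as gr

def generate_video(first_frame: str, last_frame: str, prompt: str) -> str:
    out_path = "flf2v_output.mp4"
    subprocess.run(
        [
            "python", "generate.py",
            "--task", "flf2v-14B",
            "--size", "1280*720",
            "--ckpt_dir", "./Wan2.1-FLF2V-14B",
            "--first_frame", first_frame,
            "--last_frame", last_frame,
            "--prompt", prompt,
            "--save_file", out_path,  # assumed flag; check generate.py --help
        ],
        check=True,
    )
    return out_path

demo = gr.Interface(
    fn=generate_video,
    inputs=[
        gr.Image(type="filepath", label="First frame"),
        gr.Image(type="filepath", label="Last frame"),
        gr.Textbox(label="Prompt"),
    ],
    outputs=gr.Video(label="Generated video"),
    title="Wan2.1-FLF2V-14B demo",
)

if __name__ == "__main__":
    demo.launch()
```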
Community Feedback and Future Outlook
Since its release, Wan2.1-FLF2V-14B has generated enthusiastic responses in the open-source community. Developers highly praise its generation quality, hardware friendliness, and open-source strategy. AIbase has noted that the community has begun secondary development around the model, exploring more complex video editing functions, such as dynamic subtitle generation and multilingual dubbing. In the future, Alibaba plans to further optimize the model to support higher resolutions (such as 8K) and longer video generation, while also expanding its applications in areas such as video-to-audio (V2A).
Project Address: https://github.com/Wan-Video/Wan2.1