ChinaZ.com, June 11 News: The Tencent Hunyuan team, in collaboration with Sun Yat-sen University and the Hong Kong University of Science and Technology, has introduced a new image-to-video model, "Follow-Your-Pose-v2". The model makes the leap from single-person to multi-person video generation: given a group photo, it can make everyone in the picture move simultaneously.
Key Highlights:
Supports multi-person action generation: produces multi-person motion videos while requiring less inference time.
Strong generalization: generates high-quality videos regardless of the subjects' age, clothing, or ethnicity, and remains robust to background clutter and complex actions.
Usable with everyday photos/videos: the model can be trained on, and generate from, ordinary photos (including snapshots) and videos; high-quality source images or footage are not required.
Correctly handles character occlusion: when multiple characters' bodies overlap in a single image, the model generates occlusion with the correct front-to-back relationships.
Technical Implementation:
The model uses "flow guidance" to inject background optical-flow information, enabling it to generate stable background animation even when the input suffers from camera shake or an unstable background.
Through "inference map guidance" and "depth map guidance", the model better understands the spatial layout of characters in the image and the spatial relationships among multiple characters, effectively addressing multi-character animation and body-occlusion problems.
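The idea of combining several per-frame guidance signals can be illustrated with a toy sketch. Note that the names, shapes, and channel counts below are illustrative assumptions, not the paper's actual implementation: each guidance signal (pose, background optical flow, depth) is rendered as a spatial map and concatenated along the channel axis to form a single conditioning tensor for the video generator.

```python
import numpy as np

# Illustrative shapes -- assumptions for this sketch, not the paper's config
H, W = 64, 64   # spatial resolution of the conditioning maps
T = 8           # number of video frames

# Hypothetical per-frame guidance maps:
pose_map  = np.random.rand(T, 1, H, W)   # rendered skeleton / pose heatmap
flow_map  = np.random.rand(T, 2, H, W)   # background optical flow (dx, dy)
depth_map = np.random.rand(T, 1, H, W)   # depth map, encoding front-back order

def stack_guidance(pose: np.ndarray, flow: np.ndarray,
                   depth: np.ndarray) -> np.ndarray:
    """Concatenate guidance signals along the channel axis.

    The combined tensor conditions the generator so that, at every
    spatial location and frame, it can see the target pose, the
    background motion, and the occlusion (depth) ordering at once.
    """
    return np.concatenate([pose, flow, depth], axis=1)

cond = stack_guidance(pose_map, flow_map, depth_map)
print(cond.shape)  # (8, 4, 64, 64): T frames, 1+2+1 guidance channels
```

In a real diffusion-based generator, a tensor like `cond` would typically be encoded and added to (or cross-attended with) the denoising network's features rather than consumed raw; the sketch only shows how heterogeneous guidance maps share one spatial grid.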
Evaluation and Comparison:
The team proposed a new benchmark, Multi-Character, containing approximately 4,000 frames of multi-character video, to evaluate multi-character generation quality.
Experimental results show that "Follow-Your-Pose-v2" outperforms state-of-the-art methods by more than 35% on two public datasets (TikTok and TED Talks) across seven metrics.
Application Prospects:
Image-to-video generation has broad application prospects in film and content production, augmented reality, game development, and advertising, making it one of the most closely watched AI technologies of 2024.
Additional Information:
The Tencent Hunyuan team also released an acceleration library for its open-source text-to-image model HunyuanDiT, significantly improving inference efficiency and cutting image generation time by 75%.
The HunyuanDiT model is now easier to adopt: users can invoke it with three lines of code via its official model repository on Hugging Face.
Paper link: https://arxiv.org/pdf/2406.03035
Project page: https://top.aibase.com/tool/follow-your-pose