ByteDance has recently unveiled X-Portrait2, its latest portrait animation technology, which aims to create expressive, realistic character animations at very low cost and with high efficiency. Users only need to provide a static portrait image and a driving performance video; X-Portrait2 then generates an animated video by transferring the expressions from the video onto the portrait, sidestepping the complex pipelines of traditional motion capture and character animation.
The core of this technology is an advanced expression encoder trained on a large dataset, which implicitly encodes subtle expressions from the input video. Combined with a powerful generative diffusion model, X-Portrait2 produces smooth, highly expressive videos that transfer the actor's minute facial movements, including challenging expressions such as pouting, sticking out the tongue, puffing the cheeks, and frowning, while preserving the underlying emotion with high fidelity.
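ByteDance has not released code or architectural details for X-Portrait2, so the following is only a minimal, illustrative PyTorch sketch of the general encoder-plus-diffusion design described above. All class names, layer choices, and tensor shapes are hypothetical assumptions, not X-Portrait2's actual implementation; a real system would use far larger backbones and a full iterative denoising loop over many frames.

```python
import torch
import torch.nn as nn

class ExpressionEncoder(nn.Module):
    """Maps a driving-video frame to a compact latent intended to capture
    expression/motion information (hypothetical stand-in architecture)."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        return self.backbone(frame)


class ConditionalDenoiser(nn.Module):
    """Toy stand-in for the diffusion backbone: predicts noise for a frame,
    conditioned on the reference portrait and the expression latent."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.cond_proj = nn.Linear(latent_dim, 64)
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3 + 64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, noisy, portrait, expr_latent):
        b, _, h, w = noisy.shape
        # Broadcast the expression latent over the spatial grid as conditioning.
        cond = self.cond_proj(expr_latent).view(b, 64, 1, 1).expand(b, 64, h, w)
        return self.net(torch.cat([noisy, portrait, cond], dim=1))


# One denoising step for a single driving frame (toy 64x64 shapes).
encoder, denoiser = ExpressionEncoder(), ConditionalDenoiser()
portrait = torch.randn(1, 3, 64, 64)       # static reference portrait
driving_frame = torch.randn(1, 3, 64, 64)  # one frame of the performance video
noisy_frame = torch.randn(1, 3, 64, 64)    # current diffusion state
expr = encoder(driving_frame)
predicted_noise = denoiser(noisy_frame, portrait, expr)
print(predicted_noise.shape)  # torch.Size([1, 3, 64, 64])
```

The key structural point the sketch tries to convey is that the generator never sees the driving frame's pixels directly, only the compact expression latent, which is what makes appearance leakage from the actor into the output avoidable.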
During training of the expression encoder, the development team enforced a strict separation of appearance and motion, so that the encoder focuses only on expression-related information in the driving video. This design enables the model to transfer expressions across styles and domains, making it suitable for scenarios such as realistic storytelling, character animation, virtual agents, and visual effects.
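Again as an illustrative sketch only: one common way to encourage this kind of appearance/motion separation (not confirmed as X-Portrait2's actual training recipe) is to apply appearance-only augmentations to the driving frames and penalize any change in the expression latent, forcing the latent to carry motion rather than appearance. The trivial encoder stand-in and all names below are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Trivial stand-in for the expression encoder from the sketch above.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))

def appearance_jitter(frames: torch.Tensor) -> torch.Tensor:
    """Appearance-only augmentation: per-clip color gain/bias that changes
    how the face looks but not how it moves."""
    gain = 0.8 + 0.4 * torch.rand(frames.size(0), 3, 1, 1)
    bias = 0.1 * torch.randn(frames.size(0), 3, 1, 1)
    return (frames * gain + bias).clamp(-1.0, 1.0)

frames = torch.randn(4, 3, 64, 64).clamp(-1.0, 1.0)  # toy batch of driving frames
z1 = encoder(appearance_jitter(frames))
z2 = encoder(appearance_jitter(frames))
# Two appearance-jittered views share the same motion, so their latents
# should match; minimizing this loss pushes appearance out of the latent.
invariance_loss = F.mse_loss(z1, z2)
print(invariance_loss.item())
```

In practice such an invariance term would be one component alongside reconstruction and other objectives, but it captures the intuition of why an appearance-blind latent enables cross-style and cross-domain transfer.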
Compared with existing state-of-the-art methods such as X-Portrait and Runway Act-One, X-Portrait2 achieves higher accuracy on fast head movements, subtle expression changes, and the conveyance of personal emotion, capabilities that are crucial for creating high-quality animated content, for example in animation and film production.
Project page: https://byteaigc.github.io/X-Portrait2/