In the digital era, virtual reality (VR) and augmented reality (AR) technologies are gradually transforming how we live and work. Imagine creating a 3D avatar that mimics your every movement and expression—what an experience that would be!

ExAvatar, jointly developed by DGIST and Meta's Codec Avatars Lab, is turning this imagination into reality. The technology captures your full-body movements, facial expressions, and even hand gestures from a video and turns them into a lifelike 3D digital avatar.

ExAvatar has two key innovations. First, it employs the SMPL-X whole-body parametric mesh model to accurately capture and reproduce a wide range of human poses; second, it integrates 3D Gaussian Splatting (3DGS), giving ExAvatar more realistic and efficient rendering.

Key Features:

Full-body 3D animation: Supports comprehensive animation of the body, hands, and face, generating a variety of poses and expressions.

Hybrid representation: Combines 3D Gaussians and surface meshes to ensure geometric and appearance consistency, reducing artifacts.

Convenient capture: A 3D avatar can be created from a short smartphone scan, making the capture process simple to perform.

High-quality rendering: Utilizes advanced algorithms and techniques to achieve high-quality dynamic performance and rendering effects.

Superior to existing technologies: Surpasses previous 3D avatar generation technologies in natural movement and appearance, suitable for a wider range of application scenarios.

ExAvatar addresses several challenges of previous approaches, such as the limited diversity of facial expressions and poses in short casual captures, and the absence of 3D scans and depth images during training. Through its hybrid representation, ExAvatar improves the naturalness of animations and reduces artifacts that can appear under novel poses.


Before training ExAvatar, the research team jointly calibrates the body, hands, and face with the SMPL-X model. It introduces per-joint offsets and facial vertex offsets to optimize hand bone lengths and the shape of the face region, improving the avatar's expressiveness and naturalness.
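The calibration step above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual code; the function and argument names (`apply_offsets`, `face_mask`, etc.) are assumptions for clarity:

```python
import numpy as np

def apply_offsets(template_joints, joint_offsets, template_verts, face_mask, face_offsets):
    """Personalize an SMPL-X-style template (illustrative sketch).

    template_joints: (J, 3) rest-pose joint locations
    joint_offsets:   (J, 3) learned per-joint corrections (e.g. hand bone lengths)
    template_verts:  (V, 3) template mesh vertices
    face_mask:       (V,) boolean mask selecting the face-region vertices
    face_offsets:    (F, 3) learned offsets for the F masked face vertices
    """
    joints = template_joints + joint_offsets   # adjust the skeleton
    verts = template_verts.copy()
    verts[face_mask] += face_offsets           # refine the face-region shape
    return joints, verts
```

In this view, calibration amounts to fitting small additive corrections on top of the shared SMPL-X template so the skeleton and face region match the captured person before any appearance training begins.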

ExAvatar's architecture extracts a feature from each 3D Gaussian and processes it through a multi-layer perceptron (MLP); combined with the canonical mesh, this forms a 3D avatar that can be animated from canonical space. Animation is driven by the Linear Blend Skinning (LBS) algorithm, and the avatar is rendered to the screen with 3DGS, ensuring high-quality visuals.
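Linear Blend Skinning itself is a standard technique: each vertex (or, here, each Gaussian) is deformed by a convex combination of per-joint rigid transforms. A minimal NumPy sketch of LBS, not the authors' implementation:

```python
import numpy as np

def linear_blend_skinning(vertices, weights, transforms):
    """Deform points by blending per-joint rigid transforms.

    vertices:   (V, 3) canonical-space positions
    weights:    (V, J) skinning weights, each row summing to 1
    transforms: (J, 4, 4) homogeneous joint transforms for the target pose
    """
    V = vertices.shape[0]
    homo = np.concatenate([vertices, np.ones((V, 1))], axis=1)   # (V, 4)
    # Blend the 4x4 transforms per vertex: (V, 4, 4)
    blended = np.einsum('vj,jab->vab', weights, transforms)
    # Apply each blended transform to its vertex
    deformed = np.einsum('vab,vb->va', blended, homo)
    return deformed[:, :3]
```

Because every Gaussian rides on a skinned vertex, driving the skeleton with new poses automatically carries the appearance representation along with it.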

ExAvatar's convenience lies in the fact that users can create a 3D avatar from a simple smartphone scan; the avatar supports animation with novel body poses, hand gestures, and facial expressions, and can be rendered from any viewpoint. Through its hybrid representation, the technology treats each 3D Gaussian as a vertex on the surface, with predefined connectivity between these vertices matching the SMPL-X mesh topology.
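Sharing the mesh connectivity makes edge-based regularizers possible: attributes of neighboring Gaussians can be encouraged to vary smoothly along the surface. A hypothetical sketch of such a smoothness term (the exact losses used in the paper may differ):

```python
import numpy as np

def edge_smoothness(attrs, edges):
    """Mean squared difference of per-Gaussian attributes along mesh edges.

    attrs: (V, D) a per-Gaussian attribute, e.g. offsets or colors
    edges: (E, 2) integer vertex-index pairs from the SMPL-X topology
    """
    diffs = attrs[edges[:, 0]] - attrs[edges[:, 1]]   # (E, D)
    return np.mean(np.sum(diffs ** 2, axis=1))
```

Penalties like this are one way a fixed topology helps suppress the floating-speckle artifacts that unconstrained Gaussian clouds tend to produce under unseen poses.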

Project link: https://mks0601.github.io/ExAvatar/

Paper link: https://arxiv.org/pdf/2407.21686