A research team from Nanyang Technological University in Singapore recently unveiled an AI technology called SOLAMI, which can create lifelike 3D virtual characters that not only interact with you in real-time in a VR world but also understand your voice and movements, allowing you to chat, dance, and even box with them! This is truly a boon for gaming, virtual socializing, and singles!
SOLAMI is an end-to-end Visual-Language-Action (VLA) modeling framework that utilizes deep learning technology to convert user speech and actions into a "language" that virtual characters can understand, generating corresponding speech and action responses. In simpler terms, it translates your voice and movements into a language that AI can comprehend, enabling virtual characters to respond in a way that feels as natural and smooth as real humans, moving away from the stiffness and mechanical feel of past AI characters.
To train this AI "social expert," the research team put in a lot of effort.
They created a synthetic dataset named SynMSI, which contains a vast amount of dialogue, actions, and voice data. This data wasn't randomly collected; it was meticulously designed and processed using existing action databases and powerful language models.
Even more impressive, SOLAMI is equipped with a VR interface that allows you to interact with virtual characters in an immersive way.
When you wear a VR device, you can see a virtual character standing in front of you, chatting and moving as if you are in a real social setting.
The research team states that the application prospects of SOLAMI are very broad and may revolutionize multiple fields such as gaming, virtual socializing, and education and training.
For instance, NPC characters in games can become smarter and interact with you more like real people; virtual avatars on social platforms can be more personalized, helping you find like-minded friends in the virtual world; it could even create virtual teachers, making learning more lively and engaging.
Of course, SOLAMI is still in the research phase, but its enormous potential has already excited the tech community.
The research team has conducted a series of experiments demonstrating that SOLAMI outperforms existing methods in terms of action quality, speech quality, and response speed. More importantly, user test results show that people are very satisfied with the virtual characters created by SOLAMI, indicating that the era of "AI spouses" is indeed approaching!
Key Highlights of SOLAMI Technology:
End-to-End VLA Model: Directly converts user speech and actions into virtual characters' speech and action responses, achieving a natural and smooth interaction experience.
SynMSI Synthetic Dataset: Automatically generates a large amount of multi-turn multimodal dialogue data using existing action datasets and large language models, addressing the problem of insufficient training data.
Immersive VR Interface: Users can engage in face-to-face communication with virtual characters through VR devices, experiencing a more authentic interaction.
Smarter and More Human-like: SOLAMI can create more intelligent and realistic virtual characters, making the virtual interaction experience feel more "human."
The research team believes that the application prospects of SOLAMI technology are very promising, as it can be utilized in gaming, virtual socializing, education, and training, among other fields. For example, in gaming, SOLAMI can create smarter and more realistic NPC characters, enhancing players' gaming experiences; in virtual socializing, SOLAMI can help users create more personalized virtual avatars, increasing the immersion of virtual social interactions; in education and training, SOLAMI can create more engaging virtual teachers to improve teaching effectiveness.
The research team has also conducted a series of experiments, showing that SOLAMI technology excels in action quality, speech quality, and reasoning latency compared to existing methods. User research also indicates that users have a very high satisfaction level with the 3D virtual characters constructed by SOLAMI.
Currently, SOLAMI technology is still in the research phase, but its future development potential is immense, promising to deliver a more intelligent and humanized virtual interaction experience.
Project Homepage: https://solami-ai.github.io/
Technical Report: https://arxiv.org/abs/2412.00174
Full Introduction Video: https://www.bilibili.com/video/BV1D6zpYHEyc/