Tencent has recently launched GameGen-O, a video model described as the first diffusion transformer model in the industry designed specifically for open-world video games. Unlike traditional video models, GameGen-O not only generates high-quality game content but also lets users control characters' actions in real time, much as they would control a character in a game, which the team presents as a new stage in AI-game interaction.
The core advantage of GameGen-O lies in its diverse content generation capabilities and its interactive control. Users can create characters such as "Geralt of Rivia" or "Arthur Morgan," place them in environments with seasonal changes, and generate scenes featuring actions or events such as "motorcycling" or "rain." GameGen-O also supports open-domain generation: users can direct the model to generate the corresponding video segments in real time through structured instructions and operation signals, as if directing their own virtual world.
To make this possible, Tencent's team first built an open-world video game dataset, OGameData, which it describes as the first of its kind, collecting data from hundreds of next-generation open-world games. After rigorous screening and processing, the team selected approximately 15,000 high-quality videos from 32,000 original ones. These videos went through multiple processing steps, including scene detection, aesthetic evaluation, optical flow analysis, and semantic content screening, and were then structured and annotated by expert models and multi-modal large models, providing a refined, interaction-ready data foundation for model training.
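The screening steps described above can be pictured as a chain of per-clip filters. The following is a minimal sketch under stated assumptions, not the team's actual pipeline: the `Clip` fields, the thresholds, and the function names are all hypothetical, and the scores are assumed to have been computed beforehand by separate aesthetic, optical-flow, and semantic models. The last line simply works out the retention rate implied by the figures in the article (15,000 of 32,000 videos).

```python
from dataclasses import dataclass


@dataclass
class Clip:
    clip_id: str
    aesthetic: float    # hypothetical aesthetic score in [0, 1]
    motion: float       # hypothetical mean optical-flow magnitude
    is_gameplay: bool   # hypothetical result of a semantic content check


# Illustrative thresholds only; the article does not give the real values.
MIN_AESTHETIC = 0.5
MIN_MOTION = 0.1


def keep(clip: Clip) -> bool:
    """Return True if the clip passes every screening stage."""
    return (
        clip.aesthetic >= MIN_AESTHETIC
        and clip.motion >= MIN_MOTION
        and clip.is_gameplay
    )


def filter_clips(clips):
    """Keep only the clips that pass all filters, preserving order."""
    return [c for c in clips if keep(c)]


def retention_rate(kept: int, total: int) -> float:
    """Fraction of the original videos that survive screening."""
    return kept / total


clips = [
    Clip("a", 0.8, 0.40, True),
    Clip("b", 0.3, 0.50, True),    # rejected: low aesthetic score
    Clip("c", 0.9, 0.02, True),    # rejected: almost no motion
    Clip("d", 0.7, 0.30, False),   # rejected: not gameplay footage
]
selected = filter_clips(clips)
print([c.clip_id for c in selected])

# Retention implied by the article's figures: 15,000 kept out of 32,000.
print(f"{retention_rate(15_000, 32_000):.1%}")
```

With the article's numbers, a little under half of the collected videos survive screening, which suggests that the filters are fairly strict.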
GameGen-O is trained in two stages: foundation model pre-training and instruction fine-tuning. During pre-training, the model learns open-domain video game generation through text-to-video and video continuation tasks. In the instruction fine-tuning stage, the research team froze the pre-trained model and trained only a newly introduced InstructNet, which enables the model to generate subsequent frames from multi-modal structured instructions and thus supports instruction-based video generation and interactive control.
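The second stage follows a familiar pattern: keep the pre-trained weights fixed and update only a small added branch. The toy sketch below illustrates that pattern with scalar "weights" in plain Python. It is not the InstructNet architecture, which the article does not detail; `base_model`, `instruct_branch`, the squared-error loss, and the synthetic data are all hypothetical stand-ins chosen to keep the example self-contained.

```python
def base_model(prev_frame: float, w_base: float) -> float:
    """Frozen pre-trained predictor: next frame from the previous frame."""
    return w_base * prev_frame


def instruct_branch(control: float, w_ctrl: float) -> float:
    """Trainable branch: turns a control signal into a correction."""
    return w_ctrl * control


def train_control_branch(data, w_base, w_ctrl=0.0, lr=0.1, epochs=200):
    """Fit w_ctrl by gradient descent; w_base is read but never updated."""
    for _ in range(epochs):
        for prev_frame, control, target in data:
            pred = base_model(prev_frame, w_base) + instruct_branch(control, w_ctrl)
            error = pred - target
            # Gradient of 0.5 * error**2 with respect to w_ctrl only.
            w_ctrl -= lr * error * control
    return w_ctrl


W_BASE = 1.0      # stands in for the frozen pre-trained weights
TRUE_CTRL = 0.5   # effect of the control signal in the synthetic data

# Synthetic samples: (previous frame, control signal, target next frame).
data = [
    (x, c, W_BASE * x + TRUE_CTRL * c)
    for x in (0.0, 1.0, 2.0)
    for c in (-1.0, 0.0, 1.0)
]

w_ctrl = train_control_branch(data, W_BASE)
print(round(w_ctrl, 4))
```

Because only the control branch receives updates, whatever the base model learned during pre-training is left untouched, while the added branch learns how the control signal should change the output.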
Although GameGen-O still has room for improvement in some respects, it is a significant milestone in AI-driven game content creation. The technology gives game developers a powerful tool and also makes it possible for ordinary users to create and explore virtual worlds freely. As the technology continues to improve, creating one's own immersive gaming experience may become far more accessible in the near future.
The emergence of GameGen-O marks another deep integration of the gaming industry and the AI field. It showcases Tencent's strength in AI technology and points to a possible direction for the industry's future development. We look forward to seeing how this technology changes the landscape of game creation and what new possibilities it brings to players.
Project Link: https://top.aibase.com/tool/gamegen-o