Recently, Google announced the formation of a new team focused on developing artificial intelligence (AI) models that can simulate the physical world. This team will be led by Tim Brooks, who previously served as the co-lead for the video generation model Sora at OpenAI. Brooks stated on the social platform X that this new team will be part of Google’s AI research lab, Google DeepMind.
In his announcement, Brooks wrote, “DeepMind’s plans are ambitious, aiming to develop large-scale generative models to simulate the world.” He also noted that the team will work closely with Google’s Gemini, Veo, and Genie teams to address “key new challenges” and scale the models to the highest levels of compute. Gemini is Google’s flagship series of AI models, used for tasks such as image analysis and text generation, while Veo is Google’s own video generation model. Genie is Google’s take on a world model, capable of simulating games and 3D environments in real time.
According to Brooks, the team will develop “real-time interactive generation” tools and explore how to integrate its models with existing multimodal models like Gemini. A job description for the team states, “We believe that scaling AI training based on video and multimodal data is a key pathway to achieving artificial general intelligence (AGI).” AGI generally refers to AI that can perform any task that a human can do.
Many startups and large tech companies are also pursuing the development of world models, including World Labs, led by renowned AI researcher Fei-Fei Li, the Israeli startup Decart, and Odyssey. These companies believe that future world models can be used to create interactive media, such as video games and movies, as well as to run realistic simulations, such as training environments for robots.
However, the creative sector is divided over this technology. A recent investigation by Wired magazine found that game development companies like Activision Blizzard are using AI to cut costs and improve production efficiency, and that this has been accompanied by significant layoffs. According to a 2024 study commissioned by the Animation Guild, over 100,000 jobs in the U.S. film, television, and animation industries are expected to be affected by AI by 2026.
Nevertheless, some emerging world model startups, such as Odyssey, have committed to collaborating with creative professionals rather than replacing them. Whether Google will take the same approach remains to be seen. Copyright issues also remain unresolved. Some world models may have been trained on unauthorized video game footage, exposing the companies that built them to litigation risk.
Google claims that it has permission to train its models on YouTube videos in accordance with the platform's terms of service, but it has not disclosed which specific videos were used.