Google recently announced a new artificial intelligence initiative. Demis Hassabis, CEO of Google DeepMind, said on the podcast Possible that the company will integrate its Gemini AI model with the Veo video generation model. The aim is to improve Gemini's understanding of the physical world and support the development of a versatile digital assistant that can give users real-world help.
Hassabis noted that Gemini was designed from the outset as a multimodal system capable of processing several types of data. He stated, "Our vision is to build an assistant that can integrate various forms of media, so it can better understand and interact with the world." Gemini can already generate images, text, and audio.
The AI industry as a whole is moving toward "omni" models that handle many kinds of input and output, and many companies are exploring similar avenues. For example, OpenAI's ChatGPT can not only handle text conversations but also generate images in various art styles. Amazon also plans to launch a new "any-to-any" model aimed at a higher level of multimodal functionality.
Hassabis said that the Veo model's training data comes primarily from Google's YouTube platform. By analyzing a vast number of YouTube videos, Veo learns the physical laws of the world. He pointed out, "By watching countless videos, Veo 2 gains a better understanding of how the real world operates." In other words, the training data is not only abundant but also useful for modeling how real-world objects and scenes behave.
Google expanded its terms of service last year to make more YouTube content available for AI model training, which it can draw on to improve its models' diversity and accuracy. This data strategy is likely to provide a foundation for the Gemini and Veo integration, helping the upcoming assistant understand and respond to user needs more comprehensively.
Google's initiative signals that AI assistants will no longer be limited to single tasks and will instead provide practical support across multiple areas of users' lives.