Waymo has taken another significant step in autonomous driving. The company has long regarded its collaboration with Google DeepMind as a competitive advantage, and it is now using Gemini, Google's multimodal large language model, to improve how it trains its robotaxis.
Waymo has published a new research paper introducing an "end-to-end multimodal model" named EMMA, which processes sensor data to generate future driving trajectories for autonomous vehicles. The goal is to help Waymo's driverless vehicles make better driving decisions and avoid obstacles.
The significance of this work lies not only in its novelty but also in its potential to broaden how large language models are applied. Waymo hopes to treat multimodal large language models (MLLMs) as a "first-class citizen" in its autonomous driving systems, suggesting that future uses of these models may look quite different from today's chatbots or image generators.
In the paper, Waymo notes that traditional autonomous driving systems typically develop separate "modules" for individual functions, including perception, mapping, prediction, and planning. Although this approach has made progress in recent years, its limitations are also evident, especially when dealing with new and complex environments. Waymo believes that MLLMs like Gemini can address these issues because they possess extensive "world knowledge" and can perform "chain-of-thought reasoning," which mimics human logical reasoning.
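To make the contrast concrete, here is a minimal Python sketch of the two designs. It is purely illustrative and is not Waymo's code: `SensorFrame`, `perceive`, `predict`, `plan`, `modular_drive`, and `end_to_end_drive` are hypothetical names, the rules inside them are toy placeholders, and the mapping stage is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Waypoint = Tuple[float, float]  # (forward, lateral) position in metres


@dataclass
class SensorFrame:
    """Hypothetical stand-in for one time step of raw sensor data."""
    camera_pixels: List[float]


def perceive(frame: SensorFrame) -> List[Waypoint]:
    # Toy rule: each pixel value above 0.5 becomes an "obstacle" 10 m ahead,
    # offset laterally by its index in the list.
    return [(10.0, float(i)) for i, p in enumerate(frame.camera_pixels) if p > 0.5]


def predict(obstacles: List[Waypoint]) -> List[Waypoint]:
    # Toy rule: assume every obstacle stays where it is.
    return list(obstacles)


def plan(predicted: List[Waypoint]) -> List[Waypoint]:
    # Drive straight ahead; shift 2 m sideways if an obstacle sits on the
    # centre line (lateral offset 0).
    blocked = any(lateral == 0.0 for _, lateral in predicted)
    offset = 2.0 if blocked else 0.0
    return [(5.0 * step, offset) for step in range(1, 4)]


def modular_drive(frame: SensorFrame) -> List[Waypoint]:
    # Modular design: separately built stages chained together.
    return plan(predict(perceive(frame)))


def end_to_end_drive(
    frame: SensorFrame,
    model: Callable[[SensorFrame], List[Waypoint]],
) -> List[Waypoint]:
    # End-to-end design: one model maps sensor data directly to a trajectory.
    return model(frame)


if __name__ == "__main__":
    frame = SensorFrame(camera_pixels=[0.9, 0.1])
    print(modular_drive(frame))
    # Any callable with the right shape can stand in for the learned model.
    print(end_to_end_drive(frame, modular_drive))
```

In the modular version each stage has its own interface and can be built and tested separately; in the end-to-end version those intermediate steps are folded into a single learned function, which is the role the article describes for EMMA.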
The EMMA model was developed to help Waymo's robotaxis navigate complex environments. For example, when encountering animals or road construction, EMMA can help driverless cars find the best driving path. However, Waymo also acknowledges that EMMA has limitations, such as currently being unable to process 3D sensor inputs from lidar or radar.
Waymo says more research is needed in this area, but it hopes these results will inspire further work to address the current limitations and advance autonomous driving technology.
Key Points:
🚗 Waymo is utilizing Google's Gemini model to develop a new autonomous taxi training system, EMMA, enhancing decision-making capabilities.
🌍 The EMMA model can process complex sensor data, helping driverless vehicles intelligently avoid obstacles.
🔍 While EMMA holds potential, Waymo acknowledges the need for further research to overcome its existing limitations.