According to a report by The Information, insiders say OpenAI plans to launch a multi-modal AI system named GPT-Vision to compete with Gemini, the multi-modal large model that Google recently released for enterprise testing. When OpenAI released GPT-4 in March, it previewed the model's multi-modal capabilities, but so far it has made them available only to a select few businesses. Six months later, OpenAI is preparing to roll out GPT-Vision on a broader scale. The delay was mainly due to OpenAI's concern about potential misuse of the new features. OpenAI is also developing a more powerful multi-modal model codenamed Gobi. OpenAI's push to commercialize multi-modal AI signals that the technology is moving into practical application. Industry insiders believe that visual capabilities such as image generation will enhance the commercial value of AI systems and that GPT-Vision has the potential to rival Google's offering. They also expect competition between the two AI giants to drive technological progress.