OpenAI Launches Multimodal Model GPT-Vision to Compete with Google's Gemini

站长之家

Published in AI News · 1 min read · Sep 20, 2023

OpenAI is preparing to launch a multimodal model named GPT-Vision to compete with Google's Gemini. GPT-Vision will extend GPT-4's image capabilities, allowing it to generate text about the content of images. OpenAI is also developing a multimodal model named Gobi, which could become the next iteration, GPT-5. The company plans to announce new GPT-4 features at its developer conference on November 6. Competition between OpenAI and Google is expected to drive advances in AI technology, ultimately benefiting consumers.

OpenAI GPT-Vision Gemini

This article is from AIbase Daily


