On September 25, OpenAI added multimodal capabilities, including voice interaction and image recognition, to its popular conversational AI, ChatGPT. Users can now hold voice conversations and upload images, which enables speech recognition, text recognition, and object detection. The multimodal version is named GPT-4V; it was trained at the same time as GPT-4, but its release was delayed over safety concerns. OpenAI said the new features will roll out first to ChatGPT Plus subscribers and enterprise users.
OpenAI ChatGPT Multimodal Features Officially Launched, Supporting Voice Interaction and Image Recognition

智能涌现
This article is from AIbase Daily
Welcome to the [AI Daily] column! This is your daily guide to the world of artificial intelligence. Every day, we present hot topics in the AI field with a focus on developers, helping you understand technical trends and learn about innovative AI product applications.