On September 25, 2023, OpenAI introduced multimodal capabilities, including voice interaction and image recognition, to its popular conversational AI, ChatGPT. The new features let users hold spoken conversations with ChatGPT and upload images, enabling functions such as speech recognition, text recognition, and object detection. The multimodal model, named GPT-4V, was trained concurrently with GPT-4, but its release was delayed over safety concerns. OpenAI stated that the new features would be rolled out first to ChatGPT Plus subscribers and Enterprise users.