OpenAI announced this Thursday that ChatGPT's human-like conversation feature, "Advanced Voice Mode," has gained vision capabilities. Users subscribed to ChatGPT Plus, Team, or Pro can now point their phone cameras at objects, and ChatGPT will respond in near real-time.

This vision-enabled advanced voice mode also supports screen sharing, allowing it to analyze content on the device's screen. For example, it can walk users through settings menus or offer suggestions on math problems.

Using it is simple: tap the voice icon next to the ChatGPT chat bar, then tap the video icon in the lower left corner to start the video. To share your screen, tap the three-dot menu and select "Share Screen."

Regarding the rollout, OpenAI stated that the vision-enabled advanced voice mode will begin launching this Thursday, with the rollout concluding next week. Not all users will get immediate access, however: ChatGPT Enterprise and Edu users will have to wait until January of next year, while users in the EU, Switzerland, Iceland, Norway, and Liechtenstein have not yet been given a timeline.

In a recent segment of CBS's "60 Minutes," OpenAI President Greg Brockman demonstrated the advanced voice mode's visual analysis capabilities to Anderson Cooper. When Cooper drew human body parts on a blackboard, ChatGPT was able to understand and comment on his drawings. For example, it correctly confirmed the placement of the brain and suggested that its shape is closer to an oval.

However, the demonstration also exposed some inaccuracies on a geometry question, pointing to the mode's lingering risk of generating "hallucinations."


It is worth noting that this vision-enabled advanced voice mode has been delayed multiple times. In April of this year, OpenAI promised to launch it "within a few weeks," but later said it needed more time. The feature did not reach some ChatGPT users until early this fall, and even then it lacked visual analysis capabilities.

Amid intensifying competition in artificial intelligence, rivals such as Google and Meta are developing similar features. This week, Google opened access to its real-time video analysis conversational AI project, Project Astra, for some Android testers.

In addition to the vision feature, OpenAI also launched a holiday "Santa Mode" this Thursday, which users can activate by tapping the snowflake icon next to the prompt bar in the ChatGPT app.