OpenAI is reportedly preparing to launch an AI tool called "Operator," which can control personal computers and execute tasks on behalf of users. Software engineer Tibor Blaho shared the information on social media, saying he had found new clues about the tool. Several media outlets, including Bloomberg, had previously reported on "Operator," saying it could autonomously perform tasks such as coding and booking travel.


According to Blaho, OpenAI plans to release "Operator" in January 2025. He found hidden options in the ChatGPT macOS client that allow users to define shortcuts for "switching Operator" and "force quitting Operator." References to "Operator" have also appeared on OpenAI's website, although they are not yet publicly visible.

Blaho also said that OpenAI's website contains tables comparing "Operator" with other AI systems that perform computer tasks. These tables may be only placeholders. If the data in them is accurate, however, it shows that "Operator" is not consistently reliable and that its performance varies by task.


On the OSWorld benchmark, the "OpenAI Computer Using Agent (CUA)" scored 38.1%, which surpasses Anthropic's computer control model but remains well below the human score of 72.4%. On the WebVoyager benchmark, Operator outperformed humans, while on WebArena it fell short of human performance. On some simple tasks, such as signing up for a cloud service provider and launching a virtual machine, Operator had a success rate of only 60%; for creating a Bitcoin wallet, the success rate was just 10%.

OpenAI's entry into the AI agent market comes as competitors such as Anthropic and Google race to launch similar technologies. Although AI agents are still at an early stage, market analysis firm MarketsandMarkets predicts that the AI agent market will be worth $47.1 billion by 2030.

While current AI agent technology remains basic, some experts have expressed concerns about its potential security risks. The data revealed by Blaho indicates that Operator performs well in certain security assessments, effectively responding to attempts to make the system execute "illegal activities" or search for "sensitive personal data." Security testing is considered one of the reasons for the lengthy development cycle of Operator.

OpenAI co-founder Wojciech Zaremba has criticized Anthropic on social media for the lack of security measures in the agents it released, stating that a similar release from OpenAI would provoke negative reactions.

Key Points:

🔍 The upcoming "Operator" tool from OpenAI can autonomously control computers to perform tasks such as coding and booking travel.

🛠️ According to leaked information, Operator has relatively low success rates on some tasks and performs worse than humans on some benchmarks.

⚠️ Although Operator performs well in security assessments, experts are concerned about its potential security risks.