On January 23, 2025, GLM-PC, billed as the world's first computer-use agent open to the public, received another upgrade and drew wide attention. Built on the multimodal large model CogAgent, GLM-PC can "observe" and "operate" a computer much as a person does, helping users complete a variety of computer tasks efficiently.

Since GLM-PC v1.0 was released and internal testing opened on November 29, 2024, the agent has been continuously optimized and upgraded. The newly launched "Deep Thinking" mode adds logical reasoning and code generation capabilities, making GLM-PC better at handling complex tasks. GLM-PC also now supports the Windows operating system, which further expands its application scenarios.


The architecture of GLM-PC is designed around two goals: strengthening the agent's model capabilities and optimizing how its components collaborate. A complete agent has to meet requirements at three levels. At the perceptual level, it must accept diverse signals such as text, images, video, and audio. At the cognitive level, it must combine logical thinking and task planning with efficient perception and flexible operation. At the execution level, it must be able to operate across the full GUI space, receive environmental feedback, and correct itself. GLM-PC borrows the division of labor between the human "left brain" and "right brain": code generation supplies logical reasoning, graphical interface understanding supplies perceptual cognition, and integrating the two lets the agent balance logic and creativity.

In practical applications, the two halves play distinct roles. The "left brain" is responsible for rigorous logical reasoning and task execution: it quickly draws up a detailed task plan, then uses a cyclic execution mechanism to carry the plan out precisely and with a high degree of automation. For example, GLM-PC can extract product data from images, store it in Excel, and add the products to a Taobao shopping cart, completing the shopping process in one pass. The "right brain" focuses on deep perception and interactive experience: it identifies graphical interface elements, interprets user behavior, performs in-depth semantic analysis of complex images, and integrates image and text information into a comprehensive perceptual result. This is what allows GLM-PC to do well at data organization and information extraction.
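The plan-then-loop behavior attributed to the "left brain" can be pictured with a small sketch. This is only an illustration under stated assumptions, not GLM-PC's actual implementation: the names `run_plan`, `StepResult`, `RunLog`, and `fake_execute` are hypothetical, and the executor is a stand-in for real GUI actions. It shows a fixed plan being executed step by step, with each step retried when the environment reports failure.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepResult:
    ok: bool            # did the environment report success?
    feedback: str = ""  # free-form feedback from the environment


@dataclass
class RunLog:
    completed: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)
    attempts: int = 0


def run_plan(steps: List[str],
             execute: Callable[[str, int], StepResult],
             max_retries: int = 2) -> RunLog:
    """Run planned steps in order; retry a step when feedback reports failure."""
    log = RunLog()
    for step in steps:
        for attempt in range(max_retries + 1):
            log.attempts += 1
            result = execute(step, attempt)
            if result.ok:
                log.completed.append(step)
                break
        else:
            # Every retry failed: record the step and abandon the rest of the plan.
            log.failed.append(step)
            break
    return log


def fake_execute(step: str, attempt: int) -> StepResult:
    # Hypothetical executor: the spreadsheet step fails once, then succeeds.
    if step == "write rows to spreadsheet" and attempt == 0:
        return StepResult(ok=False, feedback="window not focused")
    return StepResult(ok=True)


plan = ["read product image", "write rows to spreadsheet", "add items to cart"]
log = run_plan(plan, fake_execute)
```

Here the second step is retried once after the simulated failure, so the run finishes all three steps in four attempts. A real agent would presumably re-plan from the feedback rather than simply retry; the sketch keeps only the loop structure.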

GLM-PC also shows what the left and right brain can do in collaboration. Beyond handling complex logical tasks, it demonstrates greater adaptability, creativity, and generalization on open-ended problems. Through dynamic optimization and situational awareness, GLM-PC can help users find more efficient solutions, particularly in cyclical task processing, multi-step reasoning and execution, and long-chain task management.

In addition, CogAgent is actively promoting collaboration between GLM-PC and well-known PC manufacturers to jointly advance the development of the AIPC (AI Personal Computer). An AIPC is more than a computer: it is a new application of AI agents in personal computing, intended to give users a more efficient and intelligent experience in both work and daily life.

Download and try it: https://cogagent.aminer.cn