CogAgent
An open-source end-to-end visual language model (VLM) based GUI agent
CommonProductProgrammingVisual Language ModelGUI Agent
CogAgent is a GUI agent based on visual language models (VLM) that facilitates bilingual (Chinese and English) cloud interaction through screenshots and natural language. CogAgent has made significant advancements in GUI perception, inference prediction accuracy, operational space integrity, and task generalization. The model has been applied in ZhipuAI's GLM-PC product, with the aim of aiding researchers and developers in advancing the research and application of GUI agents based on visual language models.
CogAgent Visit Over Time
Monthly Visits
494758773
Bounce Rate
37.69%
Page per Visit
5.7
Visit Duration
00:06:29