cogagent-9b-20241220
CogAgent-9B-20241220 is a GUI agent model based on visual language models.
Tags: Common Product, Programming, visual language model, GUI agent
CogAgent-9B-20241220 is built on GLM-4V-9B, a bilingual open-source visual language model. Through data collection and optimization, multi-stage training, and strategy improvements, it achieves significant gains in GUI perception, inference prediction accuracy, action space completeness, and task generalization. The model supports bilingual interaction (Chinese and English) and accepts screenshots and language input. The current version is used in ZhipuAI's GLM-PC product. It is intended to help researchers and developers advance the study and application of GUI agents based on visual language models.
cogagent-9b-20241220 Visits Over Time
Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57