CogAgent-9B-20241220 is built on GLM-4V-9B, a bilingual open-source visual language model. Through data collection and optimization, multi-stage training, and strategy improvements, it achieves significant advances in GUI perception, inference prediction accuracy, action-space completeness, and task generalization. The model supports bilingual (Chinese and English) interaction and accepts both screenshots and natural-language input. The current version has been deployed in ZhipuAI's GLM-PC product, with the aim of helping researchers and developers advance the study and application of GUI agents based on visual language models.