Tsinghua University, Zhejiang University, and other prestigious institutions have driven the development of open-source alternatives to GPT-4V, producing a series of high-performance open-source visual models in China. Among these, LLaVA, CogAgent, and BakLLaVA have attracted significant attention. LLaVA demonstrates capabilities approaching GPT-4 in visual chat and reasoning-based question answering; CogAgent is an improved open-source visual-language model built on CogVLM; and BakLLaVA augments the Mistral 7B base model with the LLaVA 1.5 architecture, offering better performance and commercial viability. These open-source visual models hold immense potential in the field of visual processing.