CogVLM
A powerful open-source visual language model
CogVLM is a powerful open-source visual language model. CogVLM-17B has 10 billion vision parameters and 7 billion language parameters. It achieves state-of-the-art performance on 10 classic cross-modal benchmarks, including NoCaps, Flickr30K Captioning, RefCOCO, RefCOCO+, RefCOCOg, Visual7W, GQA, ScienceQA, VizWiz VQA, and TDIUC, and ranks second or matches PaLI-X 55B on VQAv2, OKVQA, TextVQA, and COCO Captioning. CogVLM can also hold a conversation with you about images.
CogVLM Visits Over Time
Monthly Visits: 494,758,773
Bounce Rate: 37.69%
Pages per Visit: 5.7
Visit Duration: 00:06:29