CogVLM

A powerful open-source visual language model

Tags: visual language model, image description
CogVLM is a powerful open-source visual language model. CogVLM-17B has 10 billion vision parameters and 7 billion language parameters. CogVLM-17B achieves state-of-the-art performance on 10 classic cross-modal benchmark datasets, including NoCaps, Flickr30K Captions, RefCOCO, RefCOCO+, RefCOCOg, Visual7W, GQA, ScienceQA, VizWiz VQA, and TDIUC, and ranks second on VQAv2, OKVQA, TextVQA, and COCO Captions, surpassing or matching PaLI-X 55B. CogVLM can also chat with you about images.
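
As a minimal sketch of that image-chat capability, the snippet below loads the publicly released Hugging Face checkpoint THUDM/cogvlm-chat-hf (with the Vicuna-7B tokenizer it is documented to use) and asks a single question about a local image; the image path "example.jpg" is a hypothetical placeholder, and a CUDA GPU with enough memory for the 17B weights in bfloat16 is assumed.

# Minimal sketch: single-turn image chat with the THUDM/cogvlm-chat-hf checkpoint.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, LlamaTokenizer

# CogVLM-17B uses the Vicuna-7B tokenizer for its language side.
tokenizer = LlamaTokenizer.from_pretrained("lmsys/vicuna-7b-v1.5")
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogvlm-chat-hf",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,   # the modeling code ships with the checkpoint
).to("cuda").eval()

query = "Describe this image."
image = Image.open("example.jpg").convert("RGB")  # hypothetical local image path

# build_conversation_input_ids packs the prompt, chat history, and image into model inputs.
inputs = model.build_conversation_input_ids(tokenizer, query=query, history=[], images=[image])
inputs = {
    "input_ids": inputs["input_ids"].unsqueeze(0).to("cuda"),
    "token_type_ids": inputs["token_type_ids"].unsqueeze(0).to("cuda"),
    "attention_mask": inputs["attention_mask"].unsqueeze(0).to("cuda"),
    "images": [[inputs["images"][0].to("cuda").to(torch.bfloat16)]],
}

with torch.no_grad():
    out = model.generate(**inputs, max_length=2048, do_sample=False)
    out = out[:, inputs["input_ids"].shape[1]:]  # keep only the newly generated answer tokens
    print(tokenizer.decode(out[0], skip_special_tokens=True))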
CogVLM Visit Over Time

Monthly Visits: 503,747,431
Bounce Rate: 37.31%
Pages per Visit: 5.7
Visit Duration: 00:06:44

CogVLM Visit Trend

CogVLM Visit Geography

CogVLM Traffic Sources

CogVLM Alternatives