MMStar

An elite benchmark dataset for evaluating large visual language models

Common Product, Productivity, Visual Language Models, Benchmark
MMStar is a benchmark dataset for assessing the multimodal capabilities of large visual language models. It comprises 1,500 carefully selected samples covering 6 core abilities and 18 sub-dimensions. Each sample has been reviewed by humans to confirm that it depends on the visual input, shows minimal data leakage, and requires advanced multimodal capabilities to solve. In addition to traditional accuracy, MMStar proposes two new metrics: one measuring data leakage and one measuring the actual performance gain from multimodal training. Researchers can use MMStar to evaluate visual language models across multiple tasks and use the new metrics to uncover potential issues in those models.
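The description above does not spell out how the two metrics are computed. The sketch below shows one plausible reading, assuming they follow the definitions commonly cited for MMStar: a gain metric equal to the model's score with images minus its score without images, and a leakage metric equal to the model's image-free score minus the score of its text-only base LLM, floored at zero. The class name, field names, and example numbers are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Accuracy scores (in percent) on the same benchmark questions.

    Field names are illustrative, not taken from any MMStar tooling.
    """
    with_images: float     # visual language model, given image + question
    without_images: float  # same model, given the question text only
    base_llm: float        # underlying text-only LLM, question text only


def multimodal_gain(s: Scores) -> float:
    # Assumed definition: how much the model improves when it can see the image.
    return s.with_images - s.without_images


def multimodal_leakage(s: Scores) -> float:
    # Assumed definition: how much better the model does without images than
    # its base LLM, which hints that samples leaked into multimodal training.
    # Negative differences are clamped to zero.
    return max(0.0, s.without_images - s.base_llm)


# Made-up numbers, not real benchmark results.
example = Scores(with_images=50.0, without_images=30.0, base_llm=25.0)
print(multimodal_gain(example))     # 20.0
print(multimodal_leakage(example))  # 5.0
```

Under this reading, a high gain with low leakage suggests the model is actually using the image, while a high leakage value suggests that some of the score comes from memorized samples rather than visual understanding.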

MMStar Visit Over Time

Monthly Visits: 732
Bounce Rate: 43.44%
Pages per Visit: 1.0
Visit Duration: 00:00:00
