MMStar
An elite benchmark dataset for evaluating large visual language models
Tags: Common Product, Productivity, Visual Language Models, Benchmark
MMStar is a benchmark dataset for assessing the multimodal capabilities of large visual language models. It comprises 1,500 carefully selected visual language samples covering 6 core abilities and 18 sub-dimensions. Every sample has been reviewed by humans to confirm that it depends on the visual input, minimizes data leakage, and requires advanced multimodal capabilities to solve. Beyond traditional accuracy, MMStar proposes two new metrics: one measures data leakage, and the other measures the actual performance gain from multimodal training. Researchers can use MMStar to evaluate visual language models across multiple tasks and use the new metrics to uncover potential problems in a model.
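The accuracy metric and the two additional metrics can be illustrated with a short Python sketch. The listing above does not give the formulas, so the definitions here are an assumption based on how the MMStar authors describe them: gain as the model's score with images minus its score without images, and leakage as the amount by which the image-free score exceeds the score of the text-only base LLM, floored at zero. The names `Scores`, `accuracy`, `multimodal_gain`, and `multimodal_leakage` are hypothetical and not part of any official MMStar tooling.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Scores:
    """Benchmark accuracies in percent (hypothetical container)."""
    with_images: float     # visual language model, image + question
    without_images: float  # same model, question only (image withheld)
    llm_base: float        # text-only base LLM, question only


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Percentage of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers) or not answers:
        raise ValueError("predictions and answers must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


def multimodal_gain(s: Scores) -> float:
    """Assumed definition: how much the model benefits from seeing the image."""
    return s.with_images - s.without_images


def multimodal_leakage(s: Scores) -> float:
    """Assumed definition: score achieved without images beyond the base LLM,
    which hints that samples leaked into multimodal training data."""
    return max(0.0, s.without_images - s.llm_base)


if __name__ == "__main__":
    # Illustrative numbers only, not results from any real model.
    s = Scores(with_images=50.0, without_images=30.0, llm_base=25.0)
    print("gain:", multimodal_gain(s))
    print("leakage:", multimodal_leakage(s))
```

Under these assumed definitions, a high gain with low leakage suggests the model is actually using the image, while high leakage suggests it may have seen the evaluation samples during multimodal training.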
MMStar Visit Over Time
Monthly Visits: 732
Bounce Rate: 43.44%
Pages per Visit: 1.0
Visit Duration: 00:00:00