MiniCPM-V 2.6
High-performance multimodal language model suitable for image and video understanding.
CommonProductImageMultimodalImage Understanding
MiniCPM-V 2.6 is a multimodal large language model based on 800 million parameters, demonstrating leading performance in single image understanding, multiple image understanding, and video comprehension across various domains. The model achieved an average score of 65.2 on multiple popular benchmarks such as OpenCompass, surpassing widely used proprietary models. It possesses robust OCR capabilities, supports multiple languages, and performs efficiently, enabling real-time video understanding on devices like the iPad.
MiniCPM-V 2.6 Visit Over Time
Monthly Visits
19075321
Bounce Rate
45.07%
Page per Visit
5.5
Visit Duration
00:05:32