MiniCPM-o

MiniCPM-o 2.6: An MLLM capable of delivering visual, voice, and multimodal interactions at GPT-4o level on mobile devices.

MiniCPM-o 2.6 is the latest multimodal large language model (MLLM) from the OpenBMB team. It has 8 billion parameters and supports high-quality visual, voice, and multimodal interaction on edge devices such as smartphones. The model is built on SigLip-400M, Whisper-medium-300M, ChatTTS-200M, and Qwen2.5-7B, is trained end to end, and performs comparably to GPT-4o-202405. Its main advantages are leading visual capabilities, advanced voice functionality, strong multimodal streaming, impressive OCR performance, and high efficiency. The model is open source and free to use for academic research and commercial purposes.
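As a rough sanity check on the 8-billion-parameter figure, the nominal sizes in the component names can be added up. This is only a sketch: the sizes are the rounded values read off the name suffixes (400M, 300M, 200M, 7B), the dictionary and function names are hypothetical, and any extra modules such as projection layers are not counted, so the real total may differ slightly.

```python
# Nominal sizes in millions of parameters, taken from the component name
# suffixes in the description above (rounded values, assumed, not measured).
COMPONENT_PARAMS_M = {
    "SigLip-400M": 400,
    "Whisper-medium-300M": 300,
    "ChatTTS-200M": 200,
    "Qwen2.5-7B": 7000,
}


def total_params_billions(components: dict[str, int]) -> float:
    """Sum nominal component sizes (in millions) and convert to billions."""
    return sum(components.values()) / 1000


if __name__ == "__main__":
    total = total_params_billions(COMPONENT_PARAMS_M)
    print(f"Approximate total: {total:.1f}B parameters")
```

The sum of the nominal sizes lands just under 8 billion, which is consistent with the rounded figure quoted for the full model.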

MiniCPM-o Visit Over Time

Monthly Visits: 490881889
Bounce Rate: 37.92%
Pages per Visit: 5.6
Visit Duration: 00:06:18

MiniCPM-o Alternatives