InternVL2_5-4B
A multimodal large language model that integrates visual and language understanding.
CommonProductImageMultimodalLarge Language Model
InternVL2_5-4B is an advanced multimodal large language model (MLLM) that maintains the core model architecture of InternVL 2.0 while significantly enhancing training and testing strategies and data quality. The model excels in handling tasks from image and text to text, particularly in multimodal reasoning, mathematical problem solving, OCR, and chart and document comprehension. As an open-source model, it provides researchers and developers with powerful tools to explore and build intelligent applications based on visual and linguistic elements.
InternVL2_5-4B Visit Over Time
Monthly Visits
20899836
Bounce Rate
46.04%
Page per Visit
5.2
Visit Duration
00:04:57