InternVL2_5-4B

A multimodal large language model that integrates visual and language understanding.

Common Product, Image, Multimodal, Large Language Model
InternVL2_5-4B is an advanced multimodal large language model (MLLM) that keeps the core model architecture of InternVL 2.0 while significantly improving its training and testing strategies and its data quality. It handles image-text-to-text tasks and is particularly strong at multimodal reasoning, mathematical problem solving, OCR, and chart and document comprehension. As an open-source model, it gives researchers and developers a foundation for exploring and building applications that combine vision and language.
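As an illustration of the image-and-text-to-text workflow described above, here is a minimal sketch in Python. It rests on several assumptions not stated on this page: that the model is published on Hugging Face under the repository id `OpenGVLab/InternVL2_5-4B`, that it loads through the `transformers` library with `trust_remote_code=True`, and that the repository's custom code exposes a `chat` method and expects an `<image>` placeholder in the question. The helper names `build_question`, `load_model`, and `ask` are hypothetical and exist only for this example.

```python
# Assumed Hugging Face repository id; not stated on this page.
MODEL_ID = "OpenGVLab/InternVL2_5-4B"


def build_question(text: str, has_image: bool = True) -> str:
    """Prefix the question with an `<image>` placeholder when an image is attached.

    The placeholder convention is an assumption about the model's prompt format.
    """
    return f"<image>\n{text}" if has_image else text


def load_model(model_id: str = MODEL_ID):
    """Load the model and tokenizer (downloads weights on first use)."""
    # Imported locally so build_question works without these packages installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    model = AutoModel.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,  # the model class is assumed to ship with the repo
    ).eval()
    tokenizer = AutoTokenizer.from_pretrained(
        model_id, trust_remote_code=True, use_fast=False
    )
    return model, tokenizer


def ask(model, tokenizer, pixel_values, text: str, max_new_tokens: int = 256) -> str:
    """Send one question to the model; pass pixel_values=None for text-only input."""
    # `chat` is assumed to be provided by the repository's custom model code.
    question = build_question(text, has_image=pixel_values is not None)
    generation_config = dict(max_new_tokens=max_new_tokens, do_sample=False)
    return model.chat(tokenizer, pixel_values, question, generation_config)
```

A typical call would be `model, tokenizer = load_model()` followed by `ask(model, tokenizer, pixel_values, "What does this chart show?")`, where `pixel_values` is an image tensor preprocessed as the model's own documentation prescribes; that preprocessing step is left out here.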

InternVL2_5-4B Visit Over Time

Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57


InternVL2_5-4B Alternatives