InternVL2_5-1B-MPO

A multimodal large language model that enhances integrated understanding of visual and language data.

Common Product, Productivity, Multimodal, Large Language Model
InternVL2_5-1B-MPO is a multimodal large language model (MLLM) built on InternVL2.5 and trained with Mixed Preference Optimization (MPO); it is reported to achieve strong overall performance. The InternVL2.5-MPO series pairs an incrementally pre-trained InternViT vision encoder with pre-trained large language models (LLMs), including InternLM 2.5 and Qwen 2.5, connected through a randomly initialized MLP projector. InternVL2.5-MPO retains the ‘ViT-MLP-LLM’ paradigm of InternVL 2.5 and its predecessors and adds support for multi-image and video input. It handles a range of vision-language tasks, including image captioning and visual question answering.
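
As a rough illustration of the ‘ViT-MLP-LLM’ data flow described above, the following pure-Python sketch turns an image into patch features, maps them through a small randomly initialized MLP projector into the LLM's embedding width, and prepends them to the text embeddings. It is a toy under stated assumptions, not InternVL's code: the function names (`patchify`, `mlp_projector`, `build_llm_input`), the dimensions, and the ReLU activation are all hypothetical, and `patchify` only stands in for the InternViT encoder, which in the real model applies transformer layers.

```python
import random

def patchify(image, patch):
    """Stand-in for the vision encoder: split an H x W grid into flattened patches."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def init_linear(rng, n_in, n_out):
    """Randomly initialized weights (list of output rows) and a zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(vec, weights, bias):
    """Compute y = W x + b."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]

def mlp_projector(vec, layer1, layer2):
    """Two-layer MLP mapping a visual feature to the LLM embedding width.
    ReLU is an assumption chosen for simplicity."""
    hidden = [max(0.0, h) for h in linear(vec, *layer1)]
    return linear(hidden, *layer2)

def build_llm_input(image, text_embeddings, patch, layer1, layer2):
    """Project every patch, then place visual tokens ahead of the text tokens."""
    visual_tokens = [mlp_projector(p, layer1, layer2) for p in patchify(image, patch)]
    return visual_tokens + text_embeddings

# Hypothetical toy sizes: 2x2 patches give 4-dimensional visual features.
PATCH, VIT_DIM, HIDDEN_DIM, LLM_DIM = 2, 4, 8, 6
rng = random.Random(0)
layer1 = init_linear(rng, VIT_DIM, HIDDEN_DIM)
layer2 = init_linear(rng, HIDDEN_DIM, LLM_DIM)

image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # 4x4 "image"
text_embeddings = [[0.0] * LLM_DIM for _ in range(3)]             # 3 text tokens

sequence = build_llm_input(image, text_embeddings, PATCH, layer1, layer2)
print(len(sequence), len(sequence[0]))  # 7 6
```

The 4x4 image yields four visual tokens, so the LLM would receive a sequence of seven vectors, each of width `LLM_DIM`. The point of the projector is only to make visual features dimensionally compatible with the language model's token embeddings.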

InternVL2_5-1B-MPO Visit Over Time

Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57
