InternVL2_5-1B-MPO
A multimodal large language model that enhances integrated understanding of visual and language data.
CommonProduct · Productivity · Multimodal · Large Language Model
InternVL2_5-1B-MPO is a multimodal large language model (MLLM) built on InternVL2.5 and Mixed Preference Optimization (MPO), delivering strong overall performance. The series integrates an incrementally pre-trained InternViT vision encoder with various pre-trained large language models (LLMs), including InternLM 2.5 and Qwen 2.5, connected through a randomly initialized MLP projector. InternVL2.5-MPO retains the ‘ViT-MLP-LLM’ paradigm of InternVL 2.5 and its predecessors while adding support for multi-image and video inputs. The model handles a variety of visual-language tasks, including image captioning and visual question answering.
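For image captioning or visual question answering, the checkpoint can be driven from Python. The snippet below is a minimal sketch, assuming the Hugging Face repo id OpenGVLab/InternVL2_5-1B-MPO, the chat() helper exposed by the repository's remote code, and a simplified single-tile 448x448 preprocessing in place of the model card's dynamic-tiling helper; the image path is a placeholder.

```python
# Minimal inference sketch (assumptions: repo id, chat() interface from
# trust_remote_code, single-tile preprocessing instead of dynamic tiling).
import torch
import torchvision.transforms as T
from PIL import Image
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "OpenGVLab/InternVL2_5-1B-MPO"  # assumed Hugging Face repo id

# Load the ViT-MLP-LLM wrapper defined by the repository's remote code.
model = AutoModel.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# Preprocess one image to a single 448x448 tile with ImageNet statistics.
transform = T.Compose([
    T.Resize((448, 448), interpolation=T.InterpolationMode.BICUBIC),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
image = Image.open("example.jpg").convert("RGB")  # placeholder image path
pixel_values = transform(image).unsqueeze(0).to(torch.bfloat16).cuda()

# Visual question answering: the <image> token marks where the vision
# features are inserted into the prompt.
question = "<image>\nDescribe this image in one sentence."
response = model.chat(
    tokenizer, pixel_values, question, dict(max_new_tokens=256, do_sample=False)
)
print(response)
```

Multi-image and video inputs follow the same pattern: the tiles of all frames are concatenated into a single pixel_values tensor and referenced by additional <image> placeholders in the prompt.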
InternVL2_5-1B-MPO Visit Over Time
Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57