InternViT-6B-448px-V2_5

An enhanced visual model based on InternViT-6B-448px-V1-5

Common Product, Image, Visual Model, Feature Extraction
InternViT-6B-448px-V2_5 is a visual model built on InternViT-6B-448px-V1-5. It applies ViT incremental learning with a next-token prediction (NTP) loss (Stage 1.5) to strengthen the visual encoder's ability to extract visual features. It performs particularly well in domains that are underrepresented in large-scale web datasets, such as multilingual OCR data and mathematical charts. The model is part of the InternVL 2.5 series, which keeps the same 'ViT-MLP-LLM' architecture as its predecessor and pairs the newly incrementally pretrained InternViT with various pretrained LLMs, including InternLM 2.5 and Qwen 2.5, through a randomly initialized MLP projector.
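The 'ViT-MLP-LLM' data flow described above can be sketched with toy dimensions. This is only an illustration, not the InternVL 2.5 implementation: the patch size of 14, the toy feature widths, the ReLU activation, and every function name below are assumptions made for the example. The source states only the 448px input size and that the MLP projector is randomly initialized.

```python
import random


def num_patch_tokens(image_size: int, patch_size: int) -> int:
    """Count the non-overlapping square patches a ViT sees for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def init_linear(in_dim: int, out_dim: int, rng: random.Random):
    """Randomly initialized weights (out_dim rows of in_dim values) and a zero bias."""
    weights = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, layer):
    """Apply one dense layer to a single feature vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def mlp_projector(tokens, fc1, fc2):
    """Two-layer MLP that maps each visual token into the LLM embedding width.
    ReLU is used here for simplicity (assumption)."""
    projected = []
    for token in tokens:
        hidden = [max(0.0, v) for v in linear(token, fc1)]
        projected.append(linear(hidden, fc2))
    return projected


rng = random.Random(0)
VIT_DIM, LLM_DIM = 8, 12  # toy widths (assumption); the real models are far wider

# Assumed patch size of 14: a 448x448 image gives a 32x32 grid of patches.
tokens_per_image = num_patch_tokens(448, 14)

# Stand-in for the visual encoder output; only 4 tokens to keep the demo small.
visual_tokens = [[rng.gauss(0.0, 1.0) for _ in range(VIT_DIM)] for _ in range(4)]

fc1 = init_linear(VIT_DIM, LLM_DIM, rng)
fc2 = init_linear(LLM_DIM, LLM_DIM, rng)
projected_tokens = mlp_projector(visual_tokens, fc1, fc2)

# Stand-in for text token embeddings produced by the LLM's embedding table.
text_embeddings = [[0.0] * LLM_DIM for _ in range(3)]

# The LLM consumes projected visual tokens and text embeddings as one sequence.
llm_input = projected_tokens + text_embeddings

print(tokens_per_image, len(llm_input), len(llm_input[0]))
```

The point of the sketch is the shape contract: the projector is the only component that has to translate between the visual encoder's feature width and the LLM's embedding width, which is why it can be initialized from scratch while the encoder and the LLM start from pretrained weights.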
InternViT-6B-448px-V2_5 Visit Over Time

Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57
