DeepSeek-VL2
An advanced multimodal understanding model that integrates visual and linguistic capabilities.
DeepSeek-VL2 is a series of large Mixture-of-Experts (MoE) vision-language models that significantly improves on its predecessor, DeepSeek-VL. The series performs strongly on tasks such as visual question answering, optical character recognition, document/table/chart understanding, and visual grounding. It comes in three variants: DeepSeek-VL2-Tiny, DeepSeek-VL2-Small, and DeepSeek-VL2, with 1.0B, 2.8B, and 4.5B activated parameters, respectively. DeepSeek-VL2 achieves competitive or state-of-the-art performance with similar or fewer activated parameters than existing open-source dense and MoE-based models.
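The "activated parameters" figures above count only the weights used for a given token, which in an MoE model is smaller than the total parameter count. A minimal sketch of that arithmetic follows, assuming a simplified top-k routed MoE; the function name `moe_param_counts` and all numbers are hypothetical and are not DeepSeek-VL2's actual configuration.

```python
def moe_param_counts(shared_params, params_per_expert, num_experts, top_k):
    """Return (total, active) parameter counts for a simplified top-k MoE.

    shared_params: weights every token passes through (attention, embeddings, ...)
    params_per_expert: weights in one expert feed-forward block
    num_experts: experts stored in the model
    top_k: experts actually selected per token by the router
    """
    if not 0 < top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")
    total = shared_params + num_experts * params_per_expert
    active = shared_params + top_k * params_per_expert
    return total, active


# Hypothetical numbers for illustration only (NOT a real DeepSeek-VL2 config).
total, active = moe_param_counts(
    shared_params=200_000_000,
    params_per_expert=50_000_000,
    num_experts=8,
    top_k=2,
)
print(f"total={total:,} active={active:,}")
```

In this toy setup the model stores far more weights than it uses per token, which is why MoE models are usually compared by activated rather than total parameters.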
DeepSeek-VL2 Visits Over Time
Monthly Visits: 502,571,820
Bounce Rate: 37.10%
Pages per Visit: 5.9
Visit Duration: 00:06:29