vdr-2b-multi-v1
A multilingual embedding model for visual document retrieval.
Common Product, Image, Multilingual, Visual Document Retrieval
vdr-2b-multi-v1 is a multilingual embedding model released on Hugging Face and designed specifically for visual document retrieval. It encodes screenshots of document pages into dense vector representations, so visually rich multilingual documents can be searched and queried without OCR or a data extraction pipeline. The model is built on MrLight/dse-qwen2-2b-mrl-v1 and trained on a purpose-built multilingual dataset of query-image pairs; it is an upgraded version of mcdse-2b-v1 with improved performance. It supports Italian, Spanish, English, French, and German, and is accompanied by an open-source multilingual synthetic training dataset of 500,000 samples. The model is characterized by low VRAM usage and fast inference, and it performs well in cross-language retrieval, where the query and the document are in different languages.
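Retrieval over dense vectors of this kind usually comes down to comparing a query embedding with each page embedding and sorting by similarity. The sketch below illustrates only that ranking step, assuming the embeddings have already been computed. The three-dimensional vectors, the page IDs, and the helper names cosine_similarity and rank_pages are hypothetical placeholders for illustration, not output or API of vdr-2b-multi-v1.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_pages(query_vec, page_vecs):
    """Return (page_id, score) pairs sorted from best to worst match."""
    scored = [
        (page_id, cosine_similarity(query_vec, vec))
        for page_id, vec in page_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy embeddings; a real embedding model emits far longer vectors.
page_vecs = {
    "report_it_p3": [0.9, 0.1, 0.0],
    "manual_de_p12": [0.1, 0.8, 0.2],
    "slides_fr_p1": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # placeholder embedding of a text query

ranking = rank_pages(query_vec, page_vecs)
for page_id, score in ranking:
    print(f"{page_id}: {score:.3f}")
```

Because the query and the page screenshots share one vector space, the same ranking step applies whether or not the query language matches the language of the document.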
vdr-2b-multi-v1 Visits Over Time
Monthly Visits: 21,315,886
Bounce Rate: 45.50%
Pages per Visit: 5.2
Visit Duration: 00:05:02