Phi-3-vision-128k-instruct
Microsoft's lightweight multimodal model, trained with a focus on high-quality, reasoning-intensive text and vision data.
Premium, New, Product, Productivity, Multimodal, High-Quality
Phi-3 Vision is a lightweight, state-of-the-art open multimodal model trained on datasets that combine synthetic data with curated publicly available websites, with an emphasis on high-quality, reasoning-intensive content for both text and vision. It belongs to the Phi-3 family of models, and this multimodal version supports a context length of 128K tokens. After initial training, the model went through supervised fine-tuning and direct preference optimization to improve instruction following and strengthen its safety behavior.
Phi-3-vision-128k-instruct Visits Over Time
Monthly Visits: 385,115
Bounce Rate: 42.09%
Pages per Visit: 6.9
Visit Duration: 00:06:50