Phi-3-vision-128k-instruct

Microsoft's lightweight multimodal model, trained on high-quality, reasoning-intensive text and vision data.

Phi-3 Vision is a lightweight, state-of-the-art open multimodal model trained on datasets that combine synthetic data with curated, publicly available web content, with an emphasis on high-quality, reasoning-intensive data for both text and vision. It belongs to the Phi-3 model family, and this multimodal version supports a context length of 128K tokens. After training, the model underwent supervised fine-tuning and direct preference optimization to improve instruction following and safety.
Phi-3-vision-128k-instruct Visits Over Time

Monthly Visits: 412,492
Bounce Rate: 40.71%
Pages per Visit: 6.4
Visit Duration: 00:06:49

Phi-3-vision-128k-instruct Alternatives