Florence-2-base-ft

An advanced visual foundation model supporting various visual and vision-language tasks

Common Product | Image | Image Processing | Vision-Language Model
Florence-2 is a visual foundation model developed by Microsoft that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. Given a simple text prompt, it can perform tasks such as image captioning, object detection, and segmentation. It is trained on the FLD-5B dataset, which contains 5.4 billion annotations across 126 million images and is designed to support multi-task learning. Its sequence-to-sequence architecture performs well in both zero-shot and fine-tuned settings, making it competitive with other visual foundation models.
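
As a rough illustration of the prompt-based approach described above, the sketch below maps a task name to a task token and, optionally, runs the model through the Hugging Face transformers library. The task tokens and the processor/model calls follow the pattern on the public Florence-2 model card as best recalled, so they may differ between library versions. The dictionary keys and the function names `build_prompt` and `run_task` are hypothetical names chosen for this example, and the repository id "microsoft/Florence-2-base-ft" is an assumption.

```python
from typing import Optional

# Task tokens as shown on the public Florence-2 model card (assumed unchanged).
# The dictionary keys are illustrative names, not part of any official API.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "object_detection": "<OD>",
    "segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
}


def build_prompt(task: str, text_input: Optional[str] = None) -> str:
    """Return the text prompt for a task: the task token plus optional extra text."""
    token = TASK_PROMPTS[task]
    return token if text_input is None else token + text_input


def run_task(image, task: str, text_input: Optional[str] = None,
             model_id: str = "microsoft/Florence-2-base-ft"):
    """Run one task on a PIL image. Needs transformers, torch, and network access."""
    # Imported lazily so build_prompt works without these packages installed.
    from transformers import AutoModelForCausalLM, AutoProcessor

    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    prompt = build_prompt(task, text_input)
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        do_sample=False,
        num_beams=3,
    )
    # Keep special tokens: the post-processing step parses location tokens from them.
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        text, task=TASK_PROMPTS[task], image_size=(image.width, image.height)
    )


if __name__ == "__main__":
    print(build_prompt("object_detection"))
    print(build_prompt("segmentation", "a green car"))
```

Only the prompt changes between tasks; the same model weights and the same generate call serve captioning, detection, and segmentation, which is what lets a single sequence-to-sequence model cover all of them.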

Florence-2-base-ft Visit Over Time

Monthly Visits: 17,104,189
Bounce Rate: 44.67%
Pages per Visit: 5.5
Visit Duration: 00:05:49

Florence-2-base-ft Alternatives