Florence-2-large-ft
An advanced vision foundation model that supports a variety of visual and vision-language tasks.
Common Product | Image | Image Processing | Natural Language Processing
Florence-2-large-ft, developed by Microsoft, is a vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. Given an image and a short text prompt, it can perform tasks such as image captioning, object detection, and segmentation. It draws on the FLD-5B dataset, which contains 5.4 billion annotations across 126 million images, enabling multi-task learning. Its sequence-to-sequence architecture performs well in both zero-shot and fine-tuned settings, making it a competitive vision foundation model.
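To make the prompt-based approach concrete, here is a minimal Python sketch of how one task might be selected and run through the Hugging Face `transformers` library. The task tokens and the `microsoft/Florence-2-large-ft` model id follow the public model card as best understood here; the helper names (`build_prompt`, `run_task`), the friendly task names in `TASK_PROMPTS`, and the generation settings are illustrative assumptions, not part of the model's API.

```python
# Sketch only: helper names and the task-name mapping are hypothetical.
# Task tokens are taken from the public Florence-2 model card.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "ocr": "<OCR>",
    "phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
    "referring_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
}


def build_prompt(task: str, text_input: str = "") -> str:
    """Return the text prompt for a task: the task token plus optional extra text."""
    return TASK_PROMPTS[task] + text_input


def run_task(image_path: str, task: str, text_input: str = ""):
    """Run one task on one image (downloads the model on first use)."""
    # Imported lazily so the prompt helper above works without these packages.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    model_id = "microsoft/Florence-2-large-ft"
    # The model ships custom code, so trust_remote_code=True is required.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    prompt = build_prompt(task, text_input)
    inputs = processor(text=prompt, images=image, return_tensors="pt")

    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,  # assumed setting; tune for your use case
        do_sample=False,
        num_beams=3,
    )
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

    # Convert the raw generated string into a task-specific structure
    # (for example, bounding boxes and labels for object detection).
    return processor.post_process_generation(
        text, task=TASK_PROMPTS[task], image_size=(image.width, image.height)
    )
```

Calling `run_task("photo.jpg", "object_detection")` (with a hypothetical local image) would send the `<OD>` prompt, while `run_task("photo.jpg", "phrase_grounding", "a green car")` appends the phrase to the task token. Switching tasks changes only the prompt string; the model and the call sequence stay the same.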
Florence-2-large-ft Visits Over Time
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32