Florence-2-large

An advanced vision foundation model that supports a range of vision and vision-language tasks

Common Product · Image · Visual Model · Multi-task Learning
Florence-2-large, developed by Microsoft, is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. Given a simple text prompt, the model performs tasks such as image captioning, object detection, and segmentation. It is trained on the FLD-5B dataset, which contains 5.4 billion annotations across 126 million images, giving it broad multi-task coverage. Its sequence-to-sequence architecture lets it perform well in both zero-shot and fine-tuned settings, making it a competitive vision foundation model.
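To make the prompt-based approach concrete, here is a minimal sketch in Python. It assumes the Hugging Face `transformers` interface and the task tokens shown on the public `microsoft/Florence-2-large` model card; the `TASK_PROMPTS` table and the `build_prompt` and `run_task` helpers are hypothetical names introduced only for illustration, and the generation settings are example values rather than recommended ones.

```python
# Task tokens as listed on the public Florence-2 model card (assumed here).
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
    "ocr": "<OCR>",
}


def build_prompt(task, text_input=""):
    """Hypothetical helper: map a task name to its prompt string.

    Some tasks (for example referring-expression segmentation) take extra
    text, which is appended directly after the task token.
    """
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    return TASK_PROMPTS[task] + text_input


def run_task(image, task, text_input="", model_id="microsoft/Florence-2-large"):
    """Hypothetical helper: run one task on a PIL image and return parsed output.

    The heavy imports live inside the function so the prompt helper above
    can be used without downloading the model.
    """
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    prompt = build_prompt(task, text_input)
    inputs = processor(text=prompt, images=image, return_tensors="pt")

    # The same sequence-to-sequence decoder handles every task; only the
    # prompt changes.
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,  # example value
        num_beams=3,          # example value
    )
    generated_text = processor.batch_decode(
        generated_ids, skip_special_tokens=False
    )[0]

    # Convert the raw token string into task-specific structured output.
    return processor.post_process_generation(
        generated_text,
        task=TASK_PROMPTS[task],
        image_size=(image.width, image.height),
    )
```

With an image loaded through Pillow, `run_task(image, "object_detection")` would request detections and `run_task(image, "segmentation", "a green car")` would request a mask for the described region. Switching tasks changes only the prompt, not the model.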

Florence-2-large Visits Over Time

Monthly Visits: 17,788,201
Bounce Rate: 44.87%
Pages per Visit: 5.4
Visit Duration: 00:05:32

Florence-2-large Alternatives