ml-ferret
End-to-end MLLM, enabling precise referencing and localization.
ml-ferret is an end-to-end multimodal large language model (MLLM) that accepts region references in several forms, such as points, boxes, and free-form shapes, and grounds its responses by localizing the objects it mentions in the image. It combines a hybrid region representation with a spatial-aware visual sampler to support fine-grained, open-vocabulary referring and grounding. The release also includes GRIT, a Ground-and-Refer Instruction-Tuning dataset of approximately 1.1 million samples, and the Ferret-Bench evaluation benchmark.
ml-ferret Visit Over Time
Monthly Visits: 488,643,166
Bounce Rate: 37.28%
Pages per Visit: 5.7
Visit Duration: 00:06:37