LLaVA-3b

LLaVA-3b is a vision-language model fine-tuned from Dolphin 2.6 Phi in the LLaVA fashion, using the SigLIP 400M vision tower. It represents each image with multiple image tokens and uses the output of the last layer of the vision encoder.
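
The two architectural points above (several image embeddings per picture, features taken from the final encoder layer) can be illustrated with a toy sketch. This is not the model's actual code: the function names, the mean-pooling step, the single linear projection, and all sizes (including five embeddings per image) are assumptions chosen only to make the idea concrete.

```python
import random

def last_layer_features(hidden_states):
    """Return the final layer's output from a list of per-layer outputs."""
    return hidden_states[-1]

def mean_pool(patches):
    """Average patch vectors into one vector (an assumption of this sketch)."""
    n = len(patches)
    return [sum(col) / n for col in zip(*patches)]

def project_to_tokens(vec, weight, num_tokens, lm_dim):
    """Map one vision vector to num_tokens embeddings of size lm_dim."""
    # One linear layer producing num_tokens * lm_dim values, then split into rows.
    flat = [sum(w * x for w, x in zip(row, vec)) for row in weight]
    return [flat[i * lm_dim:(i + 1) * lm_dim] for i in range(num_tokens)]

# Toy sizes, hypothetical and far smaller than any real model.
VISION_DIM, LM_DIM, NUM_TOKENS, NUM_PATCHES, NUM_LAYERS = 8, 4, 5, 6, 3

rng = random.Random(0)
# Stand-in for the vision encoder: one [patches x dim] matrix per layer.
hidden_states = [
    [[rng.uniform(-1, 1) for _ in range(VISION_DIM)] for _ in range(NUM_PATCHES)]
    for _ in range(NUM_LAYERS)
]
# Stand-in for the projector weights.
weight = [
    [rng.uniform(-1, 1) for _ in range(VISION_DIM)]
    for _ in range(NUM_TOKENS * LM_DIM)
]

pooled = mean_pool(last_layer_features(hidden_states))
image_tokens = project_to_tokens(pooled, weight, NUM_TOKENS, LM_DIM)
print(len(image_tokens), len(image_tokens[0]))  # 5 4
```

In this sketch the language model would receive `image_tokens` as a short sequence of embeddings in place of a single image embedding.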

The model is based on Phi-2 and is therefore subject to the Microsoft Research license, which prohibits commercial use. Thanks to ML Collective for providing compute credits.

LLaVA-3b Visit Over Time

Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57

LLaVA-3b Alternatives