LLaVA-3b
LLaVA-3b is a model fine-tuned from Dolphin 2.6 Phi in the LLaVA manner, using the SigLIP 400M vision tower. It uses multiple image tokens and takes its visual features from the last layer of the vision encoder.
LLaVA-3b is based on Phi-2 and is subject to the Microsoft Research license, which prohibits commercial use. Thanks to ML Collective for providing compute credits.
LLaVA-3b Visits Over Time
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32