Google Releases Lightweight PaLI-3 Visual Language Model Achieving SOTA Performance
站长之家
Google has released PaLI-3, a compact visual-language model that achieves state-of-the-art (SOTA) performance. Built on a contrastively pre-trained Vision Transformer (ViT) image encoder, it explores the potential of this training recipe and reaches SOTA levels in multilingual cross-modal retrieval. By integrating natural language understanding with image recognition, PaLI-3 has become a significant force in AI innovation, and its SigLIP-based contrastive pre-training opens a new chapter for multilingual cross-modal retrieval. Although the full model has not been open-sourced, Google has released the multilingual and English SigLIP models, giving researchers an opportunity to experiment.
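For readers unfamiliar with the SigLIP objective mentioned above, the sketch below illustrates the pairwise sigmoid contrastive loss it is built on. This is a minimal illustration only: the function name and the default temperature and bias values are assumptions for demonstration, not Google's released training code, and in real training the temperature and bias are learnable scalars.

```python
import numpy as np

def siglip_loss(img_emb, txt_emb, temperature=10.0, bias=-10.0):
    """Pairwise sigmoid contrastive loss in the style of SigLIP (illustrative sketch).

    img_emb, txt_emb: (N, D) L2-normalised image and text embeddings.
    temperature and bias are learnable scalars in practice; the defaults here
    are placeholders for illustration.
    """
    n = img_emb.shape[0]
    logits = temperature * img_emb @ txt_emb.T + bias   # (N, N) pairwise similarities
    labels = 2.0 * np.eye(n) - 1.0                      # +1 for matching image-text pairs, -1 otherwise
    # log(sigmoid(x)) computed stably as -logaddexp(0, -x); average the negative log-likelihood per example
    return np.logaddexp(0.0, -labels * logits).sum() / n

# Toy usage with random, L2-normalised embeddings
rng = np.random.default_rng(0)
img = rng.normal(size=(8, 128)); img /= np.linalg.norm(img, axis=1, keepdims=True)
txt = rng.normal(size=(8, 128)); txt /= np.linalg.norm(txt, axis=1, keepdims=True)
print(siglip_loss(img, txt))
```

Unlike the softmax loss used by CLIP-style models, this sigmoid formulation scores every image-text pair independently, which is part of what makes the approach attractive for scaling contrastive pre-training.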