VisualLanguageModel

Public

A custom Vision-Language Model (VLM) built from scratch. It uses SigLIP for contrastive image-text learning and a ViT-based image encoder to generate image captions and semantic descriptions.
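
The listing does not show the repository's code, so the following is only a minimal sketch of the pairwise sigmoid loss that SigLIP-style contrastive training is generally based on, not the project's actual implementation. The function names (`siglip_loss`, `l2_normalize`, `log_sigmoid`) are hypothetical, and the default `temperature` and `bias` values are assumptions chosen for illustration. It uses only the Python standard library, so the embeddings are plain lists of floats rather than tensors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def siglip_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Pairwise sigmoid loss over a batch of image and text embeddings.

    Pair (i, j) is a positive when i == j (an image with its own caption)
    and a negative otherwise. Each pair is scored as an independent binary
    classification, so no softmax over the batch is needed.
    The temperature and bias defaults are illustrative assumptions.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(images):
        for j, txt in enumerate(texts):
            similarity = sum(a * b for a, b in zip(img, txt))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    # Average over the number of images in the batch.
    return -total / len(images)


if __name__ == "__main__":
    aligned = siglip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
    swapped = siglip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
    print(f"aligned pairs: {aligned:.4f}, swapped pairs: {swapped:.4f}")
```

In this toy batch the loss is lower when each image embedding points the same way as its own caption embedding than when the captions are swapped, which is the behavior contrastive pretraining relies on.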

Created: 2025-03-26T14:32:59
Updated: 2025-04-06T02:29:57
Stars: 0
Stars Increase: 0