blip-vision-language
BLIP is a Vision-Language Pre-training (VLP) framework from Salesforce, designed to perform well on both understanding and generation tasks. Unlike earlier VLP models, BLIP makes use of noisy web data through bootstrapping: it generates synthetic captions and filters out noisy ones, which improves performance.
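The bootstrapping idea can be illustrated with a small, self-contained sketch. This is not BLIP's actual code or API: the names `Pair`, `bootstrap`, `toy_captioner`, `toy_match_score`, the `0.5` threshold, and the toy data are all hypothetical stand-ins, and the word-overlap score only imitates what a learned image-text matching model would do. The sketch assumes that both the original web caption and the synthetic caption are scored, and that only captions judged to match the image are kept.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Pair:
    image_id: str
    caption: str
    source: str  # "web" or "synthetic"


def bootstrap(
    web_pairs: List[Tuple[str, str]],
    captioner: Callable[[str], str],
    match_score: Callable[[str, str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not a value from BLIP
) -> List[Pair]:
    """Add a synthetic caption per image, then keep only captions that match."""
    kept: List[Pair] = []
    for image_id, web_caption in web_pairs:
        candidates = [
            Pair(image_id, web_caption, "web"),
            Pair(image_id, captioner(image_id), "synthetic"),
        ]
        # The filter step: drop any caption scored below the threshold.
        kept.extend(
            p for p in candidates
            if match_score(p.image_id, p.caption) >= threshold
        )
    return kept


# Toy stand-ins: each "image" is represented by the set of objects it shows.
TOY_CONTENT = {"img1": {"dog", "park"}, "img2": {"cake", "table"}}


def toy_captioner(image_id: str) -> str:
    # Stand-in for a learned captioning model.
    return "a photo of " + " and ".join(sorted(TOY_CONTENT[image_id]))


def toy_match_score(image_id: str, caption: str) -> float:
    # Stand-in for a learned image-text matching model:
    # fraction of the image's objects that the caption mentions.
    words = set(caption.lower().split())
    objects = TOY_CONTENT[image_id]
    return len(objects & words) / len(objects)


web_pairs = [
    ("img1", "my dog at the park"),          # relevant web caption
    ("img2", "click here for best deals"),   # noisy web caption
]
cleaned = bootstrap(web_pairs, toy_captioner, toy_match_score)

for pair in cleaned:
    print(pair.image_id, pair.source, "-", pair.caption)
```

In this toy run the noisy web caption for `img2` is dropped and replaced in effect by its synthetic caption, while `img1` keeps both its web and synthetic captions. The cleaned set of pairs is what would then be used for further training.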