Multimodal-Document-Analysis-and-Query-Retrieval
This project performs multimodal document analysis and query retrieval: it downloads PDFs, converts their pages to images, indexes those images for semantic search, and analyzes the retrieved images with vision-language models such as Qwen2VL and Blip2.
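
As a rough illustration of the retrieval step only, the sketch below ranks indexed pages against a query by cosine similarity. It is a minimal sketch under stated assumptions, not the project's actual code: `PageEntry`, `retrieve_top_pages`, the file names, and the three-dimensional embeddings are all hypothetical stand-ins, and a real pipeline would obtain embeddings from a retrieval model rather than hard-coding them.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class PageEntry:
    """One indexed page image (hypothetical structure for illustration)."""
    doc_id: str
    page_number: int
    embedding: List[float]  # toy vector; a real index would store model output


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_pages(index: List[PageEntry],
                       query_embedding: List[float],
                       k: int = 1) -> List[PageEntry]:
    """Return the k pages whose embeddings are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(entry.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:k]


# Toy index: made-up embeddings standing in for encoded page images.
index = [
    PageEntry("report.pdf", 1, [0.9, 0.1, 0.0]),
    PageEntry("report.pdf", 2, [0.1, 0.8, 0.1]),
    PageEntry("manual.pdf", 1, [0.0, 0.2, 0.9]),
]

# Made-up query embedding; the best-matching pages would then be passed
# to a vision-language model for analysis.
top = retrieve_top_pages(index, [0.0, 0.1, 1.0], k=2)
for entry in top:
    print(entry.doc_id, entry.page_number)
```

In this sketch the retrieved `PageEntry` objects only identify which page images to load; handing those images to a model such as Qwen2VL or Blip2 is a separate step that is not shown here.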