
Multimodal-Document-Analysis-and-Query-Retrieval

Public

This project performs multimodal document analysis and query retrieval: it downloads PDFs, converts each page to an image, indexes the page images for semantic search, and analyzes the retrieved images with vision-language models such as Qwen2-VL and BLIP-2.
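The retrieval step of such a pipeline can be illustrated with a small sketch. It is not the project's code: the `Page` and `PageIndex` names are hypothetical, and the short hand-written vectors stand in for embeddings that a real system would obtain from a vision encoder applied to each page image. Plain cosine similarity is used here as a simple substitute for whatever semantic index the project actually relies on.

```python
import math
from dataclasses import dataclass


@dataclass
class Page:
    """One PDF page rendered to an image (hypothetical structure)."""
    doc_id: str
    page_number: int
    embedding: list  # placeholder for a vector produced by a vision encoder


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class PageIndex:
    """Minimal in-memory index over page embeddings (illustrative only)."""

    def __init__(self):
        self.pages = []

    def add(self, page):
        self.pages.append(page)

    def search(self, query_embedding, k=1):
        # Rank every page by similarity to the query and keep the top k.
        ranked = sorted(
            self.pages,
            key=lambda p: cosine(query_embedding, p.embedding),
            reverse=True,
        )
        return ranked[:k]


# Toy data: three pages with made-up 3-dimensional embeddings.
index = PageIndex()
index.add(Page("report.pdf", 1, [1.0, 0.0, 0.0]))
index.add(Page("report.pdf", 2, [0.0, 1.0, 0.0]))
index.add(Page("manual.pdf", 1, [0.0, 0.0, 1.0]))

# A query embedding closest to the first page of report.pdf.
top = index.search([0.9, 0.1, 0.0], k=1)
```

In a full pipeline, the pages returned by `search` would then be passed, together with the user's question, to a vision-language model for analysis.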

Created: 2025-01-11T21:00:16
Updated: 2025-01-22T06:22:12
Stars: 1
Stars increase: 0
