VisRAG
A retrieval-augmented generation pipeline built on visual language models.
Tags: Common Product, Image, Visual Language Model, Retrieval-Augmented Generation
VisRAG is a retrieval-augmented generation (RAG) pipeline built on visual language models (VLMs). Unlike traditional text-based RAG, which first parses documents into text, VisRAG uses a VLM to embed each document directly as an image and then retrieves those images to augment the VLM's generation. This preserves as much of the original document's information as possible and avoids the information loss introduced by parsing. Applied to multimodal documents, VisRAG shows strong potential for both information retrieval and retrieval-enhanced text generation.
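The retrieval step described above can be illustrated with a minimal sketch. It is not VisRAG's actual code or API: the function names (`cosine`, `retrieve_pages`), the file names, and the three-dimensional vectors are all hypothetical stand-ins. In a real system the vectors would come from a VLM that embeds page images and queries into a shared space; here they are hard-coded toy values, and plain cosine similarity is assumed as the ranking score.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_pages(query_vec, page_vecs, k=2):
    """Return the ids of the k page images most similar to the query.

    page_vecs maps a page-image id to its embedding. Both the query and
    page embeddings are assumed to come from the same (hypothetical) VLM
    encoder; no text parsing of the pages is involved.
    """
    scored = [(cosine(query_vec, vec), page_id) for page_id, vec in page_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [page_id for _, page_id in scored[:k]]


# Toy embeddings standing in for VLM outputs (assumed values, not real data).
page_vecs = {
    "page_1.png": [0.9, 0.1, 0.0],
    "page_2.png": [0.1, 0.8, 0.1],
    "page_3.png": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.0]

top_pages = retrieve_pages(query_vec, page_vecs, k=2)
print(top_pages)
```

The selected page images, rather than parsed text, would then be handed to the generating VLM together with the question, which is what lets the pipeline skip the parsing stage.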
VisRAG Visit Over Time
Monthly Visits: 494,758,773
Bounce Rate: 37.69%
Pages per Visit: 5.7
Visit Duration: 00:06:29