RAG-Retrieval is a full-stack framework for fine-tuning and inference of RAG retrieval models. It supports inference with several kinds of RAG reranker models, including embedding (vector) models, late-interaction models, and interaction-based (cross-encoder) models. It also provides a lightweight Python library that lets users call different RAG reranking models through a unified interface, which simplifies using and deploying them.
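To make the idea of a unified calling convention concrete, here is a minimal sketch. It is not the RAG-Retrieval API: the names `SCORERS`, `rerank`, and the three toy scoring functions are hypothetical, and each scorer uses simple token overlap as a stand-in for a real model. It only illustrates how the three model families differ in where the query and document interact, while the caller sees one entry point.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical toy scorers: stand-ins for real models, NOT the RAG-Retrieval API.

def tokenize(text: str) -> List[str]:
    return text.lower().split()

def embedding_score(query: str, doc: str) -> float:
    """Embedding style: represent each text independently, compare by cosine."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    d_norm = math.sqrt(sum(v * v for v in d.values()))
    return dot / (q_norm * d_norm) if q_norm and d_norm else 0.0

def late_interaction_score(query: str, doc: str) -> float:
    """Late-interaction style: each query token takes its best match
    among the document tokens, and the per-token maxima are summed."""
    doc_tokens = set(tokenize(doc))
    return float(sum(1.0 if t in doc_tokens else 0.0 for t in tokenize(query)))

def interaction_score(query: str, doc: str) -> float:
    """Interaction style: score the (query, doc) pair jointly.
    Jaccard overlap is used here as a toy joint score."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0

# One registry, one call signature for every model family.
SCORERS: Dict[str, Callable[[str, str], float]] = {
    "embedding": embedding_score,
    "late_interaction": late_interaction_score,
    "interaction": interaction_score,
}

def rerank(query: str, docs: List[str], model_type: str) -> List[Tuple[str, float]]:
    """Score every document against the query and sort by descending score."""
    if model_type not in SCORERS:
        raise ValueError(f"unknown model_type: {model_type!r}")
    scorer = SCORERS[model_type]
    scored = [(doc, scorer(query, doc)) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    query = "how do rerankers reorder passages"
    docs = [
        "cats sleep all day",
        "rerankers reorder retrieved passages",
        "passages about cooking",
    ]
    for name in SCORERS:
        print(name, rerank(query, docs, name)[0])
```

Switching model families changes only the `model_type` argument; the calling code and the shape of the result stay the same. In a real setup, each scorer would wrap a loaded model instead of token overlap.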