Researchers at the Hebrew University of Jerusalem recently found that in Retrieval-Augmented Generation (RAG) systems, the number of retrieved documents significantly affects language model performance, even when the total context length is held constant.
The team ran experiments on 2,417 questions from the MuSiQue validation set, each paired with 20 Wikipedia paragraphs: two to four paragraphs contained information needed for the answer, while the rest served as distractors. To isolate the effect of document count, the team created multiple data partitions, gradually reducing the number of documents from 20 down to only the 2-4 relevant ones. To keep the total token count constant, the retained documents were extended with additional text from their original Wikipedia articles.
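The partition construction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' released code: the function name, the whitespace "tokenizer", and the padding strategy are all assumptions made for clarity.

```python
def build_partition(supporting, distractors, n_docs, target_tokens, pad_text):
    """Select n_docs documents (all supporting paragraphs first, then
    distractors) and pad with text from the source article so the
    context keeps a constant total token count.

    Illustrative sketch only: tokens are approximated by whitespace
    splitting; a real setup would use the model's tokenizer.
    """
    assert n_docs >= len(supporting), "must keep every supporting paragraph"
    docs = supporting + distractors[: n_docs - len(supporting)]

    # Count how many tokens we are short of the original 20-document budget.
    used = sum(len(d.split()) for d in docs)
    deficit = max(0, target_tokens - used)

    # Extend the first retained document with filler text from its
    # original Wikipedia article (here stood in by pad_text).
    if deficit and docs:
        docs[0] = docs[0] + " " + " ".join(pad_text.split()[:deficit])
    return docs
```

For example, reducing a 10-token context from four documents to three keeps the token budget by padding the retained documents back up to 10 tokens.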
In most cases, reducing the number of documents improved language model performance by roughly 10%. The study evaluated several open-source models, including Llama-3.1, Qwen2, and Gemma2. Qwen2 was the notable exception, remaining relatively stable as the document count varied, whereas Llama-3.1 and Gemma2 degraded significantly as more documents were added.
When models were given only the documents containing supporting information, all of them improved markedly. This suggests that similar-but-irrelevant documents, which are common in RAG retrieval results, confuse models and reduce performance. Interestingly, models coped better with clearly unrelated random documents than with these near-miss distractors, indicating they are better at identifying and filtering out obviously unrelated content.
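The conditions compared above (supporting paragraphs alone, supporting plus similar distractors, supporting plus random unrelated paragraphs) could be assembled along these lines. This is an illustrative sketch under assumed names and data shapes, not the study's actual pipeline.

```python
import random

def make_context(supporting, extra_pool, n_total, seed=0):
    """Build one retrieval context: every supporting paragraph plus
    filler documents from extra_pool -- similar-topic distractors in
    one condition, random unrelated paragraphs in another -- shuffled
    so document position gives nothing away.

    Illustrative sketch; names and structure are assumptions.
    """
    rng = random.Random(seed)
    fillers = extra_pool[: max(0, n_total - len(supporting))]
    docs = supporting + fillers
    rng.shuffle(docs)
    return docs

# Hypothetical example inputs for the three conditions.
supporting = ["Gold paragraph 1.", "Gold paragraph 2."]
similar = [f"Related but unhelpful paragraph {i}." for i in range(18)]
unrelated = [f"Random off-topic paragraph {i}." for i in range(18)]

gold_only = make_context(supporting, [], 2)            # supporting only
with_distractors = make_context(supporting, similar, 20)
with_random = make_context(supporting, unrelated, 20)
```

Holding the supporting paragraphs fixed while swapping only the filler pool is what lets the comparison attribute performance differences to distractor similarity rather than to the answer content itself.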
The researchers emphasize the need to balance relevance and diversity when designing retrieval systems in order to mitigate information conflicts. They also acknowledge limitations of the study, including that the effects of prompt variations and document order were not analyzed. The team has publicly released the dataset to facilitate further research in this area.