Accurately retrieving information from knowledge bases has long been a significant challenge in artificial intelligence. Recently, the AI company Anthropic announced a new method called "Contextual Retrieval," aimed at improving the precision of knowledge retrieval. The method improves accuracy by prepending contextual information to document chunks before they are indexed.
Existing Retrieval-Augmented Generation (RAG) systems typically index documents by splitting them into small chunks, which can strip away important context. Anthropic's solution prepends a brief, chunk-specific context, typically under 100 tokens, to each chunk before indexing. For example, the fragment "The company's revenue grew by 3% over the previous quarter" becomes, after contextual processing: "This chunk is from ACME's Q2 2023 SEC filing; the previous quarter's revenue was $314 million. The company's revenue grew by 3% over the previous quarter." Anthropic reports that the method reduces the rate of failed retrievals by up to 49%, and by up to 67% when combined with result reranking.
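As a sketch, the preprocessing step can look like the following. The prompt wording and helper names here are illustrative assumptions, not Anthropic's exact implementation; in the real pipeline, the context string is generated by an LLM (Claude) given the full document and the chunk, then both are indexed together.

```python
# Illustrative sketch of contextual retrieval preprocessing.
# Assumption: generate_context() stands in for an LLM call that returns
# a short chunk-situating context; it is hypothetical, not a real API.

CONTEXT_PROMPT = (
    "<document>\n{document}\n</document>\n"
    "Here is the chunk we want to situate within the whole document:\n"
    "<chunk>\n{chunk}\n</chunk>\n"
    "Give a short context that situates this chunk within the overall "
    "document, to improve search retrieval of the chunk."
)

def build_context_prompt(document: str, chunk: str) -> str:
    """Fill the per-chunk prompt template that would be sent to the LLM."""
    return CONTEXT_PROMPT.format(document=document, chunk=chunk)

def contextualize_chunk(chunk: str, context: str) -> str:
    """Prepend the generated context so context + chunk are embedded
    (and BM25-indexed) together as one unit."""
    return f"{context}\n{chunk}"

# Usage: for each chunk, send build_context_prompt(...) to the LLM, then
# index contextualize_chunk(chunk, llm_response) instead of the bare chunk.
chunk = "The company's revenue grew by 3% over the previous quarter."
context = ("This chunk is from ACME's Q2 2023 SEC filing; "
           "the previous quarter's revenue was $314 million.")
print(contextualize_chunk(chunk, context))
```

Because the added context precedes every chunk from the same document, prompt caching can keep the cost of generating it low, which is part of what makes the approach practical at scale.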
Interestingly, research from Cornell University supports the same contextual approach to retrieval. Researchers there have proposed a related technique called "Contextual Document Embeddings" (CDE). Their method reorganizes training data so that each batch contains similar, hard-to-distinguish documents, forcing the model to learn finer distinctions. In addition, the researchers developed a two-stage encoder that integrates information from neighboring documents in the corpus directly into the embedding, allowing the model to account for relative word frequencies and other corpus-level cues.
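A toy sketch of the two-stage idea follows. This is an assumption-laden simplification, not the paper's trained architecture: the hash-based `embed` function and the mixing weight `alpha` are stand-ins, and "conditioning on the corpus" is reduced to subtracting the shared corpus component, which down-weights words common across neighboring documents, loosely analogous to using corpus-level word statistics.

```python
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Stand-in embedder: hash words into a bag-of-words vector.
    Purely illustrative; a real system uses a trained encoder."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def contextual_embed(doc: str, corpus: list[str],
                     alpha: float = 0.5) -> np.ndarray:
    """Two-stage sketch: stage 1 summarizes the surrounding corpus into a
    single context vector; stage 2 embeds the document conditioned on it,
    here by removing part of the shared corpus component so that words
    common across the corpus contribute less to the final embedding."""
    context = np.mean([embed(d) for d in corpus], axis=0)  # stage 1
    v = embed(doc) - alpha * context                       # stage 2
    norm = np.linalg.norm(v)
    return v / norm if norm else v
```

The design point this illustrates: the same document receives a different embedding depending on the corpus it sits in, which is exactly what fixed, context-free embedders cannot do.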
On the Massive Text Embedding Benchmark (MTEB), the CDE model achieved the best results in its size category. Experiments also showed that CDE excels on small, domain-specific datasets in fields like finance and medicine, and performs well on tasks such as classification, clustering, and semantic similarity. However, the researchers note that it remains unclear how CDE would scale to knowledge bases with billions of documents, and that the optimal context size and how to select the context both need further research.
Key Points:
🌟 Anthropic's "Contextual Retrieval" method reduces retrieval failure rates by up to 49%, and further still when combined with result reranking.
📊 Cornell University's "Contextual Document Embedding" method demonstrates strong advantages in specific fields, effectively improving classification and clustering tasks.
🔍 Further research is needed to explore how these methods can be applied to large-scale knowledge bases and to find the best strategies for contextual processing.