Supercharging Llama 3: NVIDIA Unveils New Tuning Framework, RankRAG, Surpassing GPT-4!

AIbase

Published in AI News · 5 min read · Jul 10, 2024

Recently, two Chinese scholars from the Georgia Institute of Technology and NVIDIA proposed a new fine-tuning framework called RankRAG. The framework greatly simplifies the complex RAG pipeline by enabling a single Large Language Model (LLM) to perform retrieval, ranking, and generation tasks, which results in a significant performance improvement.

RAG (Retrieval-Augmented Generation) is a technique commonly used in LLM deployment, and it is especially suited to text generation tasks that require a large amount of factual knowledge. In a typical RAG pipeline, a dense retriever built on a text encoder fetches the top-k text segments from an external database, and the LLM then reads them and generates a response. Although widely used, this pipeline has limitations, notably the choice of k. If k is too large, even LLMs that support long contexts struggle to process the input efficiently; if k is too small, the pipeline depends on a high-recall retrieval mechanism, and existing retrieval and ranking models each have their own shortcomings.
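To make the role of k concrete, here is a minimal sketch of top-k dense retrieval. The two-dimensional vectors, the passage texts, and the function names are illustrative assumptions, not anything from the paper; a real system would obtain embeddings from a text encoder and search a much larger index.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_top_k(query_vec, passages, k):
    """Return the k passages whose embeddings are most similar to the query."""
    ranked = sorted(passages, key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return ranked[:k]

# Toy 2-D "embeddings" (hypothetical values, for illustration only).
PASSAGES = [
    {"text": "Llama 3 is a family of open LLMs.", "vec": [0.9, 0.1]},
    {"text": "RAG retrieves passages before generation.", "vec": [0.2, 0.9]},
    {"text": "Paris is the capital of France.", "vec": [-0.5, 0.1]},
]

top = retrieve_top_k([0.1, 1.0], PASSAGES, k=2)
print([p["text"] for p in top])
```

Raising k passes more text to the LLM and lengthens the prompt, while lowering k makes the answer depend on the retriever placing the right passage near the top.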

To address these issues, RankRAG proposes a new approach: fine-tuning extends the LLM's capabilities so that the LLM performs retrieval and ranking itself. Experimental results show that this method not only improves data efficiency but also significantly enhances model performance. In particular, Llama3 8B and 70B models fine-tuned with RankRAG outperformed ChatQA-1.5 8B and ChatQA-1.5 70B, respectively, on multiple general benchmarks and biomedical knowledge-intensive benchmarks.
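One way to picture a single model handling both ranking and generation is a rank-then-generate loop: score each candidate passage against the question, keep the best k, and answer from those. The sketch below is only an illustration under stated assumptions. `toy_score` and `toy_generate` are hypothetical stand-ins for calls to the fine-tuned LLM, and the word-overlap score is a placeholder for a model-produced relevance score.

```python
from typing import Callable, List

def rank_then_generate(
    question: str,
    candidates: List[str],
    score_relevance: Callable[[str, str], float],
    generate: Callable[[str, List[str]], str],
    k: int,
) -> str:
    """Rerank the candidate passages, keep the top k, then answer from them."""
    ranked = sorted(candidates, key=lambda c: score_relevance(question, c), reverse=True)
    return generate(question, ranked[:k])

# Hypothetical stand-ins for the single fine-tuned LLM.
def toy_score(question: str, passage: str) -> float:
    """Fraction of question words found in the passage (placeholder scorer)."""
    q = set(question.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q)

def toy_generate(question: str, contexts: List[str]) -> str:
    """Echo the kept contexts; a real model would generate an answer here."""
    return " | ".join(contexts)

candidates = [
    "the sky is blue",
    "rankrag was proposed by nvidia researchers",
    "who likes tea",
]
answer = rank_then_generate("who proposed rankrag", candidates, toy_score, toy_generate, k=1)
print(answer)
```

In this arrangement the generator only ever sees the k passages that survive reranking, so the initial candidate set can be large without lengthening the final prompt.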

The key to RankRAG is that ranking and generation share one model. Because the model itself filters the retrieved passages down to the most relevant few, the generation step works from a short, high-quality context instead of a long list of loosely related text.

The RankRAG fine-tuning framework consists of two stages of instruction fine-tuning. The first stage is supervised fine-tuning (SFT), which mixes multiple datasets to improve the LLM's ability to follow instructions. The second stage uses a fine-tuning dataset that combines various QA data, retrieval-augmented QA data, and context ranking data, further enhancing the LLM's retrieval and ranking capabilities.
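The second-stage data mix described above can be pictured as casting QA and ranking examples into one shared prompt/target format so that a single model can train on both. The sketch below is a hypothetical illustration: the prompt templates, field names, and helper functions are assumptions made for this example and are not the prompts used in the paper.

```python
def qa_example(question, context, answer):
    """Retrieval-augmented QA record: answer the question from the context."""
    return {"prompt": f"Context: {context}\nQuestion: {question}", "target": answer}

def ranking_example(question, context, is_relevant):
    """Context ranking record: judge whether the passage helps answer the question."""
    prompt = (
        f"Context: {context}\n"
        f"Question: Is the passage above relevant to: {question}"
    )
    return {"prompt": prompt, "target": "True" if is_relevant else "False"}

def build_stage2_blend(qa_rows, ranking_rows):
    """Merge both task types into one list of prompt/target training records."""
    blend = [qa_example(*row) for row in qa_rows]
    blend += [ranking_example(*row) for row in ranking_rows]
    return blend

blend = build_stage2_blend(
    qa_rows=[("Who proposed RankRAG?", "RankRAG was proposed in 2024.", "unknown")],
    ranking_rows=[
        ("Who proposed RankRAG?", "RankRAG was proposed in 2024.", True),
        ("Who proposed RankRAG?", "The sky is blue.", False),
    ],
)
print(len(blend))
```

Because every record has the same shape, ranking supervision can be blended with ordinary QA data without a separate ranking model or training loop.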

In experiments, RankRAG consistently outperformed ChatQA-1.5, the current open-source SOTA model, on nine general-domain datasets. The gap was largest on challenging QA tasks such as long-tail QA and multi-hop QA, where RankRAG improved performance by over 10% compared to ChatQA-1.5.

RankRAG not only excels in retrieval and generation tasks but also adapts well to the biomedical RAG benchmark Mirage. Even without fine-tuning on biomedical data, its performance on medical QA tasks exceeds that of many open-source models specialized for the field.

With the introduction and continued improvement of the RankRAG framework, there is good reason to expect RAG systems to become both simpler and more accurate. For independent developers and researchers alike, this framework opens up new possibilities and can drive the development of technology and applications.

Paper address: https://arxiv.org/abs/2407.02485

This article is from AIbase Daily