A team from Peking University, led by Zhang Muhan, has proposed a novel framework, Long Input Fine-Tuning (LIFT), that enables any short-context model to handle long texts by fine-tuning the long input directly into the model's parameters. The approach departs from traditional long-text processing: rather than endlessly expanding the context window, it internalizes long-text knowledge into the model parameters, mirroring how humans convert working memory into long-term memory.
Large language models currently face two major challenges in processing long texts:
First, the quadratic complexity of standard attention incurs enormous computational and memory overhead on long inputs. Second, models struggle to capture long-range dependencies scattered throughout a long text.
Existing solutions like RAG and long-context adaptation have limitations:
RAG depends on accurate retrieval and is vulnerable to retrieval noise, which can lead to hallucinations, while long-context adaptation carries high inference complexity and its context window remains bounded.
LIFT's Technological Innovation
The LIFT framework comprises three key components:
Dynamic and Efficient Long Input Training
Through segmented language modeling, the long input is divided into overlapping segments, avoiding the inference-time blow-up of an excessively long context while the overlaps preserve dependencies that cross segment boundaries. Training complexity grows linearly with the length of the input.
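To make this concrete, here is a minimal sketch of overlapping segmentation; the segment length and overlap values are illustrative assumptions, not the paper's settings:

```python
def segment_tokens(tokens, seg_len=2048, overlap=256):
    """Split a long token sequence into overlapping segments.

    Each segment shares `overlap` tokens with its predecessor, so
    dependencies that cross a segment boundary are still seen during
    training, and the number of segments grows linearly with the
    input length.
    """
    stride = seg_len - overlap
    return [
        tokens[start:start + seg_len]
        for start in range(0, max(len(tokens) - overlap, 1), stride)
    ]

# A 10,000-token input yields 6 segments of at most 2,048 tokens each,
# rather than one quadratic-cost 10k-token context.
```

Fine-tuning then applies an ordinary language-modeling loss to each segment in turn, which is what keeps the overall cost linear in the input length.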
A Gated Memory Adapter for Balancing Model Capabilities
A dedicated Gated Memory Adapter architecture dynamically balances the original model's in-context learning ability against its memory of the long input acquired through LIFT, letting the model automatically adjust, query by query, how much of the LIFT memory to use.
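The paper's exact adapter design is not reproduced here, but the gating idea can be sketched as a small PyTorch module with hypothetical layer shapes: a learned memory branch is blended with the base model's hidden state through a per-token gate.

```python
import torch
import torch.nn as nn

class GatedMemorySketch(nn.Module):
    """Illustrative gated adapter: blend the base model's hidden state
    with a learned memory branch, with a per-token sigmoid gate deciding
    how much internalized long-input memory to use for the current query."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.memory = nn.Linear(hidden_size, hidden_size)  # learned long-input memory branch
        self.gate = nn.Linear(hidden_size, 1)              # scalar gate per token

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(h))           # 0 = rely on the base model, 1 = rely on memory
        return g * self.memory(h) + (1 - g) * h   # convex combination of the two paths
```

Because the gate is input-dependent, a query unrelated to the memorized text can drive the gate toward 0 and fall back on the model's original behavior.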
Auxiliary Task Training
Pre-trained LLMs are used to automatically generate question-answering auxiliary tasks from the long text. This compensates for capabilities potentially lost during segmented training and teaches the model to use information from the long text to answer questions.
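As a rough illustration of how such auxiliary data might be produced (the prompt wording and the `generate` callable are assumptions for this sketch, not the authors' pipeline):

```python
QA_PROMPT = (
    "Read the following passage and write one question that can only "
    "be answered from it, followed by the answer.\n\n"
    "Passage:\n{segment}\n"
)

def make_auxiliary_tasks(segments, generate):
    """Build synthetic question-answer training pairs from segments of
    the long input. `generate` is any text-completion callable wrapping
    a pretrained LLM."""
    return [generate(QA_PROMPT.format(segment=seg)) for seg in segments]
```

The resulting pairs are mixed into fine-tuning alongside the segment-level language-modeling objective.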
Experimental Results
LIFT achieved significant improvements on several long-context benchmarks:
On LooGLE long-dependency question answering, Llama-3-8B's accuracy increased from 15.44% to 29.97%. On LooGLE short-dependency question answering, Gemma-2-9B's accuracy increased from 37.37% to 50.33%. On LongBench, Llama-3 with LIFT showed significant improvements in 4 of 5 sub-tasks.
Ablation experiments showed that the Gated Memory architecture improved the GPT-4 score on the LooGLE ShortQA dataset by 5.48% compared to the original model fine-tuned using PiSSA.
Limitations and Future Directions
Despite LIFT's significant achievements, some limitations remain:
It still performs poorly on "needle-in-a-haystack" tasks that require precise information extraction. The model's ability to retrieve the parametric knowledge internalized by LIFT still needs improvement. The design of the auxiliary tasks depends heavily on the downstream test tasks, limiting generality. And how to better balance the memorized content against the model's original capabilities remains a key open question.
The research team encourages the community to explore LIFT's potential with broader training data, a wider range of models, more advanced auxiliary task designs, and greater computational resources.
Conclusion
LIFT offers a novel paradigm for long-text processing, transforming contextual knowledge into parameterized knowledge, similar to how humans convert short-term memory into long-term memory. While a complete solution to the long-context challenge remains elusive, LIFT opens up a highly promising research direction.
Paper Address: https://arxiv.org/abs/2502.14644