Infini-attention

Extends the Transformer model to handle infinitely long inputs

Categories: Common Product · Others · Transformer · Large Language Model
Google's Infini-attention extends Transformer-based large language models to handle effectively unbounded input lengths. It does this by adding a compressive memory to the standard attention mechanism: each attention layer combines local (masked dot-product) attention over the current segment with long-range retrieval from the compressive memory, which lets the model process inputs in a streaming fashion with bounded memory and compute. Reported experiments show strong results on long-context language modeling, passkey retrieval from 1M-token contexts, and book summarization.
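The core idea, local attention blended with a linear compressive-memory read and a segment-by-segment memory update, can be sketched as follows. This is a simplified single-head NumPy illustration, not Google's implementation; the fixed `beta` gate, the epsilon-initialized normalizer `z`, and the function name are assumptions for clarity (the paper learns the gate per head).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def infini_attention_segment(Q, K, V, M, z, beta=0.5):
    """Process one segment with simplified single-head Infini-attention.

    Q, K, V : (seg_len, d) projections for the current segment.
    M       : (d, d) compressive memory accumulated over past segments.
    z       : (d,)  normalization vector accumulated over past segments.
    beta    : scalar gate blending memory retrieval with local attention
              (illustrative; the paper learns this per head).
    """
    d = Q.shape[-1]

    # Local causal dot-product attention within the segment.
    scores = Q @ K.T / np.sqrt(d)
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    A_local = softmax(np.where(mask, -np.inf, scores)) @ V

    # Long-range retrieval from compressive memory via an ELU+1 feature map.
    sigma_Q = np.where(Q > 0, Q + 1.0, np.exp(Q))
    A_mem = (sigma_Q @ M) / (sigma_Q @ z)[:, None]

    # Gate: blend memory retrieval with local attention.
    A = beta * A_mem + (1.0 - beta) * A_local

    # Update the memory with this segment's keys/values (linear update).
    sigma_K = np.where(K > 0, K + 1.0, np.exp(K))
    M_new = M + sigma_K.T @ V
    z_new = z + sigma_K.sum(axis=0)
    return A, M_new, z_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, seg_len = 8, 4
    # Empty memory; small epsilon in z avoids 0/0 on the first segment.
    M, z = np.zeros((d, d)), np.full(d, 1e-6)
    for _ in range(3):  # stream three segments with bounded state
        Q, K, V = (rng.normal(size=(seg_len, d)) for _ in range(3))
        out, M, z = infini_attention_segment(Q, K, V, M, z)
    print(out.shape)
```

Because `M` and `z` have fixed size regardless of how many segments have been consumed, the per-segment cost stays constant, which is what makes streaming over arbitrarily long inputs feasible.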

Infini-attention Visit Over Time

Monthly Visits: 20,208,007
Bounce Rate: 44.64%
Pages per Visit: 3.1
Visit Duration: 00:04:14
