Infini-attention
Extends the Transformer model to handle infinitely long inputs
Google's Infini-attention extends Transformer-based large language models to inputs of unbounded length while keeping memory and computation bounded. It does this by adding a compressive memory to the standard attention mechanism: each Transformer block combines masked local attention over the current segment with long-range attention over the memory, and the input is processed as a stream of segments. Reported experiments show gains on long-context language modeling, passkey context block retrieval, and book summarization.
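The interplay of compressive memory, local attention, and streaming can be illustrated with a minimal single-head NumPy sketch. It is not the reference implementation. It assumes a linear-attention style memory (a running sum of feature-mapped keys times values, plus a normalization vector), an ELU+1 feature map, and a single scalar gate that blends memory retrieval with local attention. The names infini_attention_stream, elu_plus_one, and gate_logit are hypothetical and chosen for illustration only.

```python
import numpy as np

def elu_plus_one(x):
    # Non-negative feature map (ELU + 1); an assumed choice for this sketch.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def causal_softmax_attention(q, k, v):
    # Standard masked dot-product attention within one segment.
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def infini_attention_stream(segments, gate_logit=0.0, eps=1e-6):
    """Process (q, k, v) segments in order with a fixed-size memory.

    Each q, k, v has shape (segment_length, d). Returns the concatenated
    outputs plus the final memory matrix and normalization vector.
    """
    gate = 1.0 / (1.0 + np.exp(-gate_logit))  # sigmoid of the gate logit
    memory, norm, outputs = None, None, []
    for q, k, v in segments:
        if memory is None:
            memory = np.zeros((k.shape[1], v.shape[1]))
            norm = np.zeros(k.shape[1])
        sq, sk = elu_plus_one(q), elu_plus_one(k)
        # Long-range part: read from memory written by earlier segments.
        retrieved = (sq @ memory) / (sq @ norm + eps)[:, None]
        # Local part: causal attention inside the current segment.
        local = causal_softmax_attention(q, k, v)
        outputs.append(gate * retrieved + (1.0 - gate) * local)
        # Write the current segment into memory; its size never grows.
        memory = memory + sk.T @ v
        norm = norm + sk.sum(axis=0)
    return np.concatenate(outputs), memory, norm
```

Whatever the number of segments, the memory stays a d-by-d matrix and the normalization term a length-d vector, which is what keeps the cost per segment constant in this sketch. In a trained model the gate would be a learned parameter rather than a fixed argument.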