2024-08-13 13:47:55 · AIbase · 11.0k
New Breakthrough in GPU Optimization! 'Tree Attention' Speeds Up Inference on 5-Million-Token Long Texts by 8x
The Transformer architecture, a cornerstone of modern artificial intelligence, has driven a revolution in natural language processing with its self-attention mechanism. When handling long contexts, however, the resource cost of computing self-attention becomes a bottleneck. To address this, researchers have proposed Tree Attention, which decomposes the attention computation via tree reduction to improve efficiency. The method reduces both communication overhead and memory usage, and is up to 8 times faster than existing approaches in multi-GPU environments.
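The article does not include code, but the core idea behind a tree reduction of attention can be sketched: each device computes a flash-attention-style partial result over its key/value chunk (the unnormalized output plus a log-sum-exp of the scores), and because merging two such partials is associative, the partials can be combined in a logarithmic-depth tree rather than sequentially. Below is a minimal single-process NumPy illustration, not the paper's implementation; chunk splitting stands in for sharding across GPUs, and all function names are hypothetical.

```python
import numpy as np

def chunk_attention(q, k, v):
    """Attention of queries q over one key/value chunk.
    Returns (normalized output, log-sum-exp of scores per query)."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    m = s.max(axis=-1, keepdims=True)          # stabilize the softmax
    p = np.exp(s - m)
    out = (p @ v) / p.sum(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(p.sum(axis=-1))
    return out, lse

def merge(a, b):
    """Associatively combine two partial attention results.
    Each partial is weighted by its share of the total softmax mass."""
    out_a, lse_a = a
    out_b, lse_b = b
    lse = np.logaddexp(lse_a, lse_b)
    wa = np.exp(lse_a - lse)[..., None]
    wb = np.exp(lse_b - lse)[..., None]
    return wa * out_a + wb * out_b, lse

def tree_attention(q, k_chunks, v_chunks):
    """Pairwise (tree-shaped) reduction over per-chunk partials,
    mimicking an allreduce across devices."""
    partials = [chunk_attention(q, k, v) for k, v in zip(k_chunks, v_chunks)]
    while len(partials) > 1:
        partials = [
            merge(partials[i], partials[i + 1]) if i + 1 < len(partials) else partials[i]
            for i in range(0, len(partials), 2)
        ]
    return partials[0][0]
```

Because `merge` is associative, the reduction depth is O(log N) in the number of chunks, which is where the communication savings over a sequential (ring-style) combine come from.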
2024-07-31 11:27:05 · AIbase · 10.7k
Zyphra Launches Small Language Model Zamba2-2.7B: Speed Doubled, Memory Cost Reduced by 27%
Zyphra has launched the Zamba2-2.7B language model, a milestone in the small-language-model domain. Trained on a dataset of around 30 trillion tokens, it delivers significantly improved performance and efficiency while reducing resource requirements at inference time, making it an efficient solution for mobile applications. Key highlights include a twofold increase in response-generation speed, a 27% reduction in memory usage, and a 1.29x reduction in generation latency, making it particularly well suited to real-time interactive applications such as virtual assistants and chatbots.