2025-02-12 14:04:43 · AIbase
ByteDance's UltraMem Architecture Reduces Large Model Inference Costs by 83%
The ByteDance Doubao large model team announced today that it has developed a new sparse model architecture called UltraMem. The architecture addresses the high memory-access cost incurred during inference by MoE (Mixture of Experts) models, improving inference speed by 2 to 6 times over MoE and cutting inference costs by up to 83%. This advance opens a new path toward efficient inference for large models. UltraMem resolves the memory-access bottleneck of MoE inference while maintaining model performance; according to the team, experimental results show these gains hold with the same parameter count and activation conditions as MoE.
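To see why MoE inference is memory-bound in the first place, here is a minimal illustrative sketch (not ByteDance's code, and unrelated to UltraMem's actual design): a token's router selects its top-k experts, and the full weight matrices of each selected expert must then be fetched from memory, so even sparsely activated models trigger large, scattered memory reads per token. All names and sizes below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes, chosen only for illustration.
d_model, d_ff = 64, 256
n_experts, top_k = 8, 2

# One FFN weight pair per expert; every expert's weights live in memory.
experts = [(rng.standard_normal((d_model, d_ff)) * 0.02,
            rng.standard_normal((d_ff, d_model)) * 0.02)
           for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token through its top-k experts, tracking bytes fetched."""
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]              # chosen expert ids
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()
    out = np.zeros_like(x)
    bytes_read = 0
    for g, e in zip(gates, top):
        w1, w2 = experts[e]
        out += g * (np.maximum(x @ w1, 0.0) @ w2)  # expert FFN (ReLU)
        bytes_read += w1.nbytes + w2.nbytes        # weights pulled from memory
    return out, bytes_read

x = rng.standard_normal(d_model)
y, nbytes = moe_forward(x)
# 2 experts x 2 matrices x (64*256) float64 values x 8 bytes each
print(nbytes)  # → 524288
```

Each token pays for full weight fetches from its chosen experts, which is the access pattern the article says UltraMem is designed to avoid.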