Samba
Official implementation of an efficient infinite context language model
Premium · New Product · Programming · Natural Language Processing · Machine Learning
Samba is a simple yet powerful hybrid model with infinite context length. Its architecture is straightforward: Samba = Mamba + MLP + Sliding Window Attention + MLP, stacked at the layer level. The Samba-3.8B model was trained on the Phi3 dataset with 3.2 trillion tokens and significantly outperforms Phi3-mini on major benchmarks such as MMLU, GSM8K, and HumanEval. Samba also achieves perfect long-context retrieval with minimal instruction tuning while maintaining linear complexity with respect to sequence length, which enables Samba-3.8B-instruct to excel in downstream tasks such as long-context summarization.
Samba Visits Over Time

Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Visit Duration: 00:06:42