Samba

Official implementation of an efficient infinite context language model

Premium · New Product · Programming · Natural Language Processing · Machine Learning
Samba is a simple yet powerful hybrid model with infinite context length. Its architecture is straightforward: Samba = Mamba + MLP + Sliding Window Attention + MLP, stacked layer by layer. The Samba-3.8B model was trained on 3.2 trillion tokens from the Phi3 dataset and significantly outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K, and HumanEval). Samba can also achieve perfect long-context retrieval with minimal instruction tuning while maintaining linear complexity with respect to sequence length. This enables Samba-3.8B-instruct to excel in downstream tasks such as long-context summarization.
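The sliding-window attention component named above is what keeps per-token cost constant: each position attends only to a fixed-size window of recent positions, so total cost grows linearly with sequence length rather than quadratically. A minimal NumPy sketch of that idea (the function name, shapes, and single-head setup are illustrative assumptions, not taken from the Samba codebase):

```python
import numpy as np

def sliding_window_attention(x, w_q, w_k, w_v, window=4):
    """Causal sliding-window attention for a single head.

    Each position t attends only to positions [t - window + 1, t],
    so the cost is O(seq_len * window) instead of O(seq_len**2).
    Shapes here are illustrative: x is (seq_len, d_model).
    """
    seq_len, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    out = np.zeros_like(v)
    for t in range(seq_len):
        lo = max(0, t - window + 1)          # left edge of the window
        scores = q[t] @ k[lo:t + 1].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # stable softmax
        weights /= weights.sum()
        out[t] = weights @ v[lo:t + 1]       # weighted sum over the window
    return out
```

Because position 0 can only attend to itself, its output is exactly its own value projection; the full model interleaves blocks like this with Mamba and MLP layers, which is how the hybrid retains both recall within the window and compressed long-range state.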

Samba Visit Over Time

- Monthly Visits: 503,747,431
- Bounce Rate: 37.31%
- Pages per Visit: 5.7
- Visit Duration: 00:06:44

Charts (not reproduced in text): Samba Visit Trend · Samba Visit Geography · Samba Traffic Sources

Samba Alternatives