Gemma-2B-10M

The Gemma 2B model supports a 10M-token sequence length with optimized memory usage, making it suitable for large-scale language model applications.

The Gemma 2B - 10M Context is a large-scale language model that, through an optimized attention mechanism, can process sequences of up to 10M tokens while keeping memory usage under 32GB. The model employs recurrent local attention, a technique inspired by the Transformer-XL paper, making it a powerful tool for large-scale language tasks; a sketch of the idea follows below.
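The sketch below illustrates the core of Transformer-XL-style recurrent local attention under some assumptions: the sequence is processed in fixed-length segments, and each segment attends over itself plus a cached, stop-gradient memory of the previous segment, so peak attention memory is bounded by the segment length rather than the full context. The function and parameter names (`recurrent_local_attention`, `segment_len`) are illustrative, not the actual Gemma-2B-10M API; it is single-head, and causal masking is omitted for brevity.

```python
# Minimal sketch of segment-level recurrent local attention.
# Hypothetical names; not the actual Gemma-2B-10M implementation.
import torch
import torch.nn.functional as F

def recurrent_local_attention(x, w_q, w_k, w_v, segment_len=2048):
    """x: (seq_len, d_model). Each segment attends over itself plus the
    cached keys/values of the previous segment, so peak memory scales
    with segment_len rather than the full sequence length."""
    d_model = x.shape[-1]
    mem_k = mem_v = None          # cached K/V carried between segments
    outputs = []
    for start in range(0, x.shape[0], segment_len):
        seg = x[start:start + segment_len]
        q, k, v = seg @ w_q, seg @ w_k, seg @ w_v
        if mem_k is not None:
            # Prepend the previous segment's stop-gradient memory,
            # as in Transformer-XL.
            k = torch.cat([mem_k, k], dim=0)
            v = torch.cat([mem_v, v], dim=0)
        scores = (q @ k.T) / d_model ** 0.5
        outputs.append(F.softmax(scores, dim=-1) @ v)
        # Keep only the most recent segment_len keys/values as memory.
        mem_k, mem_v = k[-segment_len:].detach(), v[-segment_len:].detach()
    return torch.cat(outputs, dim=0)

# Toy usage: a "long" sequence processed in 512-token segments.
x = torch.randn(10_000, 64)
w_q, w_k, w_v = (torch.randn(64, 64) / 8 for _ in range(3))
out = recurrent_local_attention(x, w_q, w_k, w_v, segment_len=512)
print(out.shape)  # torch.Size([10000, 64])
```

Because the loop only ever materializes one segment's queries against at most two segments' keys and values, the attention cost per step is constant in the total sequence length, which is what makes very long contexts feasible on bounded memory.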

Gemma-2B-10M Visit Over Time

Monthly Visits: 18,200,568
Bounce Rate: 44.11%
Pages per Visit: 5.8
Visit Duration: 00:05:46
