Aria-Base-64K
A multimodal-native Mixture-of-Experts model
Tags: Productivity, Multimodal, Long Text Processing
Aria-Base-64K is one of the base models in the Aria series, released for research and continued training. It corresponds to the checkpoint after the long-context pre-training stage, trained on 33 billion tokens (21 billion multimodal and 12 billion language tokens, 69% of which are long-form). It is well suited to continued pre-training or fine-tuning on long video question-answering or long document question-answering datasets, even in resource-constrained settings: it can be post-trained with short instruction-tuning datasets and then applied directly to long-context scenarios. The model can comprehend up to 250 high-resolution images or up to 500 medium-resolution images while retaining strong base performance in both language and multimodal tasks.
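For experimentation, the checkpoint can be loaded with the Hugging Face transformers library. The sketch below is a minimal example under stated assumptions: it presumes the model is published as rhymes-ai/Aria-Base-64K and follows the same remote-code loading path as the main Aria model card; the repository id and the chat-style prompt format are assumptions (as a base model, it may ship without an instruction-tuned chat template), not details confirmed by this page.

```python
# Minimal sketch: loading Aria-Base-64K for a single-image query.
# Assumption: the checkpoint is hosted as "rhymes-ai/Aria-Base-64K" and
# follows the standard transformers remote-code path of the Aria series.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "rhymes-ai/Aria-Base-64K"  # assumed Hugging Face repository id

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("frame_0001.png")  # e.g. one frame sampled from a long video

# Chat-style message layout mirroring the Aria model card; a base
# (non-instruction-tuned) model may produce raw continuations here.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this frame."},
        ],
    }
]

text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt")
inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=128)

print(processor.decode(output[0], skip_special_tokens=True))
```

For long-video or long-document use, the same pattern extends to many images per prompt, up to the limits quoted above (roughly 250 high-resolution or 500 medium-resolution images within the 64K context).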
Aria-Base-64K Visits Over Time
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32