Aria-Base-64K
A multimodal-native Mixture-of-Experts model
Tags: Productivity, Multimodal, Long Text Processing
Aria-Base-64K is one of the base models in the Aria series, released for research and continued training. It corresponds to the checkpoint after the long-context pre-training stage, trained on 33 billion tokens (21 billion multimodal and 12 billion language tokens, 69% of which are long-form). It is well suited to continued pre-training or fine-tuning on long video question-answering or long document question-answering datasets, even in resource-constrained settings: it can be post-trained with short instruction-tuning datasets and then applied directly to long-context scenarios. The model can comprehend up to 250 high-resolution images or up to 500 medium-resolution images while retaining strong base performance in both language and multimodal tasks.
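For experimentation, the checkpoint can be loaded with the Hugging Face transformers library. The sketch below is a minimal example under stated assumptions: it presumes the model is published as rhymes-ai/Aria-Base-64K and follows the same remote-code loading path as the main Aria model card; the repository id and the chat-style prompt format are assumptions (as a base model, it may ship without an instruction-tuned chat template), not details confirmed by this page.

```python
# Minimal sketch: loading Aria-Base-64K for a single-image query.
# Assumption: the checkpoint is hosted as "rhymes-ai/Aria-Base-64K" and
# follows the standard transformers remote-code path of the Aria series.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "rhymes-ai/Aria-Base-64K"  # assumed Hugging Face repository id

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("frame_0001.png")  # e.g. one frame sampled from a long video

# Chat-style message layout mirroring the Aria model card; a base
# (non-instruction-tuned) model may produce raw continuations here.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this frame."},
        ],
    }
]

text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt")
inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=128)

print(processor.decode(output[0], skip_special_tokens=True))
```

For long-video or long-document use, the same pattern extends to many images per prompt, up to the limits quoted above (roughly 250 high-resolution or 500 medium-resolution images within the 64K context).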
Aria-Base-64K Visits Over Time
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32