Chinese Tiny LLM
The first Chinese-centric large language model trained from scratch primarily on Chinese data, focused on Chinese understanding and generation.
Chinese Tiny LLM (CT-LLM) is the first large language model trained from scratch primarily on Chinese data. It has 2 billion parameters and was pre-trained on a corpus of 1,200 billion tokens, the majority of them Chinese. By prioritizing Chinese throughout pre-training, CT-LLM processes Chinese text efficiently, and it also handles English and programming code competently, demonstrating cross-lingual adaptability. On CHC-Bench, a benchmark of hard Chinese tasks, CT-LLM performs strongly, confirming its proficiency in understanding and applying Chinese.

CT-LLM is fully open: the team shares all relevant artifacts, including the entire data filtering process, training dynamics, training and evaluation data, and intermediate model checkpoints. Other researchers and developers can use these resources for their own research or for further model refinement.
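Because the checkpoints are openly released, they can be loaded like any other causal language model. Below is a minimal sketch using the Hugging Face transformers library; the repository id "m-a-p/CT-LLM-SFT-DPO" is an assumption about the published checkpoint name and is not confirmed by this page, so substitute whichever released checkpoint you want to try.

```python
# Minimal sketch: load an open CT-LLM checkpoint and generate Chinese text.
# The repo id below is an assumption; swap in the actual released checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/CT-LLM-SFT-DPO"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 2B-parameter model fits on a single GPU
    device_map="auto",
)

# Chinese prompt: "Please briefly introduce the Great Wall."
# (kept in Chinese, since the model is Chinese-centric)
prompt = "请简要介绍一下长城。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern works for the base and intermediate checkpoints the project releases, which is what makes the open training artifacts useful for studying training dynamics or continuing fine-tuning.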