Chinese Tiny LLM
The first Chinese-centric large language model trained from scratch primarily on Chinese data, focused on Chinese understanding and generation.
Chinese Tiny LLM (CT-LLM) is the first large language model trained from scratch primarily on Chinese data. It has 2 billion parameters and was pre-trained on a corpus of 1,200 billion tokens, the majority of them Chinese. By prioritizing Chinese throughout pre-training, CT-LLM processes Chinese text efficiently, and it also handles English and programming code competently, demonstrating cross-lingual adaptability. On CHC-Bench, a benchmark of hard Chinese tasks, CT-LLM performs strongly, confirming its proficiency in understanding and applying Chinese.

CT-LLM is fully open: the team shares all relevant artifacts, including the entire data filtering process, training dynamics, training and evaluation data, and intermediate model checkpoints. Other researchers and developers can use these resources for their own research or for further model refinement.
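Because the checkpoints are openly released, they can be loaded like any other causal language model. Below is a minimal sketch using the Hugging Face transformers library; the repository id "m-a-p/CT-LLM-SFT-DPO" is an assumption about the published checkpoint name and is not confirmed by this page, so substitute whichever released checkpoint you want to try.

```python
# Minimal sketch: load an open CT-LLM checkpoint and generate Chinese text.
# The repo id below is an assumption; swap in the actual released checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/CT-LLM-SFT-DPO"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 2B-parameter model fits on a single GPU
    device_map="auto",
)

# Chinese prompt: "Please briefly introduce the Great Wall."
# (kept in Chinese, since the model is Chinese-centric)
prompt = "请简要介绍一下长城。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern works for the base and intermediate checkpoints the project releases, which is what makes the open training artifacts useful for studying training dynamics or continuing fine-tuning.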