Baichuan 3, a large language model with over a trillion parameters developed by Baichuan Intelligent, has demonstrated outstanding performance on multiple authoritative general-capability benchmarks, notably surpassing GPT-4 on Chinese tasks. It excels in natural language processing, code generation, and medical tasks. The model employs several innovative techniques to enhance its capabilities, including dynamic data selection, importance-preserving progressive initialization, and asynchronous checkpoint storage. During training, a dynamic data selection scheme based on causal sampling ensures data quality, while the importance-preserving progressive initialization method improves training stability. A series of optimizations to parallel training further yields a performance improvement of more than 30%.
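The article gives no implementation details for asynchronous checkpoint storage, so the following is only a minimal sketch of the general pattern the term refers to, assuming a PyTorch training loop: the model state is snapshotted to host memory and then written to disk from a background thread, so the training step is not blocked by checkpoint I/O. The function name `async_save` and the `ckpt_dir` argument are illustrative, not taken from Baichuan 3.

```python
import os
import threading
import torch


def async_save(model, optimizer, step, ckpt_dir="checkpoints"):
    """Write a checkpoint without blocking the training loop.

    Model weights are cloned to CPU memory up front (cheap relative to disk I/O)
    so that subsequent training steps cannot mutate the state being saved; the
    serialization itself happens in a background thread. In a real setup the
    optimizer state would be snapshotted the same way before the next update.
    """
    os.makedirs(ckpt_dir, exist_ok=True)
    state = {
        "step": step,
        "model": {k: v.detach().cpu().clone() for k, v in model.state_dict().items()},
        "optimizer": optimizer.state_dict(),
    }

    def _write():
        path = os.path.join(ckpt_dir, f"step_{step}.pt")
        torch.save(state, path)

    writer = threading.Thread(target=_write, daemon=True)
    writer.start()
    return writer  # caller may join() before exit to ensure the write completed
```

In a trillion-parameter setting this same idea would be layered onto the distributed training stack (sharded state, parallel writers); the sketch only shows the core overlap of checkpoint I/O with computation that makes the storage "asynchronous."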