XVERSE-MoE-A36B is a large multilingual language model independently developed by Shenzhen Yuanxiang Technology (XVERSE). It uses a Mixture-of-Experts (MoE) architecture with 255.4 billion total parameters, of which about 36 billion are active for any given token. The model supports more than 40 languages, including Chinese, English, Russian, and Spanish, and performs especially well in Chinese-English bilingual scenarios. It is trained on 8K-length samples, and a refined data-sampling ratio combined with a dynamic data-switching strategy maintains the quality and diversity of the training data. The implementation has also been custom-optimized for the MoE architecture, improving computational efficiency and overall throughput.
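The gap between total parameters (255.4B) and active parameters (36B) comes from sparse expert routing: each token is sent to only a few experts, so only those experts' weights participate in the forward pass. The toy sketch below illustrates the general top-k MoE routing idea with made-up dimensions and expert counts; it is not XVERSE's actual implementation, whose gating details and expert layout are not described in this document.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy top-k MoE layer: score all experts with a linear gate, run only
    the k highest-scoring ones, and mix their outputs with softmax weights."""
    logits = x @ gate_w                          # (d,) -> (n_experts,) gate scores
    top = np.argsort(logits)[-k:]                # indices of the k best experts
    scores = np.exp(logits[top] - logits[top].max())
    weights = scores / scores.sum()              # renormalize over selected experts
    # Only the selected experts execute, so only their parameters are "active".
    y = sum(w * (experts[i] @ x) for w, i in zip(weights, top))
    return y, top

rng = np.random.default_rng(0)
d, n_experts = 8, 6                              # illustrative sizes only
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

y, used = moe_forward(x, gate_w, experts, k=2)   # 2 of 6 experts run per token
```

In this sketch only 2 of 6 expert matrices touch each token, which is the same mechanism that lets a 255.4B-parameter model spend compute roughly equivalent to a 36B-parameter dense model per token.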