The article discusses the computational and storage challenges in artificial intelligence, particularly for large language models (LLMs). As model capabilities have grown, exemplified by the release of the BLOOM model, the compute and storage they require have increased sharply, driving up costs and limiting accessibility. To address this, researchers apply "quantization" techniques, which shrink model size and speed up inference by lowering the numerical precision of model weights and activations while keeping the resulting accuracy loss under control. The article also highlights a collaboration between Beihang University and SenseTime on developing such a tool.
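To make the idea concrete, here is a minimal sketch of per-tensor symmetric int8 quantization, one common form of the technique described above. The function names and the specific scheme are illustrative assumptions, not the API of the tool the article refers to:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float32 weights to int8 with a single per-tensor scale.

    Illustrative sketch: real quantization tools use per-channel scales,
    calibration data, and more sophisticated rounding.
    """
    # Scale so the largest-magnitude weight maps to +/-127;
    # the small epsilon guards against an all-zero tensor.
    scale = max(np.max(np.abs(w)), 1e-8) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from int8 values."""
    return q.astype(np.float32) * scale

# Example: a float32 weight matrix shrinks to a quarter of its size,
# at the cost of a small, bounded rounding error per weight.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))  # at most scale / 2
```

This illustrates the storage/accuracy trade-off the article describes: `q` occupies 1 byte per weight instead of 4, while the reconstruction error per weight is bounded by half the quantization step.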