Recently, the ModelScope community partnered with vLLM and FastChat to offer Chinese developers faster, more efficient LLM inference and deployment services. FastChat is an open platform for training, serving, and evaluating LLM-based chatbots; vLLM, developed by researchers from the University of California, Berkeley, Stanford University, and the University of California, San Diego, is a high-throughput LLM serving system. Developers can plug vLLM into FastChat as its inference engine to get high-throughput model inference, and through the two together they can quickly load models from ModelScope for inference.
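As a minimal sketch of how this fits together (assuming vLLM and the ModelScope SDK are installed, e.g. `pip install vllm modelscope`; the model ID `qwen/Qwen-7B-Chat` is only an illustrative example), setting the `VLLM_USE_MODELSCOPE` environment variable tells vLLM to pull model weights from ModelScope instead of Hugging Face:

```python
import os

# Tell vLLM to download the model from ModelScope rather than Hugging Face.
# This must be set before vllm is imported/initialized.
os.environ["VLLM_USE_MODELSCOPE"] = "True"

from vllm import LLM, SamplingParams

# Load a ModelScope-hosted model (example model ID; substitute your own).
llm = LLM(model="qwen/Qwen-7B-Chat", trust_remote_code=True)

# Basic sampling configuration for generation.
params = SamplingParams(temperature=0.7, max_tokens=128)

# Run batched offline inference over a list of prompts.
outputs = llm.generate(["Introduce the ModelScope community in one sentence."], params)
for output in outputs:
    print(output.outputs[0].text)
```

For serving through FastChat rather than the offline API, the same environment variable can be set before launching FastChat's vLLM worker (e.g. `python -m fastchat.serve.vllm_worker --model-path qwen/Qwen-7B-Chat`) alongside the FastChat controller and API server; the exact launch flags may vary by FastChat version.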