BitNet (bitnet.cpp) is Microsoft's official inference framework for 1-bit large language models (LLMs). It provides a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPUs, with NPU and GPU support planned. On ARM CPUs, BitNet achieves speedups of 1.37x to 5.07x while reducing energy consumption by 55.4% to 70.0%; on x86 CPUs, speedups range from 2.37x to 6.17x, with energy reductions of 71.9% to 82.2%. BitNet can also run a 100B-parameter BitNet b1.58 model on a single CPU at speeds comparable to human reading, opening up the possibility of running large language models entirely on local devices.
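The "1.58-bit" figure comes from the ternary weight format used by BitNet b1.58: each weight takes one of three values {-1, 0, +1}, so it carries log2(3) ≈ 1.58 bits of information. The sketch below is a simplified NumPy illustration of the absmean weight quantization described in the BitNet b1.58 paper, not code from the framework itself; the helper name `absmean_quantize` is hypothetical.

```python
# Illustrative sketch (not bitnet.cpp code): ternary quantization of weights
# to {-1, 0, +1}, i.e. log2(3) ≈ 1.58 bits of information per weight.
import numpy as np

def absmean_quantize(W: np.ndarray, eps: float = 1e-6):
    """Quantize a weight matrix to ternary values {-1, 0, +1}.

    gamma is the mean absolute value of W; each weight is scaled by
    1/gamma, rounded to the nearest integer, and clipped to [-1, 1].
    """
    gamma = np.abs(W).mean()                            # absmean scale
    W_ternary = np.clip(np.round(W / (gamma + eps)), -1, 1)
    return W_ternary, gamma

# Example: quantize a random weight matrix.
W = np.random.randn(4, 4)
W_q, gamma = absmean_quantize(W)
print(W_q)            # entries are -1.0, 0.0, or 1.0
print(np.log2(3))     # ≈ 1.585 bits per ternary weight
```

Because the quantized weights are ternary, matrix multiplication reduces to additions and subtractions with no weight multiplications, which is what makes the CPU speedups and energy savings above possible.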