BitNet
Inference framework for 1-bit large language models
BitNet is the official inference framework from Microsoft, designed specifically for 1-bit large language models (LLMs). It provides a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPUs (NPU and GPU support is planned). On ARM CPUs, BitNet achieves speedups of 1.37x to 5.07x while reducing energy consumption by 55.4% to 70.0%; on x86 CPUs, speedups range from 2.37x to 6.17x with energy reductions of 71.9% to 82.2%. BitNet can also run a 100B-parameter BitNet b1.58 model on a single CPU at speeds comparable to human reading, significantly expanding the potential for running large language models on local devices.
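The "1.58-bit" figure comes from BitNet b1.58's ternary weights: each weight takes one of three values {-1, 0, +1}, and log2(3) ≈ 1.58 bits. The sketch below illustrates the absmean quantization scheme described in the BitNet b1.58 paper, where weights are scaled by their mean absolute value, rounded, and clipped to the ternary set. The function name and the use of NumPy are illustrative assumptions, not part of the BitNet codebase.

```python
import numpy as np

def absmean_ternary_quantize(w, eps=1e-6):
    # Illustrative sketch of absmean quantization (not BitNet's actual code).
    # gamma: mean absolute value of the weight tensor, used as a per-tensor scale.
    gamma = np.abs(w).mean()
    # Scale, round to the nearest integer, then clip to the ternary set {-1, 0, +1}.
    q = np.clip(np.rint(w / (gamma + eps)), -1, 1)
    return q.astype(np.int8), gamma

w = np.array([[0.8, -0.05, 0.3],
              [-0.9, 0.02, -0.4]])
q, gamma = absmean_ternary_quantize(w)
# q contains only -1, 0, and +1; small weights collapse to 0,
# which is what enables the multiply-free kernels used for CPU inference.
```

Because the quantized matrix holds only -1, 0, and +1, a matrix multiply reduces to additions and subtractions, which is the main source of the CPU speed and energy gains described above.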
BitNet Traffic Statistics
Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Average Visit Duration: 00:06:42