BitNet

Inference framework for 1-bit large language models

BitNet is the official inference framework developed by Microsoft for 1-bit large language models (LLMs). It provides a set of optimized kernels that support fast and lossless inference of 1.58-bit models on CPUs (NPU and GPU support is planned). BitNet achieves speedups of 1.37x to 5.07x on ARM CPUs, with energy reductions of 55.4% to 70.0%; on x86 CPUs, speedups range from 2.37x to 6.17x, with energy reductions of 71.9% to 82.2%. It can also run a 100B-parameter BitNet b1.58 model on a single CPU at speeds close to human reading rate, expanding the possibilities for running large language models on local devices.
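The "1.58-bit" figure comes from the fact that each weight takes one of three values, {-1, 0, +1}, and a ternary symbol carries log2(3) ≈ 1.58 bits of information. The sketch below illustrates the idea with the absmean quantization scheme described for BitNet b1.58 (scale by the mean absolute weight, then round and clip to the ternary set); it is a minimal illustration of the quantization concept, not code from the BitNet framework itself.

```python
import math

def absmean_quantize(weights):
    """Ternary (1.58-bit) quantization sketch: scale weights by their
    mean absolute value, then round each to the nearest of {-1, 0, +1}."""
    gamma = sum(abs(w) for w in weights) / len(weights) or 1.0
    quantized = [max(-1, min(1, round(w / gamma))) for w in weights]
    return quantized, gamma

# A ternary weight carries log2(3) bits of information, hence "1.58-bit".
bits_per_weight = math.log2(3)
print(round(bits_per_weight, 2))  # 1.58

q, gamma = absmean_quantize([0.8, -1.3, 0.05, 2.1, -0.4])
print(q)  # every entry is -1, 0, or 1
```

At inference time, matrix multiplications against ternary weights reduce to additions and subtractions (plus one rescale by gamma), which is the main source of the CPU speedups and energy savings cited above.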

BitNet Visit Over Time

- Monthly Visits: 494,758,773
- Bounce Rate: 37.69%
- Pages per Visit: 5.7
- Visit Duration: 00:06:29
