Web LLM is a modular, customizable JavaScript package that brings language model chat directly into web browsers. Everything runs inside the browser, with no server support required, and is accelerated with WebGPU. This opens up opportunities to build AI assistants that protect user privacy, since inference stays in the browser, while still benefiting from GPU acceleration. The project is an offshoot of MLC LLM, which runs LLMs natively on iPhones and other local environments.
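
Because everything depends on WebGPU, a page may want to check that the browser exposes it before loading a model. The sketch below is one minimal way to do that, using only the standard `navigator.gpu.requestAdapter()` call. The helper name `hasWebGPU` and its `nav` parameter are assumptions made for illustration and are not part of Web LLM.

```javascript
// Hypothetical helper (not part of Web LLM): reports whether WebGPU looks usable.
// `nav` defaults to the global navigator; it is a parameter only so the check
// can also be exercised outside a browser.
async function hasWebGPU(nav = globalThis.navigator) {
  // navigator.gpu is defined only in browsers that expose the WebGPU API.
  if (!nav || !nav.gpu) return false;
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter is found.
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    // Treat any failure while probing as "WebGPU not available".
    return false;
  }
}
```

In a page, `hasWebGPU().then((ok) => { /* load the model, or show a fallback message */ })` would let the application degrade gracefully on browsers without WebGPU support.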