GPTQModel

Public

Production-ready LLM compression and quantization toolkit with hardware-accelerated inference support for both CPU and GPU via Hugging Face (HF), vLLM, and SGLang.

Created: 2024-06-17T14:45:30
Updated: 2025-03-27T10:23:06
https://x.com/ModelCloudAI
Stars: 419
Stars increase: 0

Related projects