SpS-SpecDec
SpS-SpecDec: a Python library that speeds up autoregressive language model inference with speculative decoding. Inspired by DeepMind's work on speculative sampling, it uses a small draft model to propose several tokens at a time and a larger target model to verify them. Reported speedups are 2-2.5x, with no drop in output quality.
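The draft-then-verify loop described above can be sketched as follows. This is a minimal toy illustration of the general speculative sampling accept/reject rule, not this library's API: the names `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical, and the "models" are plain functions that return a probability distribution over a four-token vocabulary. A real implementation would score all drafted positions with one batched forward pass of the target model, which is where the speedup comes from.

```python
import random
from typing import Callable, List, Sequence

Dist = List[float]                      # probabilities over the vocabulary
Model = Callable[[Sequence[int]], Dist]  # context -> next-token distribution

VOCAB = 4  # toy vocabulary size (assumption for this sketch)


def sample(dist: Dist, rng: random.Random) -> int:
    """Draw one token id from a (possibly unnormalized) distribution."""
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]


def speculative_step(target: Model, draft: Model, prefix: Sequence[int],
                     k: int, rng: random.Random) -> List[int]:
    """Propose k tokens with the draft model, then verify them with the target.

    Returns between 1 and k + 1 new tokens.
    """
    # 1. The draft model proposes k tokens autoregressively.
    ctx = list(prefix)
    drafted, q_dists = [], []
    for _ in range(k):
        q = draft(ctx)
        tok = sample(q, rng)
        drafted.append(tok)
        q_dists.append(q)
        ctx.append(tok)

    # 2. The target model scores every drafted position, plus one extra.
    #    (Done sequentially here; a real system batches this into one pass.)
    p_dists = [target(list(prefix) + drafted[:i]) for i in range(k + 1)]

    # 3. Accept each drafted token with probability min(1, p/q).
    out: List[int] = []
    for i, tok in enumerate(drafted):
        p, q = p_dists[i], q_dists[i]
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)
            continue
        # On rejection, resample from the residual max(0, p - q) and stop.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        out.append(sample(residual, rng))
        return out

    # 4. Every draft was accepted: take one bonus token from the target.
    out.append(sample(p_dists[k], rng))
    return out


def generate(target: Model, draft: Model, prompt: Sequence[int],
             max_new_tokens: int, k: int = 4, seed: int = 0) -> List[int]:
    """Repeat speculative steps until max_new_tokens have been produced."""
    rng = random.Random(seed)
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        tokens.extend(speculative_step(target, draft, tokens, k, rng))
    return tokens[: len(prompt) + max_new_tokens]


def toy_target(ctx: Sequence[int]) -> Dist:
    """Toy 'large' model: strongly prefers (last token + 1) mod VOCAB."""
    nxt = (ctx[-1] + 1) % VOCAB if ctx else 0
    return [0.7 if t == nxt else 0.1 for t in range(VOCAB)]


def toy_draft(ctx: Sequence[int]) -> Dist:
    """Toy 'small' model: same preference, but less confident."""
    nxt = (ctx[-1] + 1) % VOCAB if ctx else 0
    return [0.55 if t == nxt else 0.15 for t in range(VOCAB)]


print(generate(toy_target, toy_draft, prompt=[0], max_new_tokens=10))
```

The accept/reject rule in step 3, together with the residual resampling, is what keeps the output distribution identical to sampling from the target model alone. The closer the draft model's distribution is to the target's, the more drafted tokens are accepted per step.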