RouteLLM is a framework for serving and evaluating routers for large language models (LLMs). It routes each query to a stronger or a cheaper model based on cost and expected response quality, preserving quality while reducing cost. With its out-of-the-box routers, it has demonstrated cost reductions of up to 85% while maintaining 95% of GPT-4's performance on widely used benchmarks.
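
As a minimal sketch of how such routing is typically driven, the snippet below assumes RouteLLM exposes an OpenAI-compatible `Controller` interface; the router name (`mf`), the model identifiers, and the cost threshold encoded in the `model` string are illustrative assumptions and may differ in your setup.

```python
# Minimal sketch, assuming an OpenAI-compatible Controller interface in RouteLLM.
# Router name ("mf"), model identifiers, and the cost threshold are illustrative.
import os

from routellm.controller import Controller

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder API key

# The controller decides, per query, whether the strong or the weak model answers.
client = Controller(
    routers=["mf"],                      # router to load (assumed: matrix factorization)
    strong_model="gpt-4-1106-preview",   # expensive, high-quality model (assumed)
    weak_model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # cheaper fallback (assumed)
)

# The model string names the router and a cost threshold: queries the router
# scores above the threshold go to the strong model, the rest to the weak one.
response = client.chat.completions.create(
    model="router-mf-0.11593",           # threshold value is an example only
    messages=[{"role": "user", "content": "Explain model routing in one sentence."}],
)
print(response.choices[0].message.content)
```

Because the interface mirrors the OpenAI client, an existing application can adopt routing by swapping in the controller and choosing a threshold that matches its cost/quality target.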