RouteLLM is a framework for serving and evaluating LLM routers. It offers a drop-in replacement for OpenAI's client that routes simpler queries to cheaper models; according to the project, its trained routers reduce costs by up to 85% while maintaining high response quality. The framework is easy to extend with new routers and supports comparing router performance across multiple benchmarks. Users can set a cost threshold, calibrate it against the incoming query distribution, and route each request between a strong and a weak model. A lightweight OpenAI-compatible server and an evaluation framework round out the toolkit, making RouteLLM a practical way to balance cost and quality when deploying LLMs.
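The core idea of threshold-based routing can be sketched in a few lines. This is a simplified illustration, not RouteLLM's actual implementation: the scoring function here is hypothetical (RouteLLM uses trained routers such as a matrix-factorization model), and the model names are placeholders. The router assigns each query a score estimating how much it would benefit from the strong model; queries scoring above the calibrated cost threshold go to the strong model, the rest to the cheaper weak model.

```python
def route(query_score: float, threshold: float,
          strong_model: str = "strong-model",
          weak_model: str = "weak-model") -> str:
    """Threshold routing sketch: scores above the threshold indicate the
    query likely needs the strong model; everything else is sent to the
    cheaper weak model. Lowering the threshold shifts more traffic to the
    strong model (higher quality, higher cost); raising it saves cost."""
    return strong_model if query_score > threshold else weak_model


# Hypothetical scores for three queries (a real router would predict these).
scores = {"what is 2+2": 0.05, "summarize this email": 0.30,
          "prove this theorem": 0.90}
choices = {q: route(s, threshold=0.5) for q, s in scores.items()}
```

With a threshold of 0.5, only the hardest query is routed to the strong model; calibration in RouteLLM amounts to picking the threshold that sends a desired fraction of a representative query stream to the strong model.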
https://github.com/lm-sys/RouteLLM