
vLLM Semantic Router fleet simulator for capacity planning, SLO validation, and what-if analysis


vllm-sr-sim

vllm-sr-sim is the maintained fleet simulator for this repository. It sizes heterogeneous GPU fleets, evaluates routing strategies, and exposes a service mode that the dashboard can call across containers.

Repository-maintained docs now live on the project website.

Install

cd src/fleet-sim
pip install -e .

Install the service extras when you want to run the simulator API:

pip install -e ".[api]"

For local development and tests:

pip install -e ".[dev]"

CLI

vllm-sr-sim --version

vllm-sr-sim optimize \
  --cdf data/azure_cdf.json \
  --lam 200 --slo 500 --b-short 6144 \
  --verify-top 3 --n-sim-req 30000

vllm-sr-sim whatif \
  --cdf data/azure_cdf.json \
  --lam-range 50 100 200 500 1000 \
  --slo 500 --b-short 6144
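
The what-if sweep above can also be driven from a script. The following is a minimal sketch, not part of the package: `build_whatif_argv` is a hypothetical helper that only assembles the flags shown in the command above (`--cdf`, `--lam-range`, `--slo`, `--b-short`) and makes no assumption about any other option.

```python
import shlex
from typing import List, Sequence


def build_whatif_argv(
    cdf: str, lam_range: Sequence[int], slo: int, b_short: int
) -> List[str]:
    """Assemble the argument list for `vllm-sr-sim whatif`.

    Hypothetical helper: it only mirrors the flags used in the README example.
    """
    return [
        "vllm-sr-sim", "whatif",
        "--cdf", cdf,
        "--lam-range", *[str(lam) for lam in lam_range],
        "--slo", str(slo),
        "--b-short", str(b_short),
    ]


if __name__ == "__main__":
    argv = build_whatif_argv(
        "data/azure_cdf.json", [50, 100, 200, 500, 1000], slo=500, b_short=6144
    )
    # Print the shell-quoted command; pass argv to subprocess.run(argv, check=True)
    # to execute it once vllm-sr-sim is installed.
    print(shlex.join(argv))
```

Passing the list form to `subprocess.run` avoids shell quoting issues when the CDF path contains spaces.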

vllm-sr-sim serve --host 0.0.0.0 --port 8000

By default, vllm-sr serve also starts vllm-sr-sim as a sibling container on the shared runtime network, so the dashboard can proxy requests to it without rebuilding the router image.
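
Before pointing a dashboard or proxy at the simulator, it can help to confirm that something is listening on the chosen port. The sketch below is a plain TCP reachability check using only the Python standard library. It does not call any simulator endpoint, since the HTTP routes are not described here; `is_port_open` is a hypothetical helper, and `127.0.0.1:8000` assumes the service was started locally with `--port 8000` as in the example above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed address: a simulator started with `vllm-sr-sim serve --port 8000`.
    print(is_port_open("127.0.0.1", 8000))
```

A successful connection only shows that the port accepts TCP connections, not that the simulator API itself is healthy.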

Layout

  • fleet_sim/: simulation engine, optimizers, routing, hardware, workload, and service package
  • run_sim.py: unified CLI entrypoint used by vllm-sr-sim
  • tests/: simulator and service test coverage
  • data/: reference workload traces used by the examples and dashboard integration
  • examples/: sample scripts and multi-pool input files

Docs

Long-form simulator docs are maintained on the repository website. Keep the package README focused on installation, CLI usage, and source layout.
