vLLM Semantic Router fleet simulator for capacity planning, SLO validation, and what-if analysis

vllm-sr-sim

vllm-sr-sim is the maintained fleet simulator for this repository. It sizes heterogeneous GPU fleets, evaluates routing strategies, and exposes a service mode that the dashboard can call across containers.

Repository-maintained docs now live on the repository website.

Install

cd src/fleet-sim
pip install -e .

Install the service extras when you want to run the simulator API:

pip install -e ".[api]"

For local development and tests:

pip install -e ".[dev]"
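
To confirm that the install registered in the active environment, a short check along these lines may help. It is a minimal sketch that assumes the distribution is registered under the name vllm-sr-sim; the helper name installed_version is hypothetical and not part of the package.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "vllm-sr-sim") -> Optional[str]:
    """Return the installed version of dist_name, or None if it is absent.

    The default distribution name is an assumption, not confirmed by the README.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version or "vllm-sr-sim is not installed in this environment")
```

A None result usually means the command ran in a different virtual environment than the one used for pip install.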

CLI

vllm-sr-sim --version

vllm-sr-sim optimize \
  --cdf data/azure_cdf.json \
  --lam 200 --slo 500 --b-short 6144 \
  --verify-top 3 --n-sim-req 30000

vllm-sr-sim whatif \
  --cdf data/azure_cdf.json \
  --lam-range 50 100 200 500 1000 \
  --slo 500 --b-short 6144
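
When a sweep is driven from a script, the whatif invocation above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in this README (--cdf, --lam-range, --slo, --b-short); the helper name whatif_argv is hypothetical, and nothing here is executed against the simulator.

```python
import shlex
from typing import List, Sequence


def whatif_argv(cdf: str, lam_range: Sequence[int], slo: int, b_short: int) -> List[str]:
    """Build the argv for `vllm-sr-sim whatif` using only flags shown in the README."""
    if not lam_range:
        raise ValueError("lam_range must contain at least one value")
    return [
        "vllm-sr-sim", "whatif",
        "--cdf", cdf,
        "--lam-range", *(str(lam) for lam in lam_range),
        "--slo", str(slo),
        "--b-short", str(b_short),
    ]


if __name__ == "__main__":
    argv = whatif_argv("data/azure_cdf.json", [50, 100, 200, 500, 1000], 500, 6144)
    # Print a shell-safe rendering; pass argv to subprocess.run(argv, check=True) to execute it.
    print(shlex.join(argv))
```

Passing the list form to subprocess avoids shell quoting issues if the CDF path contains spaces.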

vllm-sr-sim serve --host 0.0.0.0 --port 8000

vllm-sr serve also starts vllm-sr-sim by default as a sibling container on the shared runtime network so the dashboard can proxy it without rebuilding the router image.
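
Before pointing a dashboard or script at the service, it can be useful to wait until the port accepts connections. The following is a minimal sketch using only the standard library; it checks TCP reachability and makes no assumption about the HTTP routes the service exposes, which this README does not list. The helper name wait_for_port is hypothetical, and port 8000 is taken from the serve example above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Return True once host:port accepts a TCP connection, False after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening; close it immediately.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


if __name__ == "__main__":
    ready = wait_for_port("127.0.0.1", 8000, timeout_s=5.0)
    print("simulator port is reachable" if ready else "simulator port did not open in time")
```

When the simulator runs as a sibling container, the host argument would be whatever name the runtime network assigns to that container rather than 127.0.0.1.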

Layout

  • fleet_sim/: simulation engine, optimizers, routing, hardware, workload, and service package
  • run_sim.py: unified CLI entrypoint used by vllm-sr-sim
  • tests/: simulator and service test coverage
  • data/: reference workload traces used by the examples and dashboard integration
  • examples/: sample scripts and multi-pool input files

Docs

Long-form simulator docs are maintained in the repository website. Keep the package README focused on installation, CLI usage, and source layout.
