Generate Shenron docker-compose deployments from model config files
Shenron
Shenron now ships as a config-driven generator for production LLM docker-compose deployments.
shenron reads a model config YAML and generates:
- docker-compose.yml
- .generated/onwards_config.json
- .generated/prometheus.yml
- .generated/scouter_reporter.env
- .generated/engine_start.sh
- .generated/engine_start_N.sh + .generated/sglangmux_start.sh when using models:
Quick Start
uv pip install shenron
shenron get
docker compose up -d
shenron get reads a per-release config index asset, shows available configs with arrow-key selection, downloads the chosen config, and generates deployment artifacts in the current directory. Using --release latest also rewrites shenron_version in the downloaded config to latest. You can also override config values on download with:
- --api-key (writes api_key)
- --scouter-api-key (writes scouter_ingest_api_key)
- --scouter-collector-instance (writes scouter_collector_instance; alias: --scouter-colector-instance)
By default, shenron get pulls release configs from doublewordai/shenron-configs.
shenron . still works and expects exactly one config YAML (*.yml or *.yaml) in the current directory, unless you pass a config file path directly.
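The "exactly one config YAML" rule can be sketched as follows. This is an illustrative approximation, not Shenron's actual implementation: the function name find_single_config is hypothetical, and whether Shenron skips generated files such as docker-compose.yml during discovery is not stated here, so this sketch does not exclude them.

```python
from pathlib import Path


def find_single_config(directory: str) -> Path:
    """Return the only *.yml / *.yaml file in `directory`, or raise ValueError.

    Hypothetical helper: mirrors the documented rule that `shenron .`
    expects exactly one config YAML in the current directory.
    """
    root = Path(directory)
    candidates = sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix in {".yml", ".yaml"}
    )
    if len(candidates) != 1:
        names = ", ".join(p.name for p in candidates) or "none"
        raise ValueError(
            f"expected exactly one config YAML in {root}, found: {names}"
        )
    return candidates[0]
```

Passing a config file path directly, as described above, avoids the discovery step entirely.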
Configs
Repo configs are stored in configs/.
Available starter configs:
- configs/Qwen06B-cu126-TP1.yml
- configs/Qwen06B-cu129-TP1.yml
- configs/Qwen06B-cu130-TP1.yml
- configs/Qwen30B-A3B-cu126-TP1.yml
- configs/Qwen30B-A3B-cu129-TP1.yml
- configs/Qwen30B-A3B-cu129-TP2.yml
- configs/Qwen30B-A3B-cu130-TP2.yml
- configs/Qwen235-A22B-cu129-TP2.yml
- configs/Qwen235-A22B-cu129-TP4.yml
- configs/Qwen235-A22B-cu130-TP2.yml
This file uses the same defaults that were previously hardcoded in docker/run_docker_compose.sh.
Engine selection and args:
- engine: vllm or sglang (default: vllm)
- vllm_args: vLLM CLI args appended after core settings. Use this for --gpu-memory-utilization, --scheduling-policy, --tool-call-parser, --override-generation-config, etc.
- sglang_args: SGLang CLI args appended after core settings. Use this for --tp, --dp, --ep, --enable-dp-attention, etc.
- sglang_use_cuda_ipc_transport: when true, exports SGLANG_USE_CUDA_IPC_TRANSPORT=1 before launching SGLang.
- models: optional per-model overrides for multi-model SGLang mux mode.
- sglangmux_listen_port, sglangmux_host, sglangmux_upstream_timeout_secs, sglangmux_model_ready_timeout_secs, sglangmux_model_switch_timeout_secs, sglangmux_log_dir: optional sglangmux settings (hyphenated aliases like sglangmux-listen-port are also accepted).
vllm_args and sglang_args accept YAML scalars (string/number/bool). If you need to pass a structured value (like --override-generation-config), provide a YAML mapping and it will be JSON-encoded.
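As a rough illustration of the scalar versus mapping behaviour described above, the sketch below turns already-parsed YAML values into CLI tokens. The function name render_cli_args, the compact JSON separators, and the lowercase rendering of booleans are assumptions made for this example; Shenron's real encoding may differ in those details.

```python
import json


def render_cli_args(args):
    """Turn parsed YAML arg values into a flat list of CLI tokens.

    Hypothetical helper: mappings are JSON-encoded, scalars are stringified.
    """
    tokens = []
    for value in args:
        if isinstance(value, dict):
            # Structured value, e.g. for --override-generation-config.
            tokens.append(json.dumps(value, separators=(",", ":")))
        elif isinstance(value, bool):
            # Assumption: booleans are rendered in lowercase.
            tokens.append("true" if value else "false")
        else:
            # Strings and numbers pass through as plain text.
            tokens.append(str(value))
    return tokens


tokens = render_cli_args(
    ["--override-generation-config", {"temperature": 0.7}, "--tp", 2]
)
# tokens == ['--override-generation-config', '{"temperature":0.7}', '--tp', '2']
```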
Single Config models: Schema (SGLang + sglangmux)
When models: is set, Shenron generates one engine launch script per model plus a mux launcher:
engine: sglang
sglangmux_listen_port: 8100
sglangmux_host: 0.0.0.0
sglangmux_upstream_timeout_secs: 120
sglangmux_model_ready_timeout_secs: 600
sglangmux_model_switch_timeout_secs: 120
sglangmux_log_dir: /tmp/sglangmux
models:
  - model_name: Qwen/Qwen3-0.6B
    vllm_port: 8001
    api_key: sk-model-a
    sglang_args: [--tp, 1]
  - model_name: Qwen/Qwen3-30B-A3B
    vllm_port: 8002
    api_key: sk-model-b
    sglang_use_cuda_ipc_transport: true
    sglang_args: [--tp, 2]
Rules in models: mode:
- engine must be sglang
- each models[*].model_name must be unique
- each models[*].vllm_port must be set and unique
- sglangmux_listen_port must be different from all model ports
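The rules above can be checked with a few lines of code. The following is a minimal sketch over an already-parsed config dict, not Shenron's own validator: the function name validate_models_mode and the error messages are hypothetical.

```python
def validate_models_mode(config: dict) -> list[str]:
    """Return a list of rule violations for a config that sets `models:`."""
    errors = []
    models = config.get("models") or []

    # Rule 1: the engine must be sglang (the documented default is vllm).
    if config.get("engine", "vllm") != "sglang":
        errors.append("engine must be sglang when models: is set")

    # Rule 2: model names must be unique.
    names = [m.get("model_name") for m in models]
    if len(set(names)) != len(names):
        errors.append("models[*].model_name must be unique")

    # Rule 3: every model needs a port, and ports must be unique.
    ports = [m.get("vllm_port") for m in models]
    if any(p is None for p in ports):
        errors.append("models[*].vllm_port must be set")
    elif len(set(ports)) != len(ports):
        errors.append("models[*].vllm_port must be unique")

    # Rule 4: the mux listen port must not collide with any model port.
    mux_port = config.get("sglangmux_listen_port")
    if mux_port is not None and mux_port in ports:
        errors.append("sglangmux_listen_port must differ from all model ports")

    return errors
```

An empty list means the config satisfies all four rules; each string otherwise names the rule that failed.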
In this mode, .generated/onwards_config.json contains one target per model and all target URLs point to http://vllm:<sglangmux_listen_port>/v1.
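To make the routing concrete, the sketch below computes the URL each model's target would point at in this mode. The real layout of onwards_config.json is not shown here, so the returned dict (model name to URL) is an assumed, simplified shape, and mux_target_urls is a hypothetical name.

```python
def mux_target_urls(config: dict) -> dict[str, str]:
    """Map each model name to the shared sglangmux URL.

    Every model resolves to the same base URL because requests go through
    the mux, not directly to the per-model engine ports.
    """
    base = f"http://vllm:{config['sglangmux_listen_port']}/v1"
    return {m["model_name"]: base for m in config["models"]}


urls = mux_target_urls(
    {
        "sglangmux_listen_port": 8100,
        "models": [
            {"model_name": "Qwen/Qwen3-0.6B", "vllm_port": 8001},
            {"model_name": "Qwen/Qwen3-30B-A3B", "vllm_port": 8002},
        ],
    }
)
# Both entries point at http://vllm:8100/v1, not at ports 8001 or 8002.
```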
Generated Compose Behavior
docker-compose.yml is fully rendered from config values:
- model image tag from shenron_version + cuda_version
- onwards image tag from onwards_version
- service ports from config
- no ${SHENRON_VERSION} placeholders
Development
# Run tests (Rust + CLI + compose checks)
./scripts/ci.sh
# Install local package for manual testing
python3 -m pip install -e .
# Generate from repo config
shenron configs/Qwen06B-cu126-TP1.yml --output-dir /tmp/shenron-test
Release Automation
- release-assets.yaml publishes stamped config files (*.yml) as release assets.
- release-assets.yaml also publishes configs-index.txt, which powers shenron get.
- release-assets.yaml mirrors *.yml + configs-index.txt into ${OWNER}/shenron-configs under the same tag as the main shenron release.
- Set CONFIGS_REPO_TOKEN (or reuse RELEASE_PLEASE_TOKEN) with write access to the configs repo release assets; optional repo variable CONFIGS_REPO overrides the default target (${OWNER}/shenron-configs).
- python-release.yaml builds/publishes the shenron package to PyPI on release tags.
- Docker image build/push via Depot remains in ci.yaml and still triggers when docker/Dockerfile.cu* or VERSION changes.
License
MIT, see LICENSE.