SGLang multiplexer with an OpenAI-compatible frontend

These details have not been verified by PyPI

Project description

sglangmux

sglangmux is a lightweight Rust multiplexer for running multiple SGLang model servers behind one OpenAI-compatible frontend.

It provides:

one frontend endpoint for chat/completions
automatic model activation/switching based on the request model
OpenAI-style /models and /v1/models listing
per-model process management with per-model stdout/stderr logs

Repository Layout

src/lib.rs: core multiplexer library (SgLangMux)
src/bin/sglangmuxd.rs: HTTP daemon frontend
examples/sglangmux-manual/: manual verification scripts for two models
tests/: integration tests

How It Works

You provide one launch script per model.
Each script must include:
- model identifier via either MODEL_NAME=<openai-model-id> or launch arg --model <openai-model-id> (or --model-path <openai-model-id>)
- local port via either PORT=<local-port> or launch arg --port <local-port>
sglangmuxd starts models (bootstrap), tracks active model state, and forwards requests to the correct upstream model server.
When the requested model differs from active model, the mux switches by pausing/sleeping current model and waking target model.

Requirements

Rust toolchain (for cargo run)
Python environment with sglang installed for your model launch scripts
GPU/runtime support required by your chosen SGLang models

Python Install (uv / pip)

The project ships a Python CLI wrapper that executes the Rust daemon binary.

After publishing to PyPI, usage is:

uv pip install sglangmux
sglangmux --help

For local install from this repository:

uv pip install .
sglangmux --help

Notes:

The wheel build runs cargo build --release --bin sglangmuxd.
Installing from source requires a working Rust toolchain.
The installed command is sglangmux, which forwards all args to sglangmuxd.

Quick Start

1. Prepare Python env for model scripts

uv venv --python /usr/bin/python3.10 .venv
uv pip install --python .venv/bin/python sglang

2. Start mux with example scripts

./examples/sglangmux-manual/start_sglangmux.sh

3. Send requests

./examples/sglangmux-manual/request_models.sh
./examples/sglangmux-manual/request_qwen.sh
./examples/sglangmux-manual/request_hf.sh

See examples/sglangmux-manual/README.md for detailed manual workflow.

Running `sglangmuxd` Directly

cargo run --bin sglangmuxd -- \
  --host 127.0.0.1 \
  --listen-port 30100 \
  --upstream-timeout-secs 120 \
  --model-ready-timeout-secs 120 \
  --model-switch-timeout-secs 60 \
  --log-dir sglangmux-logs \
  /path/to/model1.sh /path/to/model2.sh

CLI Options

--host: bind host for frontend daemon (default 127.0.0.1)
--listen-port: bind port for frontend daemon (default 30100)
--upstream-timeout-secs: timeout waiting for upstream model response (default 120)
--model-ready-timeout-secs: timeout while waiting for model process to become healthy (default 120)
--model-switch-timeout-secs: timeout waiting for model activation/switch for a pending request (default 60)
--log-dir: directory for per-model logs (default sglangmux-logs)

To expose externally:

--host 0.0.0.0

Frontend API

Implemented routes:

GET /health
GET /models
GET /v1/models
POST /v1/chat/completions
POST /v1/completions

Notes:

Requests must include a string model field.
For streaming (stream: true / SSE), sglangmuxd forwards the streaming payload through.

Model Launch Script Contract

Each script passed to sglangmuxd must define a model id and local port. The model id can come from MODEL_NAME or launch flags --model / --model-path, and the local port can come from PORT or launch flag --port:

MODEL_NAME="Qwen/Qwen3-0.6B"
PORT=30001

The daemon parses these values from script text and uses them to build model registry and routing map.

Timeouts and Failure Modes

upstream-timeout-secs: model server did not respond in time for completion request (returns 504)
model-ready-timeout-secs: model process did not become healthy during startup/bring-up
model-switch-timeout-secs: request waited too long for requested model to become active

Common frontend errors:

model not ready: ...: switch/startup issue
upstream request timed out: generation took longer than upstream timeout
invalid upstream response: upstream returned non-JSON where JSON expected (non-stream path)

Logging

Rust log filter is controlled by RUST_LOG.

Examples:

RUST_LOG=info ./examples/sglangmux-manual/start_sglangmux.sh
RUST_LOG=sglangmux=info,sglangmuxd=info,warn ./examples/sglangmux-manual/start_sglangmux.sh

Per-model stdout/stderr log files are written under --log-dir.

Graceful Shutdown

sglangmuxd listens for Ctrl+C and triggers model shutdown via mux cleanup logic before exit.

Development

Build:

cargo check --bin sglangmuxd

Test:

cargo test

Publishing to PyPI

Use the helper script:

scripts/publish_pypi.sh

Upload to TestPyPI:

scripts/publish_pypi.sh --testpypi

The script:

builds sdist + wheel (python -m build)
runs twine check on artifacts
uploads via twine upload

Project details

These details have not been verified by PyPI

Environment
- Console
Operating System
- OS Independent
Programming Language
- Python :: 3
- Rust
Topic
- Software Development :: Libraries

Release history Release notifications | RSS feed

This version

0.1.1

Feb 23, 2026

0.1.0

Feb 22, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sglangmux-0.1.1.tar.gz (33.6 kB view details)

Uploaded Feb 23, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

sglangmux-0.1.1-py3-none-any.whl (3.2 MB view details)

Uploaded Feb 23, 2026 Python 3

File details

Details for the file sglangmux-0.1.1.tar.gz.

File metadata

Download URL: sglangmux-0.1.1.tar.gz
Upload date: Feb 23, 2026
Size: 33.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.22

File hashes

Hashes for sglangmux-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`7f7a16d1f7b57e35693a43ac0061f882e8d1f4e1b94916f51d44af1d4ff511b4`
MD5	`5ad8b9c3fe2a214e742b99bb8a8db28a`
BLAKE2b-256	`d443afe25bc571cd94500fab01a4a7d7e177e3a9932d6d4acc29d86701b18b89`

See more details on using hashes here.

File details

Details for the file sglangmux-0.1.1-py3-none-any.whl.

File metadata

Download URL: sglangmux-0.1.1-py3-none-any.whl
Upload date: Feb 23, 2026
Size: 3.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.22

File hashes

Hashes for sglangmux-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`36abfb5000754ee30f4a1df5cf97ba1bbba768ba827668b51b05042e93a29cb5`
MD5	`d72d2a437c90bc4a1c96b758b090cc03`
BLAKE2b-256	`bc6e70ba9c5a4de3f12797c0a02ce484f01d94979d5d3e9d5ae854703ff1f3ae`

See more details on using hashes here.

sglangmux 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

sglangmux

Repository Layout

How It Works

Requirements

Python Install (uv / pip)

Quick Start

1. Prepare Python env for model scripts

2. Start mux with example scripts

3. Send requests

Running sglangmuxd Directly

CLI Options

Frontend API

Model Launch Script Contract

Timeouts and Failure Modes

Logging

Graceful Shutdown

Development

Publishing to PyPI

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Running `sglangmuxd` Directly