sglang HiCacheStorage backend for the MemKV context memory store
memkv-sglang
An sglang HiCacheStorage backend that persists prefix KV pages in a
remote MemKV cluster. It is loaded as a vendor plugin via sglang's built-in
dynamic storage backend dispatch, so no patches to sglang's tree are needed.
Build
cd sglang-plugin
pip install maturin
maturin develop --release # local dev install
# or
maturin build --release # wheel under target/wheels/
pip install target/wheels/memkv_sglang-*.whl
The wheel bundles a native PyO3 extension built from the same
memkv-client crate the NIXL plugin uses, so RDMA/TCP transport
selection works the same way.
Configure the MemKV connection
The plugin reads the standard MemKV config chain, first the MEMKV_CONFIG
YAML file and then the MEMKV_* env vars:
export MEMKV_SERVERS="10.0.0.10:9900,10.0.0.11:9900"
export MEMKV_RDMA_DEVICES="mlx5_0,mlx5_1"
export MEMKV_AUTH_KEY="<64-hex>"
# optional:
# export MEMKV_TRANSPORT=auto # rdma | tcp | auto (default)
# export MEMKV_CONFIG=/etc/memkv.yaml
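A quick sanity check of these variables before launching can save a failed start. The following is a minimal sketch, not the plugin's own parser: the helper names `parse_servers` and `check_env` are hypothetical, and the validation rules (a 64-character hex key, a transport of rdma, tcp, or auto) are assumptions inferred from the example values above.

```python
import re
from typing import Dict, List, Mapping, Tuple


def parse_servers(value: str) -> List[Tuple[str, int]]:
    """Split "host:port,host:port" into (host, port) pairs."""
    servers = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        servers.append((host, int(port)))
    if not servers:
        raise ValueError("MEMKV_SERVERS is empty")
    return servers


def check_env(env: Mapping[str, str]) -> Dict[str, object]:
    """Validate MEMKV_* values (rules here are assumptions, see lead-in)."""
    key = env.get("MEMKV_AUTH_KEY", "")
    if not re.fullmatch(r"[0-9a-fA-F]{64}", key):
        raise ValueError("MEMKV_AUTH_KEY must be 64 hex characters")
    transport = env.get("MEMKV_TRANSPORT", "auto")
    if transport not in ("rdma", "tcp", "auto"):
        raise ValueError(f"unknown MEMKV_TRANSPORT: {transport!r}")
    devices = [d.strip() for d in env.get("MEMKV_RDMA_DEVICES", "").split(",") if d.strip()]
    return {
        "servers": parse_servers(env.get("MEMKV_SERVERS", "")),
        "rdma_devices": devices,
        "transport": transport,
    }
```

In practice this could be called as `check_env(os.environ)` at the top of a launch script; it only inspects the environment and does not read the MEMKV_CONFIG file.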
Launch sglang against MemKV
Use sglang's dynamic storage backend, pointing it at the class in
this package:
python -m sglang.launch_server \
--model-path meta-llama/Llama-3-8B \
--enable-hierarchical-cache \
--hicache-storage-backend dynamic \
--hicache-storage-backend-extra-config '{
"backend_name": "memkv",
"module_path": "memkv_sglang.backend",
"class_name": "MemKVHiCacheStorage"
}'
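When the server is started from a script, building the extra-config JSON with `json.dumps` avoids shell quoting mistakes. This is a sketch under the assumption that the flags are exactly those shown in the command above; `build_launch_argv` is a hypothetical helper, not part of this package.

```python
import json
import shlex
import sys


def build_launch_argv(model_path: str) -> list:
    """Assemble the same command line as above, with the JSON generated."""
    extra_config = {
        "backend_name": "memkv",
        "module_path": "memkv_sglang.backend",
        "class_name": "MemKVHiCacheStorage",
    }
    return [
        sys.executable, "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--enable-hierarchical-cache",
        "--hicache-storage-backend", "dynamic",
        "--hicache-storage-backend-extra-config", json.dumps(extra_config),
    ]


argv = build_launch_argv("meta-llama/Llama-3-8B")
# Print a copy-pasteable, correctly quoted shell command.
print(shlex.join(argv))
# To actually launch: subprocess.run(argv, check=True)
```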
sglang's StorageBackendFactory._create_dynamic_backend imports the
class and constructs it as MemKVHiCacheStorage(storage_config, kwargs).
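The dispatch described above amounts to an import by name followed by a two-argument constructor call. The sketch below illustrates that pattern with a stand-in module so it runs without sglang installed; it is not sglang's actual code, and `create_dynamic_backend`, `FakeBackend`, and the `storage_config` contents are hypothetical.

```python
import importlib
import sys
import types


def create_dynamic_backend(extra_config, storage_config, **kwargs):
    """Import module_path, look up class_name, construct (storage_config, kwargs)."""
    module = importlib.import_module(extra_config["module_path"])
    backend_class = getattr(module, extra_config["class_name"])
    return backend_class(storage_config, kwargs)


class FakeBackend:
    """Stand-in for MemKVHiCacheStorage; only records its arguments."""

    def __init__(self, storage_config, kwargs):
        self.storage_config = storage_config
        self.kwargs = kwargs


# Register a throwaway module so import_module can resolve it by name.
fake_module = types.ModuleType("fake_memkv_backend")
fake_module.FakeBackend = FakeBackend
sys.modules["fake_memkv_backend"] = fake_module

backend = create_dynamic_backend(
    {
        "backend_name": "fake",
        "module_path": "fake_memkv_backend",
        "class_name": "FakeBackend",
    },
    storage_config={"tp_rank": 0},  # hypothetical contents
    page_size=64,                   # hypothetical keyword argument
)
```

A practical consequence of this pattern is that `module_path` must be importable in the server's Python environment, so the wheel has to be installed in the same environment that runs sglang.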
What's implemented
| Method | Status |
|---|---|
| get / batch_get | yes; RDMA zero-copy direct into the target tensor when eligible (Linux + CPU + contiguous), bytes path otherwise |
| set / batch_set | yes |
| exists / batch_exists | yes (router-aware, one batched RPC per server) |
| batch_exists_v2 | yes (per-pool hit policies) |
| batch_get_v2 | yes; RDMA zero-copy into the dummy flat page when eligible |
| batch_set_v2 | yes |
| clear | no-op (server manages retention) |
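Two ideas from the table can be sketched in a few lines: the zero-copy eligibility test (Linux + CPU + contiguous) and grouping keys so that each server receives one batched call. This is an illustration only. The function names are hypothetical, and the CRC32-modulo routing is an assumption chosen for simplicity; the real MemKV router may place keys differently.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Sequence


def zero_copy_eligible(platform: str, device_type: str, contiguous: bool) -> bool:
    """Mirror the stated conditions: Linux host, CPU tensor, contiguous memory."""
    return platform.startswith("linux") and device_type == "cpu" and contiguous


def group_keys_by_server(keys: Sequence[str], servers: Sequence[str]) -> Dict[str, List[str]]:
    """Bucket keys per server so each server needs only one batched RPC.

    Routing by CRC32 modulo the server count is an assumption, not MemKV's
    actual placement rule.
    """
    if not servers:
        raise ValueError("no servers configured")
    groups = defaultdict(list)
    for key in keys:
        index = zlib.crc32(key.encode("utf-8")) % len(servers)
        groups[servers[index]].append(key)
    return dict(groups)


servers = ["10.0.0.10:9900", "10.0.0.11:9900"]
batches = group_keys_by_server([f"page-{i}" for i in range(8)], servers)
for server, server_keys in batches.items():
    # One exists-style RPC per server would be issued here.
    print(server, len(server_keys))
```

With a PyTorch tensor `t`, the eligibility check would be called as `zero_copy_eligible(sys.platform, t.device.type, t.is_contiguous())`; tensors that fail it would take the bytes path.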
Layout
sglang-plugin/
├── Cargo.toml # cdylib + pyo3 + memkv-client
├── pyproject.toml # maturin
├── src/lib.rs # PyO3 wrapper around memkv-client::Engine
└── python/memkv_sglang/
├── __init__.py # re-exports Client
└── backend.py # MemKVHiCacheStorage(HiCacheStorage)
Built Distributions
File details
Details for the file memkv_sglang-1.0.0-cp38-abi3-manylinux_2_28_x86_64.whl.
File metadata
- Download URL: memkv_sglang-1.0.0-cp38-abi3-manylinux_2_28_x86_64.whl
- Upload date:
- Size: 1.5 MB
- Tags: CPython 3.8+, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f1495c926c77c4a080e4bafed5d703a32cb2133f77ed71b24aaa0662450f1ae1 |
| MD5 | 13f6743505fbc799938e2b0bf9e773bc |
| BLAKE2b-256 | 57b9610af047849593c4451604193a509a7b2aac20d45275a0b39442f847142f |
File details
Details for the file memkv_sglang-1.0.0-cp38-abi3-manylinux_2_28_aarch64.whl.
File metadata
- Download URL: memkv_sglang-1.0.0-cp38-abi3-manylinux_2_28_aarch64.whl
- Upload date:
- Size: 1.4 MB
- Tags: CPython 3.8+, manylinux: glibc 2.28+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b63437a3ecde1108d14414f538ab7ae443be3bdbc5d3c566d316e003fbfc21c8 |
| MD5 | 2eb97efb0f3cfd26edca25a89f905bb2 |
| BLAKE2b-256 | cafbdbfaa13b46af20013e19d70f36e370d8cf7c0c11b7aa2a9cff6d94cbb2e3 |