sglang HiCacheStorage backend for the MemKV context memory store

memkv-sglang

An sglang HiCacheStorage backend that persists prefix KV pages in a remote MemKV cluster. It loads as a vendor plugin through sglang's built-in dynamic storage backend dispatch, so sglang's tree needs no patches.

Build

cd sglang-plugin
pip install maturin
maturin develop --release      # local dev install
# or
maturin build --release        # wheel under target/wheels/
pip install target/wheels/memkv_sglang-*.whl

The wheel bundles a native PyO3 extension built from the same memkv-client crate the NIXL plugin uses, so RDMA/TCP transport selection works the same way.

Configure the MemKV connection

The plugin reads the standard MemKV config chain: the YAML file named by MEMKV_CONFIG first, then the MEMKV_* environment variables:

export MEMKV_SERVERS="10.0.0.10:9900,10.0.0.11:9900"
export MEMKV_RDMA_DEVICES="mlx5_0,mlx5_1"
export MEMKV_AUTH_KEY="<64-hex>"
# optional:
# export MEMKV_TRANSPORT=auto       # rdma | tcp | auto (default)
# export MEMKV_CONFIG=/etc/memkv.yaml
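Before launching, it can help to sanity-check the environment from Python. The sketch below is not the plugin's own parser (that lives in the native memkv-client crate); it is a hypothetical helper that assumes MEMKV_SERVERS is a comma-separated list of host:port pairs, that MEMKV_RDMA_DEVICES is a comma-separated list of device names, and that "<64-hex>" means exactly 64 hexadecimal characters. The names `MemKVEnvSettings` and `read_memkv_env` are made up for illustration.

```python
import os
import string
from dataclasses import dataclass, field


@dataclass
class MemKVEnvSettings:
    """Hypothetical container for values read from MEMKV_* variables."""
    servers: list = field(default_factory=list)       # [(host, port), ...]
    rdma_devices: list = field(default_factory=list)  # e.g. ["mlx5_0"]
    transport: str = "auto"
    auth_key_looks_valid: bool = False


def read_memkv_env(env=None):
    """Parse MEMKV_* variables from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env

    servers = []
    for item in env.get("MEMKV_SERVERS", "").split(","):
        item = item.strip()
        if not item:
            continue
        # Assumption: each entry is host:port with a numeric port.
        host, _, port = item.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        servers.append((host, int(port)))

    devices = [d.strip()
               for d in env.get("MEMKV_RDMA_DEVICES", "").split(",")
               if d.strip()]

    # The README lists rdma | tcp | auto, with auto as the default.
    transport = env.get("MEMKV_TRANSPORT", "auto").strip().lower()
    if transport not in ("rdma", "tcp", "auto"):
        raise ValueError(f"unknown MEMKV_TRANSPORT: {transport!r}")

    # Assumption: the key is exactly 64 hexadecimal characters.
    key = env.get("MEMKV_AUTH_KEY", "")
    key_ok = len(key) == 64 and all(c in string.hexdigits for c in key)

    return MemKVEnvSettings(servers, devices, transport, key_ok)
```

Calling `read_memkv_env()` with no argument inspects the current process environment; passing a plain dict makes the check easy to exercise in isolation.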

Launch sglang against MemKV

Use sglang's dynamic storage backend, pointing it at the class in this package:

python -m sglang.launch_server \
    --model-path meta-llama/Llama-3-8B \
    --enable-hierarchical-cache \
    --hicache-storage-backend dynamic \
    --hicache-storage-backend-extra-config '{
      "backend_name": "memkv",
      "module_path": "memkv_sglang.backend",
      "class_name": "MemKVHiCacheStorage"
    }'

sglang's StorageBackendFactory._create_dynamic_backend imports the class and constructs it as MemKVHiCacheStorage(storage_config, kwargs).
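The dispatch pattern described above can be sketched with the standard library alone. This is a rough illustration of import-by-name loading, not sglang's actual code: `load_backend_class`, the `demo_backend` module, and `DemoStorage` are hypothetical stand-ins so the snippet runs without sglang or memkv_sglang installed, and the values passed as `storage_config` and `kwargs` are placeholders.

```python
import importlib
import json
import sys
import types


def load_backend_class(extra_config):
    """Resolve a class from the module_path / class_name config keys."""
    module = importlib.import_module(extra_config["module_path"])
    return getattr(module, extra_config["class_name"])


class DemoStorage:
    """Stand-in backend with the two-positional-argument constructor
    shape the README describes: (storage_config, kwargs)."""

    def __init__(self, storage_config, kwargs):
        self.storage_config = storage_config
        self.kwargs = kwargs


# Register a throwaway module so import_module can find the stand-in.
demo_module = types.ModuleType("demo_backend")
demo_module.DemoStorage = DemoStorage
sys.modules["demo_backend"] = demo_module

# Same JSON shape as the --hicache-storage-backend-extra-config value.
extra_config = json.loads("""{
  "backend_name": "demo",
  "module_path": "demo_backend",
  "class_name": "DemoStorage"
}""")

backend_cls = load_backend_class(extra_config)
# Placeholder arguments; what sglang really passes is not shown here.
backend = backend_cls({"placeholder": True}, extra_config)
print(type(backend).__name__)
```

Because the class is resolved by name at startup, a typo in module_path or class_name surfaces as an ImportError or AttributeError when the server launches rather than at the first cache access.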

What's implemented

| Method | Status |
|---|---|
| get / batch_get | yes; RDMA zero-copy direct into the target tensor when eligible (Linux + CPU + contiguous), bytes path otherwise |
| set / batch_set | yes |
| exists / batch_exists | yes (router-aware, one batched RPC per server) |
| batch_exists_v2 | yes (per-pool hit policies) |
| batch_get_v2 | yes; RDMA zero-copy into the dummy flat page when eligible |
| batch_set_v2 | yes |
| clear | no-op (server manages retention) |
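The zero-copy eligibility rule for get / batch_get (Linux + CPU + contiguous) can be written as a small predicate. This is a minimal sketch under assumptions, not the backend's actual check: `zero_copy_eligible` and `FakeTensor` are hypothetical names, and the stand-in only mimics the `device.type` attribute and `is_contiguous()` method that PyTorch tensors expose, so the example runs without torch.

```python
import sys
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class FakeTensor:
    """Stand-in exposing the two tensor properties the predicate reads."""
    device: SimpleNamespace   # has a .type attribute such as "cpu" or "cuda"
    contiguous: bool

    def is_contiguous(self):
        return self.contiguous


def zero_copy_eligible(tensor, platform=sys.platform):
    """True when all three conditions from the status table hold."""
    on_linux = platform.startswith("linux")
    on_cpu = tensor.device.type == "cpu"
    return on_linux and on_cpu and tensor.is_contiguous()


cpu_page = FakeTensor(SimpleNamespace(type="cpu"), contiguous=True)
gpu_page = FakeTensor(SimpleNamespace(type="cuda"), contiguous=True)

for name, page in [("cpu_page", cpu_page), ("gpu_page", gpu_page)]:
    path = "zero-copy" if zero_copy_eligible(page, "linux") else "bytes"
    print(name, "->", path)
```

When any condition fails, the table says the backend falls back to the bytes path, so the predicate only selects between two transfer strategies and never rejects a request.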

Layout

sglang-plugin/
├── Cargo.toml                   # cdylib + pyo3 + memkv-client
├── pyproject.toml               # maturin
├── src/lib.rs                   # PyO3 wrapper around memkv-client::Engine
└── python/memkv_sglang/
    ├── __init__.py              # re-exports Client
    └── backend.py               # MemKVHiCacheStorage(HiCacheStorage)

Built distributions

No source distribution files are available for this release.

memkv_sglang-1.0.0-cp38-abi3-manylinux_2_28_x86_64.whl (1.5 MB): CPython 3.8+, manylinux (glibc 2.28+), x86-64
memkv_sglang-1.0.0-cp38-abi3-manylinux_2_28_aarch64.whl (1.4 MB): CPython 3.8+, manylinux (glibc 2.28+), ARM64

