sglang HiCacheStorage backend for the MemKV context memory store

memkv-sglang

An sglang HiCacheStorage backend that persists prefix KV pages in a remote MemKV cluster. It is loaded as a vendor plugin through sglang's built-in dynamic storage backend dispatch, so no patches to sglang's tree are required.

Build

cd sglang-plugin
pip install maturin
maturin develop --release      # local dev install
# or
maturin build --release        # wheel under target/wheels/
pip install target/wheels/memkv_sglang-*.whl

The wheel bundles a native PyO3 extension built from the same memkv-client crate the NIXL plugin uses, so RDMA/TCP transport selection works the same way.

Configure the MemKV connection

The plugin reads the standard MemKV config chain: the YAML file named by MEMKV_CONFIG first, then the MEMKV_* environment variables:

export MEMKV_SERVERS="10.0.0.10:9900,10.0.0.11:9900"
export MEMKV_RDMA_DEVICES="mlx5_0,mlx5_1"
export MEMKV_AUTH_KEY="<64-hex>"
# optional:
# export MEMKV_TRANSPORT=auto       # rdma | tcp | auto (default)
# export MEMKV_CONFIG=/etc/memkv.yaml
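Before launching, it can help to sanity-check the environment variables shown above. The following is a minimal sketch, not part of the plugin: the helper read_memkv_env and the MemKVEnv dataclass are hypothetical names, the 64-hex check on MEMKV_AUTH_KEY is inferred only from the "<64-hex>" placeholder, and values supplied through the MEMKV_CONFIG YAML file are not considered here.

```python
import os
import string
from dataclasses import dataclass


@dataclass
class MemKVEnv:
    """Hypothetical container for the MEMKV_* values shown above."""
    servers: list        # list of (host, port) tuples
    rdma_devices: list   # e.g. ["mlx5_0", "mlx5_1"]
    transport: str = "auto"


def read_memkv_env(env=None):
    """Parse and validate MEMKV_* variables (env vars only, no YAML)."""
    env = os.environ if env is None else env

    servers = []
    for entry in env.get("MEMKV_SERVERS", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"MEMKV_SERVERS entry {entry!r} is not host:port")
        servers.append((host, int(port)))
    if not servers:
        raise ValueError("MEMKV_SERVERS is empty or unset")

    devices = [d.strip()
               for d in env.get("MEMKV_RDMA_DEVICES", "").split(",")
               if d.strip()]

    transport = env.get("MEMKV_TRANSPORT", "auto")
    if transport not in {"rdma", "tcp", "auto"}:
        raise ValueError(f"MEMKV_TRANSPORT must be rdma, tcp, or auto, got {transport!r}")

    # Assumption: the key is 64 hex characters, per the "<64-hex>" placeholder.
    key = env.get("MEMKV_AUTH_KEY")
    if key is not None:
        if len(key) != 64 or any(c not in string.hexdigits for c in key):
            raise ValueError("MEMKV_AUTH_KEY does not look like 64 hex characters")

    return MemKVEnv(servers=servers, rdma_devices=devices, transport=transport)
```

Calling read_memkv_env() with no arguments reads os.environ; passing a plain dict makes the check easy to exercise in a test.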

Launch sglang against MemKV

Use sglang's dynamic storage backend, pointing it at the class in this package:

python -m sglang.launch_server \
    --model-path meta-llama/Meta-Llama-3-8B \
    --enable-hierarchical-cache \
    --hicache-storage-backend dynamic \
    --hicache-storage-backend-extra-config '{
      "backend_name": "memkv",
      "module_path": "memkv_sglang.backend",
      "class_name": "MemKVHiCacheStorage"
    }'

sglang's StorageBackendFactory._create_dynamic_backend imports the class and constructs it as MemKVHiCacheStorage(storage_config, kwargs).
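The import-and-construct step can be approximated with the sketch below. It illustrates the general pattern only and is not sglang's actual code: the function name load_dynamic_backend is hypothetical, and only the three config keys and the (storage_config, kwargs) call shape are taken from the text above.

```python
import importlib
import json


def load_dynamic_backend(extra_config_json, storage_config=None, **kwargs):
    """Import a backend class named in the extra config and construct it.

    Hypothetical stand-in for the factory's dynamic dispatch.
    """
    extra = json.loads(extra_config_json)

    # Fail early with a clear message if a required key is missing.
    for required in ("backend_name", "module_path", "class_name"):
        if required not in extra:
            raise KeyError(f"extra config is missing {required!r}")

    module = importlib.import_module(extra["module_path"])
    backend_cls = getattr(module, extra["class_name"])

    # Call shape as described above: kwargs is passed as a single
    # positional dict, not expanded into keyword arguments.
    return backend_cls(storage_config, kwargs)
```

Running this against the JSON from the launch command, in the environment where the wheel is installed, is a quick way to confirm that memkv_sglang.backend is importable before starting the server.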

What's implemented

| Method | Status |
| --- | --- |
| get / batch_get | yes; RDMA zero-copy directly into the target tensor when eligible (Linux + CPU + contiguous), bytes path otherwise |
| set / batch_set | yes |
| exists / batch_exists | yes (router-aware, one batched RPC per server) |
| batch_exists_v2 | yes (per-pool hit policies) |
| batch_get_v2 | yes; RDMA zero-copy into the dummy flat page when eligible |
| batch_set_v2 | yes |
| clear | no-op (server manages retention) |
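The eligibility rule for the zero-copy read path (Linux + CPU + contiguous) can be sketched as a small predicate. This is an illustration of the stated conditions only, not the plugin's code: the function names are hypothetical, and the real backend may apply further checks (for example, whether the RDMA transport was actually selected). The tensor argument is assumed to expose device.type and is_contiguous(), as torch.Tensor does.

```python
import sys


def zero_copy_eligible(tensor, platform=sys.platform):
    """Return True when all three stated conditions hold."""
    return (
        platform.startswith("linux")        # Linux host
        and tensor.device.type == "cpu"     # tensor lives in host memory
        and tensor.is_contiguous()          # one flat memory region
    )


def choose_get_path(tensor, platform=sys.platform):
    """Pick the read path: direct RDMA write into the tensor, or a bytes copy."""
    return "rdma_zero_copy" if zero_copy_eligible(tensor, platform) else "bytes"
```

A non-contiguous view (for example, a transposed tensor) would fall back to the bytes path under this rule; calling .contiguous() on it first yields a fresh buffer that would qualify.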

Layout

sglang-plugin/
├── Cargo.toml                   # cdylib + pyo3 + memkv-client
├── pyproject.toml               # maturin
├── src/lib.rs                   # PyO3 wrapper around memkv-client::Engine
└── python/memkv_sglang/
    ├── __init__.py              # re-exports Client
    └── backend.py               # MemKVHiCacheStorage(HiCacheStorage)

Built Distributions

memkv_sglang-1.0.1-cp38-abi3-manylinux_2_28_x86_64.whl (1.5 MB)
CPython 3.8+, manylinux (glibc 2.28+), x86-64

memkv_sglang-1.0.1-cp38-abi3-manylinux_2_28_aarch64.whl (1.4 MB)
CPython 3.8+, manylinux (glibc 2.28+), ARM64

Hashes for memkv_sglang-1.0.1-cp38-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 73c8ac960e72daffac08a9c11474bb4fc10553fea8340ac8b964918d183d0918
MD5 9a4504328f6b37a12065391feaf0461b
BLAKE2b-256 5667310a353b896785dbf8fd920fd6445a12ffad6a915892adf0de05175e80c6

Hashes for memkv_sglang-1.0.1-cp38-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 99a4dd1cd19c8f4ddb84c021d0b2f5ca650e341f273169c2a6948c3e862c106d
MD5 052a78dbbadf28f35d47bfc91e17327a
BLAKE2b-256 1d51545edd4016c82193fac3435bdc03ac1dc24a92e3447c27552e41fed7944b
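
A downloaded wheel can be checked against the SHA256 digests listed above with a few lines of standard-library Python. This is a generic sketch; the helper names sha256_of_file and verify_wheel are illustrative and not part of this package.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large wheels need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected_sha256):
    """Compare the file's digest with the published one (case-insensitive)."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

For example, verify_wheel("memkv_sglang-1.0.1-cp38-abi3-manylinux_2_28_x86_64.whl", "73c8ac960e72daffac08a9c11474bb4fc10553fea8340ac8b964918d183d0918") should return True for an intact x86-64 download.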
