ProcessFork plugin for vLLM ≥0.10 — paged-KV-cache snapshot/restore via batch-invariant kernels.

processfork-vllm

ProcessFork plugin for vLLM ≥0.10. Adds OpenAI-compatible extended endpoints for snapshot / fork / checkout that walk vLLM's paged KV cache via the batch-invariant kernel mode.

Install

pip install "processfork-vllm[vllm]"

Use

vllm serve meta-llama/Llama-3-8B \
  --enforce-deterministic \
  --plugin processfork

Then:

POST /v1/processfork/snapshot       { "name": "..." }
  → { "cid": "sha256:..." }

POST /v1/processfork/fork           { "cid": "...", "n": 12 }
  → { "cids": ["sha256:..."] }

POST /v1/processfork/checkout       { "cid": "..." }
  → { "ok": true }
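A minimal client sketch for the three endpoints above, using only the Python standard library. The base URL (`http://localhost:8000`), the helper names `build_request`, `call`, and `snapshot_fork_checkout`, and the snapshot name are assumptions for illustration and are not part of the plugin. The paths and JSON bodies are the ones listed above.

```python
import json
import urllib.error
import urllib.request

# Assumption: the server listens on vLLM's usual local address.
BASE_URL = "http://localhost:8000"


def build_request(op: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST for /v1/processfork/<op> (snapshot, fork, or checkout)."""
    return urllib.request.Request(
        f"{base_url}/v1/processfork/{op}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(op: str, payload: dict, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON reply; report 501 explicitly."""
    try:
        with urllib.request.urlopen(build_request(op, payload, base_url)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 501:
            # The Status section notes the HTTP surface may answer 501.
            raise NotImplementedError("processfork endpoint returned 501") from err
        raise


def snapshot_fork_checkout(name: str, n: int) -> list:
    """Hypothetical flow: snapshot, fork n ways, then check out the first fork."""
    snap = call("snapshot", {"name": name})
    forks = call("fork", {"cid": snap["cid"], "n": n})
    call("checkout", {"cid": forks["cids"][0]})
    return forks["cids"]
```

With a server running, `snapshot_fork_checkout("before-tool-call", 12)` would return the fork CIDs. `build_request` is kept separate so the request shape can be inspected without touching the network.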

Bit-exact restore requires --enforce-deterministic (stable since vLLM 0.10). Without it, restored logits match the originals only to within 1e-4.
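The two guarantees can be checked along these lines. This is a sketch that assumes the 1e-4 bound is an absolute per-logit difference and that logits are available as plain Python floats. The function name `restore_matches` is hypothetical.

```python
import struct


def restore_matches(original, restored, bit_exact, tol=1e-4):
    """Compare two logit vectors.

    bit_exact=True  -> every value must have an identical byte pattern.
    bit_exact=False -> every value must differ by at most tol (assumed absolute).
    """
    if len(original) != len(restored):
        return False
    if bit_exact:
        # Pack as IEEE-754 doubles so that 0.0 and -0.0 count as different.
        fmt = f"<{len(original)}d"
        return struct.pack(fmt, *original) == struct.pack(fmt, *restored)
    return all(abs(a - b) <= tol for a, b in zip(original, restored))
```

A restore taken with --enforce-deterministic should pass with `bit_exact=True`; one taken without it is only expected to pass with `bit_exact=False`.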

The wire format matches the paged-batchinvariant-v1 format described in agent_docs/cache-layer.md. K and V pages are content-addressed independently, so a fork that mutates only V (a one-token decode) shares its K page with its siblings.
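The sharing behaviour follows from content addressing and can be illustrated with a toy store. This is a sketch only: the `PageStore` class, the page contents, and the dictionary layout are made up for illustration and do not reflect the actual paged-batchinvariant-v1 layout. The `sha256:` prefix mirrors the CIDs shown in the endpoint responses.

```python
import hashlib


def page_cid(page: bytes) -> str:
    """Content ID of one page: the SHA-256 of its bytes."""
    return "sha256:" + hashlib.sha256(page).hexdigest()


class PageStore:
    """Toy content-addressed store: identical pages are stored once."""

    def __init__(self):
        self.blobs: dict[str, bytes] = {}

    def put(self, page: bytes) -> str:
        cid = page_cid(page)
        self.blobs.setdefault(cid, page)  # no-op if the page is already stored
        return cid


store = PageStore()
k_page = bytes(range(16))          # stand-in for a K page
v_parent = bytes(16)               # stand-in for the parent's V page
v_child = bytes([1]) + bytes(15)   # the fork changed only V

# K and V are hashed separately, so each snapshot records two CIDs per page pair.
parent = {"k": store.put(k_page), "v": store.put(v_parent)}
child = {"k": store.put(k_page), "v": store.put(v_child)}
```

Here `parent["k"] == child["k"]`, so the store holds three blobs rather than four: the K page is shared and only the changed V page is stored again.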

Status

The trait surface and the paged-batchinvariant-v1 wire format are stable. The live FFI shim into vllm.worker.cache_engine lands in v1.0.1. Until then, the plugin's HTTP surface returns 501 Not Implemented with a clear pointer to this README.
