ProcessFork plugin for vLLM ≥0.10 — paged-KV-cache snapshot/restore via batch-invariant kernels.

processfork-vllm

ProcessFork plugin for vLLM ≥0.10. Adds OpenAI-compatible extended endpoints for snapshot / fork / checkout that walk vLLM's paged KV cache via the batch-invariant kernel mode.

Install

pip install "processfork-vllm[vllm]"

Use

vllm serve meta-llama/Meta-Llama-3-8B \
  --enforce-deterministic \
  --plugin processfork

Then:

POST /v1/processfork/snapshot       { "name": "..." }
  → { "cid": "sha256:..." }

POST /v1/processfork/fork           { "cid": "...", "n": 12 }
  → { "cids": ["sha256:..."] }

POST /v1/processfork/checkout       { "cid": "..." }
  → { "ok": true }
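
A minimal client sketch for the three endpoints above, using only the Python standard library. The base URL assumes vLLM's default port 8000, and the helper names `build_request` and `call` are hypothetical; only the endpoint paths and the JSON fields come from the listing above.

```python
import json
import urllib.request

# Assumption: the server was started with `vllm serve` on its default port.
BASE_URL = "http://localhost:8000"


def build_request(endpoint: str, payload: dict,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for a /v1/processfork/<endpoint> route."""
    url = f"{base_url}/v1/processfork/{endpoint}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(endpoint: str, payload: dict, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response body.

    Raises urllib.error.HTTPError on non-2xx responses.
    """
    with urllib.request.urlopen(build_request(endpoint, payload, base_url)) as resp:
        return json.load(resp)


# Example flow against a running server (not executed here):
#   snap = call("snapshot", {"name": "before-tool-call"})
#   forks = call("fork", {"cid": snap["cid"], "n": 12})
#   call("checkout", {"cid": forks["cids"][0]})
```

Keeping request construction separate from the network call makes the payload shape easy to inspect without a live server.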

Bit-exact restore requires --enforce-deterministic (stable since vLLM 0.10). Without it, restored logits agree with the originals only to within 1e-4.
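
One way to check which of the two regimes a restore falls into is to compare logits captured before the snapshot with logits produced after checkout. The sketch below is not part of the plugin: the function name `classify_restore` is hypothetical, and it assumes the 1e-4 bound is an absolute per-element difference.

```python
import struct


def classify_restore(original, restored, tol=1e-4):
    """Classify a restore as 'bit-exact', 'within-tolerance', or 'mismatch'.

    `original` and `restored` are equal-length sequences of Python floats.
    Assumption: `tol` is an absolute bound on the per-element difference.
    """
    if len(original) != len(restored):
        raise ValueError("logit vectors must have the same length")

    # Compare the raw IEEE-754 encodings so that only identical bit
    # patterns count as bit-exact (0.0 and -0.0 are treated as different).
    fmt = f"<{len(original)}d"
    if struct.pack(fmt, *original) == struct.pack(fmt, *restored):
        return "bit-exact"

    max_diff = max(abs(a - b) for a, b in zip(original, restored))
    return "within-tolerance" if max_diff <= tol else "mismatch"
```

With --enforce-deterministic the expected result is "bit-exact"; without it, "within-tolerance" is the expected outcome.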

The wire format is paged-batchinvariant-v1, as specified in agent_docs/cache-layer.md. K and V pages are content-addressed independently, so a fork that mutates only V (one-token decode) shares its K page with siblings.
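
The sharing behaviour can be illustrated with a toy content-addressed store. This is a sketch of the general idea only, not the paged-batchinvariant-v1 format: hashing the raw page bytes with SHA-256 is an assumption, and `page_cid`, `put_block`, and `store` are hypothetical names.

```python
import hashlib

# Toy page store mapping content id -> page bytes. Identical pages collapse
# onto the same key, which is what makes sharing between forks possible.
store: dict[str, bytes] = {}


def page_cid(page: bytes) -> str:
    """Content id of one page (assumption: SHA-256 over the raw bytes)."""
    return "sha256:" + hashlib.sha256(page).hexdigest()


def put_block(k_page: bytes, v_page: bytes) -> dict:
    """Store K and V pages under independent content ids."""
    cids = {"k": page_cid(k_page), "v": page_cid(v_page)}
    store[cids["k"]] = k_page
    store[cids["v"]] = v_page
    return cids


# Parent block: one K page and one V page of dummy data.
k_page = b"\x01" * 64
v_page = b"\x02" * 64
parent = put_block(k_page, v_page)

# A fork that changes only V: same K bytes, one byte of V differs.
forked_v = bytearray(v_page)
forked_v[-1] = 0xFF
child = put_block(k_page, bytes(forked_v))

# Two blocks were written, but the store holds three pages rather than
# four, because the unchanged K page is shared.
shared_k = parent["k"] == child["k"]
```

Had K and V been hashed together as one unit, any change to V would have produced a new id for the whole block and the K bytes would have been stored twice.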

Status

The trait surface and the paged-batchinvariant-v1 wire format are stable. The live FFI shim into vllm.worker.cache_engine lands in v1.0.1. Until then, the plugin's HTTP surface returns 501 Not Implemented with a clear pointer to this README.

Download files

Source Distribution

processfork_vllm-1.0.3.tar.gz (9.4 kB)

Built Distribution

processfork_vllm-1.0.3-py3-none-any.whl (8.1 kB)

File details

Details for the file processfork_vllm-1.0.3.tar.gz.

File metadata

  • Download URL: processfork_vllm-1.0.3.tar.gz
  • Size: 9.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for processfork_vllm-1.0.3.tar.gz
Algorithm    Hash digest
SHA256       4254fbdca7e64a3609063f9e58b8eeeb18f0da2b7d3e434c868b91cd97f1ed8f
MD5          0c31d05030b57c4e4c3f3e8ef5e32323
BLAKE2b-256  d329bb8c95611e2451c776f9c45855c66da64a07f3a8d4bc3fb6162a0ec16dc7

Provenance

The following attestation bundles were made for processfork_vllm-1.0.3.tar.gz:

Publisher: release.yml on manav8498/processfork

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file processfork_vllm-1.0.3-py3-none-any.whl.

File hashes

Hashes for processfork_vllm-1.0.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       75c54bff77186167f4a5adf7d413d38baf6608824e6603baed2c6406c3c59c8f
MD5          7fe584b46baaf939c5cff0a8faa5e900
BLAKE2b-256  3cd361cbcdf55d2f73d2578670371d4cd578832fa76ad46c15e2670e29ace053

Provenance

The following attestation bundles were made for processfork_vllm-1.0.3-py3-none-any.whl:

Publisher: release.yml on manav8498/processfork
