ProcessFork plugin for vLLM ≥0.10 — paged-KV-cache snapshot/restore via batch-invariant kernels.

Project description

processfork-vllm

ProcessFork plugin for vLLM ≥0.10. Adds OpenAI-compatible extended endpoints for snapshot / fork / checkout that walk vLLM's paged KV cache via the batch-invariant kernel mode.

Install

pip install "processfork-vllm[vllm]"

Use

vllm serve meta-llama/Meta-Llama-3-8B \
  --enforce-deterministic \
  --plugin processfork

Then:

POST /v1/processfork/snapshot       { "name": "..." }
  → { "cid": "sha256:..." }

POST /v1/processfork/fork           { "cid": "...", "n": 12 }
  → { "cids": ["sha256:..."] }

POST /v1/processfork/checkout       { "cid": "..." }
  → { "ok": true }
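
A minimal client sketch for the three endpoints above, using only the Python standard library. The base URL (vLLM's default port 8000), the helper names build_request and call, and the choice to map 501 to NotImplementedError are assumptions made for illustration; only the paths and JSON bodies come from this README.

```python
import json
import urllib.error
import urllib.request

# Assumption: the server listens on vLLM's default port on localhost.
BASE_URL = "http://localhost:8000"


def build_request(endpoint, payload, base_url=BASE_URL):
    """Build a JSON POST request for /v1/processfork/<endpoint>."""
    return urllib.request.Request(
        url=f"{base_url}/v1/processfork/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(endpoint, payload, base_url=BASE_URL, opener=urllib.request.urlopen):
    """Send the request and decode the JSON response body.

    `opener` is injectable so the function can be exercised without a server.
    """
    request = build_request(endpoint, payload, base_url)
    try:
        with opener(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 501:
            # The Status section says the HTTP surface may answer 501.
            raise NotImplementedError(
                f"/v1/processfork/{endpoint} is not implemented by this build"
            ) from err
        raise
```

Against a running server, a round trip would look like call("snapshot", {"name": "before-tool-call"}), then call("fork", {"cid": cid, "n": 12}), then call("checkout", {"cid": cid}), where cid is the value returned by the snapshot call.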

Bit-exact restore requires --enforce-deterministic (stable since vLLM 0.10). Without it, restored logits stay within 1e-4 of the originals.
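
One way to check which of the two guarantees a restore actually met is to compare logits captured before the snapshot with logits captured after checkout. The sketch below assumes float32 logits and reads the 1e-4 bound as an absolute difference; the README states neither, and the function names are hypothetical.

```python
import struct


def bit_exact(original, restored):
    """True when both sequences have identical float32 bit patterns."""
    if len(original) != len(restored):
        return False
    fmt = f"{len(original)}f"
    return struct.pack(fmt, *original) == struct.pack(fmt, *restored)


def classify_restore(original, restored, tol=1e-4):
    """Return 'bit-exact', 'within-tolerance', or 'mismatch'."""
    if len(original) != len(restored):
        raise ValueError("logit vectors must have the same length")
    if bit_exact(original, restored):
        return "bit-exact"
    # Assumption: the tolerance is an absolute per-element bound.
    worst = max(abs(a - b) for a, b in zip(original, restored))
    return "within-tolerance" if worst <= tol else "mismatch"
```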

The wire format matches paged-batchinvariant-v1 as described in agent_docs/cache-layer.md. K and V pages are content-addressed independently, so a fork that mutates only V (one-token decode) shares its K page with siblings.
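
The sharing behaviour described above can be illustrated with a toy content-addressed store. This is not the paged-batchinvariant-v1 format, whose layout is not given here; PageStore, page_cid, and snapshot_block are hypothetical names, and SHA-256 is assumed only because the returned CIDs carry a sha256: prefix.

```python
import hashlib


def page_cid(page_bytes):
    """Content ID for one page: a digest of its raw bytes."""
    return "sha256:" + hashlib.sha256(page_bytes).hexdigest()


class PageStore:
    """Toy blob store keyed by content ID; identical pages are stored once."""

    def __init__(self):
        self.blobs = {}

    def put(self, page_bytes):
        cid = page_cid(page_bytes)
        self.blobs.setdefault(cid, page_bytes)
        return cid


def snapshot_block(store, k_page, v_page):
    """Address the K and V pages of one cache block independently."""
    return {"k": store.put(k_page), "v": store.put(v_page)}


store = PageStore()
k_page = bytes(16)                    # stand-in for a K page
v_page = bytes(range(16))             # stand-in for a V page
parent = snapshot_block(store, k_page, v_page)

# A fork that changes only V re-uses the parent's K blob.
child = snapshot_block(store, k_page, v_page[:-1] + b"\xff")
```

Here parent and child reference the same K content ID but different V content IDs, so the store holds three blobs rather than four.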

Status

The trait surface and the paged-batchinvariant-v1 wire format are stable. The live FFI shim into vllm.worker.cache_engine lands in v1.0.1. Until then, the plugin's HTTP surface returns 501 Not Implemented with a clear pointer to this README.

Download files

Source Distribution

processfork_vllm-1.0.2.tar.gz (8.2 kB)

Uploaded Source

Built Distribution

processfork_vllm-1.0.2-py3-none-any.whl (7.6 kB)

Uploaded Python 3

File details

Details for the file processfork_vllm-1.0.2.tar.gz.

File metadata

  • Download URL: processfork_vllm-1.0.2.tar.gz
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for processfork_vllm-1.0.2.tar.gz
Algorithm    Hash digest
SHA256       23cda64f057eb2dc68f997dd6fc0080c9aaeb61d63af45b13247abae2ae4fbef
MD5          7393aa4550108df354a39974d1bb8845
BLAKE2b-256  68d3f3a9bd3cbb0047d716f0ff139f4cfbf1988c123b6fc3029b4ae72968a9ba
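
A downloaded archive can be checked against the SHA256 digest listed above before installing it. The sketch below is one way to do that with the standard library; the function names and the local file path in the usage note are hypothetical.

```python
import hashlib
import hmac

# SHA256 digest listed above for processfork_vllm-1.0.2.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "23cda64f057eb2dc68f997dd6fc0080c9aaeb61d63af45b13247abae2ae4fbef"
)


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex=EXPECTED_SDIST_SHA256):
    """True when the file's SHA256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

For example, verify_download("processfork_vllm-1.0.2.tar.gz") should return True for an unmodified copy of the source distribution. pip can enforce the same check at install time through its hash-checking mode (--require-hashes with a requirements file).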

Provenance

The following attestation bundles were made for processfork_vllm-1.0.2.tar.gz:

Publisher: release.yml on manav8498/processfork

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file processfork_vllm-1.0.2-py3-none-any.whl.

File hashes

Hashes for processfork_vllm-1.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       2e779600ba467035e889799009d7ec6129e9eaf738fc49306f15ebfe3e88ab7c
MD5          1e4fce519d67d1323f8bf2fd0b17442f
BLAKE2b-256  683f2da3a7652660ac6b21d324297192c770fbcae3b79fbc3022acd28eb9b685

Provenance

The following attestation bundles were made for processfork_vllm-1.0.2-py3-none-any.whl:

Publisher: release.yml on manav8498/processfork
