Out-of-tree vLLM KVConnector for SemBlend semantic KV donor discovery

SemBlend vLLM Connector

vLLM KVConnector for SemBlend-backed semantic KV donor discovery.

This repo is the open-source adapter layer between vLLM and SemBlend.

SemBlend is a semantic KV reuse research library. It exists to evaluate when similar prompts may safely reuse or blend previously computed KV state. This connector exposes that work through vLLM's KVConnectorBase_V1 lifecycle.

Status

Experimental.

Default behavior is discovery-only:

exact vLLM prefix caching remains authoritative;
semantic lookup runs only after exact prefix coverage is insufficient;
the connector records donor hits, misses, and rejection reasons;
it returns (0, False) from get_num_new_matched_tokens(), meaning no externally loadable tokens and no asynchronous load, unless a future materialization mode can prove that the KV can be loaded safely;
normal vLLM execution continues on every provider error or unsupported case.
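The discovery-only behavior listed above can be sketched as a standalone function. This is a minimal illustration, not the connector's actual code: `DiscoveryStats`, `discovery_only_matched_tokens`, the rejection-reason strings, and the `sufficient_exact_ratio` threshold are all hypothetical names and values, and the provider is modeled as a plain callable that returns a similarity score or `None`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass
class DiscoveryStats:
    """Telemetry counters (hypothetical field names)."""
    hits: int = 0
    misses: int = 0
    rejections: Dict[str, int] = field(default_factory=dict)

    def reject(self, reason: str) -> None:
        self.rejections[reason] = self.rejections.get(reason, 0) + 1


def discovery_only_matched_tokens(
    prompt_tokens: Sequence[int],
    num_exact_prefix_tokens: int,
    find_donor_similarity: Callable[[Sequence[int]], Optional[float]],
    stats: DiscoveryStats,
    min_prompt_tokens: int = 256,
    min_similarity: float = 0.70,
    sufficient_exact_ratio: float = 0.5,  # assumption: not a documented setting
) -> Tuple[int, bool]:
    """Record what a semantic lookup finds, but never claim loadable KV."""
    no_new_tokens = (0, False)
    num_tokens = len(prompt_tokens)

    if num_tokens < min_prompt_tokens:
        stats.reject("prompt_too_short")
        return no_new_tokens

    # Exact prefix caching stays authoritative: skip the semantic lookup
    # when the exact cache already covers enough of the prompt.
    if num_exact_prefix_tokens >= sufficient_exact_ratio * num_tokens:
        return no_new_tokens

    try:
        similarity = find_donor_similarity(prompt_tokens)
    except Exception:
        # Provider failures must never fail inference.
        stats.reject("provider_error")
        return no_new_tokens

    if similarity is None:
        stats.misses += 1
    elif similarity < min_similarity:
        stats.reject("below_min_similarity")
    else:
        stats.hits += 1

    # Discovery-only: the result is telemetry, not a KV load.
    return no_new_tokens
```

Every branch returns the same `(0, False)` tuple, so the scheduler sees no difference between a hit, a miss, and a provider failure; only the counters change.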

Install

From PyPI:

pip install "semblend-vllm-connector[semblend]"

Development:

pip install -e ".[semblend,dev]"

Run local checks:

make check

vLLM Configuration

Discovery-only mode:

vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --enable-prefix-caching \
  --kv-transfer-config '{
    "kv_connector": "SemBlendVllmConnector",
    "kv_connector_module_path": "semblend_vllm_connector.connector",
    "kv_role": "kv_both",
    "kv_load_failure_policy": "recompute",
    "kv_connector_extra_config": {
      "mode": "discovery_only",
      "provider": "local",
      "min_prompt_tokens": 256,
      "min_similarity": 0.70
    }
  }'

SemBlend provider mode:

{
  "kv_connector": "SemBlendVllmConnector",
  "kv_connector_module_path": "semblend_vllm_connector.connector",
  "kv_role": "kv_both",
  "kv_load_failure_policy": "recompute",
  "kv_connector_extra_config": {
    "mode": "discovery_only",
    "provider": "semblend",
    "min_prompt_tokens": 256,
    "min_similarity": 0.70,
    "min_reuse_ratio": 0.50,
    "embedder_type": "minilm",
    "model_id": "meta-llama/Llama-3.1-8B-Instruct"
  }
}

Equivalent JSON examples live in examples/.
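When the server is launched from a script, it can be easier to build the same configuration in Python and serialize it than to hand-quote JSON in a shell. The sketch below uses only the standard library. `build_kv_transfer_config` and its validation checks are hypothetical helpers, and the assumption that `min_similarity` lies in [0, 1] is ours rather than a documented constraint.

```python
import json
import shlex

# Mode names taken from the Modes table in this README.
KNOWN_MODES = {
    "discovery_only",
    "exact_prefix",
    "request_only_experimental",
    "segmented_experimental",
}


def build_kv_transfer_config(extra_config: dict) -> dict:
    """Wrap connector-specific settings in the kv-transfer-config shape shown above."""
    mode = extra_config.get("mode")
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    similarity = extra_config.get("min_similarity", 0.0)
    if not 0.0 <= similarity <= 1.0:  # assumed range
        raise ValueError(f"min_similarity out of range: {similarity}")
    return {
        "kv_connector": "SemBlendVllmConnector",
        "kv_connector_module_path": "semblend_vllm_connector.connector",
        "kv_role": "kv_both",
        "kv_load_failure_policy": "recompute",
        "kv_connector_extra_config": dict(extra_config),
    }


config = build_kv_transfer_config(
    {
        "mode": "discovery_only",
        "provider": "local",
        "min_prompt_tokens": 256,
        "min_similarity": 0.70,
    }
)

command = [
    "vllm", "serve", "meta-llama/Llama-3.1-8B-Instruct",
    "--enable-prefix-caching",
    "--kv-transfer-config", json.dumps(config),
]

# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(command))
```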

Modes

Mode	Positive matched tokens?	Purpose
`discovery_only`	No	Safe telemetry and workload qualification.
`exact_prefix`	Only with engine-valid exact block refs	Future safe materialization path.
`request_only_experimental`	Yes, block-aligned prefix only	Isolated validation mode; run with vLLM prefix caching disabled.
`segmented_experimental`	Not enabled in this repo yet	Requires segmented/sparse execution support.
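One way to read the table is as a gate on how many tokens a mode may report as matched. The following sketch encodes that reading. The `Mode` enum, `reportable_tokens`, and `align_down_to_block` are hypothetical names, the default block size of 16 is an assumption, and the package's real types may differ.

```python
from enum import Enum


class Mode(str, Enum):
    DISCOVERY_ONLY = "discovery_only"
    EXACT_PREFIX = "exact_prefix"
    REQUEST_ONLY_EXPERIMENTAL = "request_only_experimental"
    SEGMENTED_EXPERIMENTAL = "segmented_experimental"


def align_down_to_block(num_tokens: int, block_size: int = 16) -> int:
    """Round a token count down to a whole number of KV blocks."""
    return (num_tokens // block_size) * block_size


def reportable_tokens(
    mode: Mode,
    candidate_tokens: int,
    *,
    has_valid_exact_block_refs: bool = False,
    block_size: int = 16,  # assumed block size for illustration
) -> int:
    """Return how many matched tokens a mode may report, per the table above."""
    if mode is Mode.DISCOVERY_ONLY:
        return 0  # telemetry only
    if mode is Mode.EXACT_PREFIX:
        # Positive only when the engine has validated exact block references.
        return candidate_tokens if has_valid_exact_block_refs else 0
    if mode is Mode.REQUEST_ONLY_EXPERIMENTAL:
        # Only a block-aligned prefix may be reported.
        return align_down_to_block(candidate_tokens, block_size)
    return 0  # segmented_experimental is not enabled yet


print(reportable_tokens(Mode("request_only_experimental"), 100))  # 96
```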

Safety Rules

The connector must not:

weaken exact prefix-cache semantics;
report semantic hits as computed tokens unless KV can actually be loaded;
publish non-identical semantic donor KV into vLLM's exact prefix cache;
cross model, tokenizer, adapter, or cache-salt namespaces;
fail inference because semantic lookup failed.
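Two of these rules, namespace isolation and failing open on lookup errors, lend themselves to a small illustration. The sketch below assumes a donor record carries the namespace it was computed under. `Namespace`, `Donor`, and `safe_lookup` are hypothetical names and do not reflect the contents of `namespace.py` or `types.py`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Namespace:
    """Identity under which KV state is valid (hypothetical shape)."""
    model: str
    tokenizer: str
    adapter: str
    cache_salt: str


@dataclass(frozen=True)
class Donor:
    donor_id: str
    namespace: Namespace


def safe_lookup(
    lookup: Callable[[str], Optional[Donor]],
    request_namespace: Namespace,
    prompt: str,
) -> Optional[Donor]:
    """Return a donor only if it shares the request's namespace; never raise."""
    try:
        donor = lookup(prompt)
    except Exception:
        # A failed semantic lookup degrades to "no donor", not a failed request.
        return None
    if donor is None or donor.namespace != request_namespace:
        # Frozen dataclasses compare field by field, so any mismatch in
        # model, tokenizer, adapter, or cache salt rejects the donor.
        return None
    return donor
```

Returning `None` on every unsafe path means the caller has a single case to handle: no donor, continue with normal prefill.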

Repository Layout

src/semblend_vllm_connector/
  connector.py        vLLM KVConnectorBase_V1 implementation
  config.py           config/env parsing
  provider.py         provider protocol + local deterministic provider
  providers/
    semblend.py       lazy SemBlendPipeline adapter
  types.py            shared dataclasses/enums
  namespace.py        vLLM request namespace extraction

docs/
  ARCHITECTURE.md     detailed architecture and rollout plan
  SEMBLEND_PROVIDER.md
  VLLM_CONNECTOR_CONTRACT.md

examples/
  discovery_kv_transfer_config.json
  semblend_discovery_kv_transfer_config.json

Open Source Posture

This project follows the dynamic connector pattern used by mature vLLM KV cache projects: vLLM loads the connector from a Python module path, connector-specific settings live in kv_connector_extra_config, and unsafe materialization cases fail closed to normal vLLM prefill.
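The module-path loading pattern can be illustrated with the standard library. This is a generic sketch of resolving a class from a dotted module path and a class name, not vLLM's actual loader. `load_connector_class` is a hypothetical helper, and a standard-library class stands in for the connector so the snippet runs without vLLM installed.

```python
import importlib


def load_connector_class(module_path: str, class_name: str) -> type:
    """Import module_path and return the class named class_name from it."""
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name)
    if not isinstance(cls, type):
        raise TypeError(f"{module_path}.{class_name} is not a class")
    return cls


# Stand-in for ("semblend_vllm_connector.connector", "SemBlendVllmConnector").
cls = load_connector_class("collections", "OrderedDict")
print(cls.__name__)  # OrderedDict
```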

This version

0.1.0, released Jun 5, 2026

Download files

Source Distribution

semblend_vllm_connector-0.1.0.tar.gz (25.4 kB)

Uploaded Jun 5, 2026 (Source)

Built Distribution

semblend_vllm_connector-0.1.0-py3-none-any.whl (20.3 kB)

Uploaded Jun 5, 2026 (Python 3)

File details

Details for the file semblend_vllm_connector-0.1.0.tar.gz.

File metadata

Download URL: semblend_vllm_connector-0.1.0.tar.gz
Upload date: Jun 5, 2026
Size: 25.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for semblend_vllm_connector-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2fbc69d401ab84442fc6c655bfb603e477af825e0bcf92b201aeb055e6030877`
MD5	`13618d67d34d814adee848e356407a70`
BLAKE2b-256	`3e6a7ca343a869a4556c26b460b38087dee0d161df4f25068b1900e609fc3ad9`

File details

Details for the file semblend_vllm_connector-0.1.0-py3-none-any.whl.

File metadata

Download URL: semblend_vllm_connector-0.1.0-py3-none-any.whl
Upload date: Jun 5, 2026
Size: 20.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for semblend_vllm_connector-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3d5950056df8044a1a1fe7ec53570c87033aaa78d1360e8e6c2173eceadeb536`
MD5	`23e1d1984b9abcc98b00522b6ada675e`
BLAKE2b-256	`6c28491ef7af4e3699a6e39bb9d1f28548e084e08f26ab9cb6c94b60f9c22af9`
