PegaFlow Python Package
High-performance key-value storage engine with Python bindings, built with Rust and PyO3.
Features
- PegaEngine: Fast Rust-based key-value storage with Python bindings
- PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer
Installation
From Source
# Install maturin if you haven't already
pip install maturin
# Build and install in development mode
cd python
maturin develop
# Or build a wheel
maturin build --release
From PyPI (coming soon)
pip install pegaflow
Usage
Basic KV Storage
from pegaflow import PegaEngine
# Create a new engine
engine = PegaEngine()
# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")
# Retrieve values
name = engine.get("name") # Returns "PegaFlow"
missing = engine.get("nonexistent") # Returns None
# Remove keys
removed = engine.remove("name") # Returns "PegaFlow"
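The behavior noted in the comments above (get returns None for a missing key, remove returns the removed value) can be mirrored by a small dict-backed stand-in, which may be handy for unit tests on machines where the Rust extension has not been built. This is only a sketch: `InMemoryEngine` is a hypothetical name that is not part of pegaflow, and returning None when removing a missing key is an assumption.

```python
from typing import Dict, Optional


class InMemoryEngine:
    """Hypothetical dict-backed stand-in mirroring the put/get/remove calls above."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # Store the value, replacing any previous value for the key.
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        # Return None when the key is absent, as in the example above.
        return self._data.get(key)

    def remove(self, key: str) -> Optional[str]:
        # Return the removed value; None for a missing key is an assumption.
        return self._data.pop(key, None)


engine = InMemoryEngine()
engine.put("name", "PegaFlow")
print(engine.get("name"))         # PegaFlow
print(engine.get("nonexistent"))  # None
print(engine.remove("name"))      # PegaFlow
print(engine.get("name"))         # None
```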
SGLang Examples
Example 1: Qwen/Qwen3-0.6B with a page size of 256:
python3 -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --served-model-name Qwen/Qwen3-0.6B --trust-remote-code --enable-cache-report --page-size 256 --host "0.0.0.0" --port 8000 --mem-fraction-static 0.8 --max-running-requests 32 --enable-pegaflow
Example 2: deepseek-ai/DeepSeek-V3.2 with a tensor-parallel size of 8 and multithreaded model loading:
python3 -m sglang.launch_server --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}" --model-path deepseek-ai/DeepSeek-V3.2 --served-model-name deepseek-ai/DeepSeek-V3.2 --trust-remote-code --page-size "64" --reasoning-parser deepseek-v3 --tool-call-parser deepseekv32 --enable-cache-report --host "0.0.0.0" --port 8031 --mem-fraction-static 0.83 --max-running-requests 64 --tp-size "8" --enable-pegaflow
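The second command needs hand-escaped quotes for its JSON argument. A minimal sketch of assembling the same kind of command from Python, where `json.dumps` produces the JSON string and `shlex.join` handles shell quoting. The helper name `build_launch_command` is hypothetical; only flags that appear in the examples above are used.

```python
import json
import shlex
from typing import Dict, List, Optional


def build_launch_command(
    model: str,
    port: int,
    page_size: int,
    extra_flags: Optional[List[str]] = None,
    loader_config: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Assemble an argv list for sglang.launch_server (hypothetical helper)."""
    cmd = ["python3", "-m", "sglang.launch_server"]
    if loader_config is not None:
        # json.dumps builds the JSON string, so no manual \" escaping is needed.
        cmd += ["--model-loader-extra-config", json.dumps(loader_config)]
    cmd += [
        "--model-path", model,
        "--served-model-name", model,
        "--trust-remote-code",
        "--enable-cache-report",
        "--page-size", str(page_size),
        "--host", "0.0.0.0",
        "--port", str(port),
        "--enable-pegaflow",
    ]
    cmd += extra_flags or []
    return cmd


cmd = build_launch_command(
    "deepseek-ai/DeepSeek-V3.2",
    port=8031,
    page_size=64,
    extra_flags=["--tp-size", "8", "--mem-fraction-static", "0.83"],
    loader_config={"enable_multithread_load": True, "num_threads": 64},
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the server without going through a shell at all.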
vLLM KV Connector
from vllm import LLM
from vllm.config import KVTransferConfig
# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)
# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)
Development
See the examples directory for more usage examples.
Testing
Running Unit Tests
The test suite includes integration tests that verify the EngineRpcClient can correctly communicate with a running pegaflow-server instance.
Prerequisites
- Build the Rust extension:
  cd python
  maturin develop --release
- Build the server binary:
  cd ..
  cargo build --release --bin pegaflow-server
- Ensure CUDA is available (tests require a GPU):
  python -c "import torch; assert torch.cuda.is_available()"
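The one-line CUDA check above fails with an ImportError if torch is not installed at all. A minimal sketch of a more forgiving check that reports False in that case; the function name `cuda_available` is hypothetical and not part of pegaflow.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if torch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check also works without torch

    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
```

One possible use is a `pytest.mark.skipif(not cuda_available(), reason="requires GPU")` marker, so GPU tests are skipped rather than failed on machines without CUDA. Whether the project's own tests skip or fail in that situation is not stated here.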
Running Tests
cd python
# Run all tests
pytest tests/ -v
# Run specific test file
pytest tests/test_engine_client.py -v
# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html
Test Structure
- tests/conftest.py: Contains pytest fixtures:
  - pega_server: Automatically starts and stops pegaflow-server for integration tests
  - engine_client: Creates an EngineRpcClient connected to the test server
  - client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
  - registered_instance: Provides a registered instance ID for query tests
- tests/test_engine_client.py: Integration tests for:
  - Server connectivity
  - Query operations with various inputs
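A fixture that starts and stops a server, as pega_server is described as doing, typically wraps a child process and guarantees cleanup. The following is a generic sketch of that pattern, not the project's actual conftest.py: `managed_process` is a hypothetical helper, and the demo runs a placeholder Python child instead of the pegaflow-server binary.

```python
import contextlib
import subprocess
import sys
from typing import Iterator, List


@contextlib.contextmanager
def managed_process(
    argv: List[str], stop_timeout: float = 5.0
) -> Iterator[subprocess.Popen]:
    """Start a child process and make sure it is stopped on exit."""
    proc = subprocess.Popen(argv)
    try:
        yield proc
    finally:
        # Ask the process to exit; escalate to kill if it ignores the request.
        proc.terminate()
        try:
            proc.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


# Demo with a placeholder child; a real fixture would pass the path to the
# pegaflow-server binary (and its arguments) instead.
placeholder = [sys.executable, "-c", "import time; time.sleep(60)"]
with managed_process(placeholder) as proc:
    print("running:", proc.poll() is None)
print("stopped:", proc.poll() is not None)
```

A real server fixture would usually also wait until the server accepts connections before yielding; that readiness check is omitted here because the server's interface is not described in this README.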
Test Fixtures
The ClientContext class abstracts a vLLM instance and provides:
- register_kv_caches(): Register GPU KV cache tensors with the server
- query(block_hashes): Query available blocks
- unregister_context(): Unregister the context from the server
Example test usage:
def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None
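The three ClientContext methods suggest a register, query, unregister lifecycle. A minimal sketch of wrapping that lifecycle in a context manager so that unregistration always runs, even if a query raises. `StubClientContext` and `registered` are hypothetical names used for illustration; the stub's return value and the bytes type for block hashes are assumptions, since the real types are not documented here.

```python
import contextlib
from typing import Iterator, List


class StubClientContext:
    """Hypothetical stand-in exposing the three ClientContext methods named above."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def register_kv_caches(self) -> None:
        self.calls.append("register")

    def query(self, block_hashes: List[bytes]) -> dict:
        # The real return type is not documented here; a dict is a placeholder.
        self.calls.append("query")
        return {"requested": len(block_hashes)}

    def unregister_context(self) -> None:
        self.calls.append("unregister")


@contextlib.contextmanager
def registered(ctx: StubClientContext) -> Iterator[StubClientContext]:
    """Register on entry and always unregister on exit, even if the body raises."""
    ctx.register_kv_caches()
    try:
        yield ctx
    finally:
        ctx.unregister_context()


ctx = StubClientContext()
with registered(ctx) as c:
    result = c.query([])
print(result)
print(ctx.calls)
```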
License
MIT