PegaFlow Python Package

High-performance key-value storage engine with Python bindings, built with Rust and PyO3.

Features

  • PegaEngine: Fast Rust-based key-value storage with Python bindings
  • PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer

Installation

From Source

# Install maturin if you haven't already
pip install maturin

# Build and install in development mode
cd python
maturin develop

# Or build a wheel
maturin build --release

From PyPI

# Distribution name taken from the wheel file names of this release
pip install pegaflow_llm_cu13

Usage

Basic KV Storage

from pegaflow import PegaEngine

# Create a new engine
engine = PegaEngine()

# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")

# Retrieve values
name = engine.get("name")  # Returns "PegaFlow"
missing = engine.get("nonexistent")  # Returns None

# Remove keys
removed = engine.remove("name")  # Returns "PegaFlow"
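
For tests that should not depend on the compiled extension, the put/get/remove behaviour shown above can be mimicked with a small in-memory stand-in. The class below is a hypothetical sketch and not part of pegaflow: FakeEngine is an invented name, and it assumes only the semantics visible in the example (get returns None for a missing key, remove returns the removed value). What remove does for a missing key is an assumption.

```python
from typing import Optional


class FakeEngine:
    """Hypothetical in-memory stand-in mirroring the PegaEngine calls shown above."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # Later writes to the same key overwrite earlier ones.
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        # Missing keys yield None, matching the documented example.
        return self._data.get(key)

    def remove(self, key: str) -> Optional[str]:
        # Assumption: removing a missing key returns None rather than raising.
        return self._data.pop(key, None)


engine = FakeEngine()
engine.put("name", "PegaFlow")
assert engine.get("name") == "PegaFlow"
assert engine.get("nonexistent") is None
assert engine.remove("name") == "PegaFlow"
assert engine.get("name") is None
```

A stand-in like this only checks calling code against the documented interface; it says nothing about the performance or persistence of the real engine.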

SGLang Examples

Example 1:

python3 -m sglang.launch_server \
  --model-path Qwen/Qwen3-0.6B \
  --served-model-name Qwen/Qwen3-0.6B \
  --trust-remote-code \
  --enable-cache-report \
  --page-size 256 \
  --host "0.0.0.0" \
  --port 8000 \
  --mem-fraction-static 0.8 \
  --max-running-requests 32 \
  --enable-pegaflow

Example 2:

python3 -m sglang.launch_server \
  --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}" \
  --model-path deepseek-ai/DeepSeek-V3.2 \
  --served-model-name deepseek-ai/DeepSeek-V3.2 \
  --trust-remote-code \
  --page-size "64" \
  --reasoning-parser deepseek-v3 \
  --tool-call-parser deepseekv32 \
  --enable-cache-report \
  --host "0.0.0.0" \
  --port 8031 \
  --mem-fraction-static 0.83 \
  --max-running-requests 64 \
  --tp-size "8" \
  --enable-pegaflow

vLLM KV Connector

from vllm import LLM
from vllm.config import KVTransferConfig

# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)

# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)
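
When vLLM is started as a server rather than from Python, the same settings are usually passed as a JSON string. The sketch below builds such a command line; it assumes the installed vLLM version accepts these fields through its --kv-transfer-config flag, and build_serve_command is a hypothetical helper, not part of pegaflow or vLLM.

```python
import json
import shlex

# Same fields as the KVTransferConfig example above.
kv_transfer_config = {
    "kv_connector": "PegaKVConnector",
    "kv_role": "kv_both",
    "kv_connector_module_path": "pegaflow.connector",
}


def build_serve_command(model: str, config: dict) -> str:
    """Return a shell-safe `vllm serve` command line carrying the config as JSON."""
    config_json = json.dumps(config, separators=(",", ":"))
    args = ["vllm", "serve", model, "--kv-transfer-config", config_json]
    # shlex.quote protects the JSON braces and quotes from the shell.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_serve_command("gpt2", kv_transfer_config)
print(command)
```

Generating the string with json.dumps and shlex.quote avoids the hand-escaped quotes that make such flags easy to get wrong.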

Development

See the examples directory for more usage examples.

Testing

Running Unit Tests

The test suite includes integration tests that verify the EngineRpcClient can correctly communicate with a running pegaflow-server instance.

Prerequisites

  1. Build the Rust extension:

    cd python
    maturin develop --release
    
  2. Build the server binary:

    cd ..
    cargo build --release --bin pegaflow-server
    
  3. Ensure CUDA is available (tests require GPU):

    python -c "import torch; assert torch.cuda.is_available()"
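
The prerequisites above can also be checked in one pass before running the tests. This is a minimal sketch, not a script shipped with the project: check_prerequisites is a hypothetical helper, and it only reports whether the tools and modules can be found. It does not verify that the builds are up to date.

```python
import importlib.util
import shutil


def check_prerequisites() -> dict[str, bool]:
    """Report which prerequisites appear to be satisfied, without raising."""
    status = {
        "maturin on PATH": shutil.which("maturin") is not None,
        "cargo on PATH": shutil.which("cargo") is not None,
        "pegaflow importable": importlib.util.find_spec("pegaflow") is not None,
        "torch importable": importlib.util.find_spec("torch") is not None,
        "CUDA available": False,
    }
    if status["torch importable"]:
        # Imported lazily so the check still runs when torch is not installed.
        import torch

        status["CUDA available"] = bool(torch.cuda.is_available())
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {name}")
```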
    

Running Tests

cd python

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_engine_client.py -v

# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html

Test Structure

  • tests/conftest.py: Contains pytest fixtures for:

    • pega_server: Automatically starts/stops pegaflow-server for integration tests
    • engine_client: Creates an EngineRpcClient connected to the test server
    • client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
    • registered_instance: Provides a registered instance ID for query tests
  • tests/test_engine_client.py: Integration tests for:

    • Server connectivity
    • Query operations with various inputs
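
A fixture that starts and stops a server, as pega_server is described above, typically wraps a subprocess in setup and teardown logic. The sketch below shows that pattern in isolation. It is not the project's conftest.py: managed_process is a hypothetical helper, and a placeholder Python child process stands in for pegaflow-server because the binary's path and arguments are not given here.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_process(command: list[str], stop_timeout: float = 5.0):
    """Start a subprocess, yield it, and always stop it on exit."""
    process = subprocess.Popen(command)
    try:
        yield process
    finally:
        # Ask the process to exit, then force-kill it if it does not comply.
        process.terminate()
        try:
            process.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()


# Placeholder child process; a real fixture would launch the server binary here.
placeholder = [sys.executable, "-c", "import time; time.sleep(30)"]
with managed_process(placeholder) as proc:
    assert proc.poll() is None  # still running inside the block
assert proc.poll() is not None  # stopped once the block exits
```

In a conftest.py, a session-scoped pytest fixture could enter this context manager and yield from inside the with block, so the server is shut down even when a test fails.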

Test Fixtures

The ClientContext class abstracts a vLLM instance and provides:

  • register_kv_caches(): Register GPU KV cache tensors with the server
  • query(block_hashes): Query available blocks
  • unregister_context(): Unregister context from server

Example test usage:

def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None

License

MIT

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • pegaflow_llm_cu13-0.22.1-cp314-cp314-manylinux_2_34_x86_64.whl (8.5 MB): CPython 3.14, manylinux (glibc 2.34+), x86-64
  • pegaflow_llm_cu13-0.22.1-cp313-cp313-manylinux_2_34_x86_64.whl (8.5 MB): CPython 3.13, manylinux (glibc 2.34+), x86-64
  • pegaflow_llm_cu13-0.22.1-cp312-cp312-manylinux_2_34_x86_64.whl (8.5 MB): CPython 3.12, manylinux (glibc 2.34+), x86-64
  • pegaflow_llm_cu13-0.22.1-cp311-cp311-manylinux_2_34_x86_64.whl (8.6 MB): CPython 3.11, manylinux (glibc 2.34+), x86-64
  • pegaflow_llm_cu13-0.22.1-cp310-cp310-manylinux_2_34_x86_64.whl (8.6 MB): CPython 3.10, manylinux (glibc 2.34+), x86-64

File hashes

pegaflow_llm_cu13-0.22.1-cp314-cp314-manylinux_2_34_x86_64.whl

  • SHA256: d6aac23d47ce1f32d2b2f781635471f8eeb76b6ca38b43af3b75b568d936adce
  • MD5: 7cda688b443a061b6ddd0a1e6b5978e3
  • BLAKE2b-256: eb4b0c9e98bc7867845bb5d9e63c703d1dd59a426d26bdd32e14f1dd328178d9

pegaflow_llm_cu13-0.22.1-cp313-cp313-manylinux_2_34_x86_64.whl

  • SHA256: 6f6dfc5a9495630845c11590cd0587aab60a80c158669773dbd4813a14760f06
  • MD5: 0fb715881512c0660b08dda541e82f67
  • BLAKE2b-256: e1d5571f351347e7a147b565ef40f4342cd7fc55c0211b525290c587bbf49f0c

pegaflow_llm_cu13-0.22.1-cp312-cp312-manylinux_2_34_x86_64.whl

  • SHA256: 6fcbb96b87faf5a9de5eb383292cab524837d36012324d6402b8fb144a96cfc4
  • MD5: 9fa82b934b9046863993349e61bd6c57
  • BLAKE2b-256: 87a9537917b5953730966b14d7e9ed643cba2d5c918e653201c58cefcfe524f1

pegaflow_llm_cu13-0.22.1-cp311-cp311-manylinux_2_34_x86_64.whl

  • SHA256: fd4ae3c2f1b87fac85fb0eed0d09e11f34e142e733bd28ce42363bf83fda95cd
  • MD5: 05a6c1b4dce5270d383a4474c1070056
  • BLAKE2b-256: 6205685e99035249f0e841f0d09057ac4ab826431718002b7f60f730a4245986

pegaflow_llm_cu13-0.22.1-cp310-cp310-manylinux_2_34_x86_64.whl

  • SHA256: 87a5afa9c9ce5e64c720fcd7ce6f3f09d6297e2df4c090c3efa342a9c87d55e1
  • MD5: c2381a8974225d68c0c2f4e04a095961
  • BLAKE2b-256: a546f29e9de73648949b5645c3e89ee8f2ce8d67d8f58c634d7db5d2d82a8690

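A downloaded wheel can be compared against the SHA256 digests listed above using only the standard library. The helper names below are hypothetical, and the self-check at the end uses a temporary file rather than a real wheel, so this is a sketch of the procedure rather than a tool shipped with the package.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a local file against a published SHA256 digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Self-check on a temporary file; replace with the wheel path and its listed digest.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"pegaflow")
    expected = hashlib.sha256(b"pegaflow").hexdigest()
    assert matches_published_hash(sample, expected)
    assert not matches_published_hash(sample, "0" * 64)
```

Reading in chunks keeps memory use flat regardless of file size. pip can perform the same check at install time through its hash-checking mode.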