Skip to main content

High-performance key-value storage engine with Python bindings

Project description

PegaFlow Python Package

High-performance key-value storage engine with Python bindings, built with Rust and PyO3.

Features

  • PegaEngine: Fast Rust-based key-value storage with Python bindings
  • PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer

Installation

From Source

# Install maturin if you haven't already
pip install maturin

# Build and install in development mode
cd python
maturin develop

# Or build a wheel
maturin build --release

From PyPI (coming soon)

pip install pegaflow

Usage

Basic KV Storage

from pegaflow import PegaEngine

# Create a new engine
engine = PegaEngine()

# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")

# Retrieve values
name = engine.get("name")  # Returns "PegaFlow"
missing = engine.get("nonexistent")  # Returns None

# Remove keys
removed = engine.remove("name")  # Returns "PegaFlow"

Sglang Examples:

example 1:

python3 -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --served-model-name Qwen/Qwen3-0.6B --trust-remote-code --enable-cache-report --page-size 256 --host "0.0.0.0" --port 8000 --mem-fraction-static 0.8 --max-running-requests 32 --enable-pegaflow

example 2:

python3 -m sglang.launch_server --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}"  --model-path deepseek-ai/DeepSeek-V3.2 --served-model-name deepseek-ai/DeepSeek-V3.2 --trust-remote-code --page-size "64" --reasoning-parser deepseek-v3 --tool-call-parser deepseekv32 --enable-cache-report --host "0.0.0.0" --port 8031 --mem-fraction-static 0.83 --max-running-requests 64 --tp-size "8" --enable-pegaflow

vLLM KV Connector

from vllm import LLM
from vllm.distributed.kv_transfer.kv_transfer_agent import KVTransferConfig

# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)

# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)

Development

See the examples directory for more usage examples.

Testing

Running Unit Tests

The test suite includes integration tests that verify the EngineRpcClient can correctly communicate with a running pegaflow-server instance.

Prerequisites

  1. Build the Rust extension:

    cd python
    maturin develop --release
    
  2. Build the server binary:

    cd ..
    cargo build --release --bin pegaflow-server
    
  3. Ensure CUDA is available (tests require GPU):

    python -c "import torch; assert torch.cuda.is_available()"
    

Running Tests

cd python

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_engine_client.py -v

# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html

Test Structure

  • tests/conftest.py: Contains pytest fixtures for:

    • pega_server: Automatically starts/stops pegaflow-server for integration tests
    • engine_client: Creates an EngineRpcClient connected to the test server
    • client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
    • registered_instance: Provides a registered instance ID for query tests
  • tests/test_engine_client.py: Integration tests for:

    • Server connectivity
    • Query operations with various inputs

Test Fixtures

The ClientContext class abstracts a vLLM instance and provides:

  • register_kv_caches(): Register GPU KV cache tensors with the server
  • query(block_hashes): Query available blocks
  • unregister_context(): Unregister context from server

Example test usage:

def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

pegaflow_llm-0.18.0-cp313-cp313-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.18.0-cp312-cp312-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.18.0-cp311-cp311-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.18.0-cp310-cp310-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.34+ x86-64

File details

Details for the file pegaflow_llm-0.18.0-cp313-cp313-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.18.0-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 118c821d680cab56af4623535b5a3c2b372b4328c7c4a5dac8b9c9f2385a9372
MD5 672a98ae93432d7f183164e1822d3d90
BLAKE2b-256 11edb1d9251c9588f52b01babd2fc845017594f3dcc0343ba4676a42b1d8c017

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.18.0-cp312-cp312-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.18.0-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 a5bd2b21ee158150bac5e3b55287f7e0ebadffcb6a939039c4c7aefec96c53a2
MD5 e5d1b0df5fc3c5946a593de0c439dec3
BLAKE2b-256 f50a223031f7e4f09aa1949f4b71df9da684040d4e17ea8d75272377ec2f6675

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.18.0-cp311-cp311-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.18.0-cp311-cp311-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 ce0f0f9a787c120ea63569fe5f7c9b2dcaa6fc80e0ac724cdb290e661ff2c6cf
MD5 701e659dcc2b4b4b1dc22997943406a4
BLAKE2b-256 5718599ee4e317e97ddaebad5734ac87ba3752318c9409e455ac0bf6db3eb93d

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.18.0-cp310-cp310-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.18.0-cp310-cp310-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 b767fd282d347c78e3fc5289544e94d58d8607c9be3d2678c81397a874f553b7
MD5 bce849c95f05fa4e04189f444fffd25a
BLAKE2b-256 526560f37dacc846b1c474c43941792ff9a9d87c3f04c4c088de65a1842ea15a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page