Skip to main content

High-performance key-value storage engine with Python bindings

Project description

PegaFlow Python Package

High-performance key-value storage engine with Python bindings, built with Rust and PyO3.

Features

  • PegaEngine: Fast Rust-based key-value storage with Python bindings
  • PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer

Installation

From Source

# Install maturin if you haven't already
pip install maturin

# Build and install in development mode
cd python
maturin develop

# Or build a wheel
maturin build --release

From PyPI (coming soon)

pip install pegaflow

Usage

Basic KV Storage

from pegaflow import PegaEngine

# Create a new engine
engine = PegaEngine()

# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")

# Retrieve values
name = engine.get("name")  # Returns "PegaFlow"
missing = engine.get("nonexistent")  # Returns None

# Remove keys
removed = engine.remove("name")  # Returns "PegaFlow"

Sglang Examples:

example 1:

python3 -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --served-model-name Qwen/Qwen3-0.6B --trust-remote-code --enable-cache-report --page-size 256 --host "0.0.0.0" --port 8000 --mem-fraction-static 0.8 --max-running-requests 32 --enable-pegaflow

example 2:

python3 -m sglang.launch_server --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}"  --model-path deepseek-ai/DeepSeek-V3.2 --served-model-name deepseek-ai/DeepSeek-V3.2 --trust-remote-code --page-size "64" --reasoning-parser deepseek-v3 --tool-call-parser deepseekv32 --enable-cache-report --host "0.0.0.0" --port 8031 --mem-fraction-static 0.83 --max-running-requests 64 --tp-size "8" --enable-pegaflow

vLLM KV Connector

from vllm import LLM
from vllm.distributed.kv_transfer.kv_transfer_agent import KVTransferConfig

# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)

# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)

Development

See the examples directory for more usage examples.

Testing

Running Unit Tests

The test suite includes integration tests that verify the EngineRpcClient can correctly communicate with a running pegaflow-server instance.

Prerequisites

  1. Build the Rust extension:

    cd python
    maturin develop --release
    
  2. Build the server binary:

    cd ..
    cargo build --release --bin pegaflow-server
    
  3. Ensure CUDA is available (tests require GPU):

    python -c "import torch; assert torch.cuda.is_available()"
    

Running Tests

cd python

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_engine_client.py -v

# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html

Test Structure

  • tests/conftest.py: Contains pytest fixtures for:

    • pega_server: Automatically starts/stops pegaflow-server for integration tests
    • engine_client: Creates an EngineRpcClient connected to the test server
    • client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
    • registered_instance: Provides a registered instance ID for query tests
  • tests/test_engine_client.py: Integration tests for:

    • Server connectivity
    • Query operations with various inputs

Test Fixtures

The ClientContext class abstracts a vLLM instance and provides:

  • register_kv_caches(): Register GPU KV cache tensors with the server
  • query(block_hashes): Query available blocks
  • unregister_context(): Unregister context from server

Example test usage:

def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

pegaflow_llm-0.17.0-cp313-cp313-manylinux_2_34_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.17.0-cp312-cp312-manylinux_2_34_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.17.0-cp311-cp311-manylinux_2_34_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.17.0-cp310-cp310-manylinux_2_34_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.34+ x86-64

File details

Details for the file pegaflow_llm-0.17.0-cp313-cp313-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.17.0-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 bc5ef90a14b4f3499e6c0ec2730a7ca2e547780a23b4da6cc5321f9c29d6abbd
MD5 4a48e26c81e0cacf29fb17452f006c67
BLAKE2b-256 7c28c6bd6252c4508fb6dfd9ecfded2c4fbd10308d0bbcf13fa9feef738edbb1

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.17.0-cp312-cp312-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.17.0-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 4b1245ce7306febc55a84d902cdf159bb356e70c7673455dbb8d2bf5abc70104
MD5 6f6e2098c3720b07bc5fd4f551541283
BLAKE2b-256 71f5f100977df26bc34476f29bf701844af2e4b10692c8945db3e754135a8846

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.17.0-cp311-cp311-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.17.0-cp311-cp311-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 ec67cb308f5d4c5fa52b3772f4f2926e50c9622548156a967c3290bc9e7f6ebc
MD5 e6c9d61d0042ba4a97f2ea1007e6fcca
BLAKE2b-256 4c1e8ee9d3baf08abc801e04df33e65e380355efe623eb389cb0a44d96d2c43e

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.17.0-cp310-cp310-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.17.0-cp310-cp310-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 8a077289f4fb0a4784af98499a709ab52ddedf834419c32c4f04edfa651bfd09
MD5 d4dd6c7af2de7e26fcd8dab9893b7806
BLAKE2b-256 28f86d96a78bb88d7df08d3f05525f5cfb2168f77575fccc10bc3a32b1692300

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page