Skip to main content

High-performance key-value storage engine with Python bindings

Project description

PegaFlow Python Package

High-performance key-value storage engine with Python bindings, built with Rust and PyO3.

Features

  • PegaEngine: Fast Rust-based key-value storage with Python bindings
  • PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer

Installation

From Source

# Install maturin if you haven't already
pip install maturin

# Build and install in development mode
cd python
maturin develop

# Or build a wheel
maturin build --release

From PyPI (coming soon)

pip install pegaflow

Usage

Basic KV Storage

from pegaflow import PegaEngine

# Create a new engine
engine = PegaEngine()

# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")

# Retrieve values
name = engine.get("name")  # Returns "PegaFlow"
missing = engine.get("nonexistent")  # Returns None

# Remove keys
removed = engine.remove("name")  # Returns "PegaFlow"

Sglang Examples:

example 1:

python3 -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --served-model-name Qwen/Qwen3-0.6B --trust-remote-code --enable-cache-report --page-size 256 --host "0.0.0.0" --port 8000 --mem-fraction-static 0.8 --max-running-requests 32 --enable-pegaflow

example 2:

python3 -m sglang.launch_server --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}"  --model-path deepseek-ai/DeepSeek-V3.2 --served-model-name deepseek-ai/DeepSeek-V3.2 --trust-remote-code --page-size "64" --reasoning-parser deepseek-v3 --tool-call-parser deepseekv32 --enable-cache-report --host "0.0.0.0" --port 8031 --mem-fraction-static 0.83 --max-running-requests 64 --tp-size "8" --enable-pegaflow

vLLM KV Connector

from vllm import LLM
from vllm.distributed.kv_transfer.kv_transfer_agent import KVTransferConfig

# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)

# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)

Development

See the examples directory for more usage examples.

Testing

Running Unit Tests

The test suite includes integration tests that verify the EngineRpcClient can correctly communicate with a running pegaflow-server instance.

Prerequisites

  1. Build the Rust extension:

    cd python
    maturin develop --release
    
  2. Build the server binary:

    cd ..
    cargo build --release --bin pegaflow-server
    
  3. Ensure CUDA is available (tests require GPU):

    python -c "import torch; assert torch.cuda.is_available()"
    

Running Tests

cd python

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_engine_client.py -v

# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html

Test Structure

  • tests/conftest.py: Contains pytest fixtures for:

    • pega_server: Automatically starts/stops pegaflow-server for integration tests
    • engine_client: Creates an EngineRpcClient connected to the test server
    • client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
    • registered_instance: Provides a registered instance ID for query tests
  • tests/test_engine_client.py: Integration tests for:

    • Server connectivity
    • Query operations with various inputs

Test Fixtures

The ClientContext class abstracts a vLLM instance and provides:

  • register_kv_caches(): Register GPU KV cache tensors with the server
  • query(block_hashes): Query available blocks
  • unregister_context(): Unregister context from server

Example test usage:

def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

pegaflow_llm_cu13-0.18.0-cp313-cp313-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.34+ x86-64

pegaflow_llm_cu13-0.18.0-cp312-cp312-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.34+ x86-64

pegaflow_llm_cu13-0.18.0-cp311-cp311-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.34+ x86-64

pegaflow_llm_cu13-0.18.0-cp310-cp310-manylinux_2_34_x86_64.whl (8.2 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.34+ x86-64

File details

Details for the file pegaflow_llm_cu13-0.18.0-cp313-cp313-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm_cu13-0.18.0-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 11c2eb849db8f82c7fec93ddb25117be5249cf2e5eb3cc1cd09d11bd2b09624e
MD5 c62e40e86f109c0f8025e61f7f38e6ec
BLAKE2b-256 06f1c45327b536f0507c3b5977ee116f28487d168ab713f2f5ea88ad0b0ef51a

See more details on using hashes here.

File details

Details for the file pegaflow_llm_cu13-0.18.0-cp312-cp312-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm_cu13-0.18.0-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 4bee5e070ba06f2ed8356bc3589399131357c3b74d1dd3676ed4cffcee9a4867
MD5 3623f63c4af1a0293ec529c4bff04db6
BLAKE2b-256 9f2c67f4b78b1278361d9948cf94da7a7d13ac8e522d595e9a979ae9d8731dda

See more details on using hashes here.

File details

Details for the file pegaflow_llm_cu13-0.18.0-cp311-cp311-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm_cu13-0.18.0-cp311-cp311-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 c5d984adfc016be236b77a790cbf554cd279c560e11044bf24bfe2134f8d1976
MD5 0e60f20b0dfe0056471058e1504d9224
BLAKE2b-256 3256bc6c26f820068a34fe8f3279f14e7547ddb4ddc96113065e34c9829796f4

See more details on using hashes here.

File details

Details for the file pegaflow_llm_cu13-0.18.0-cp310-cp310-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm_cu13-0.18.0-cp310-cp310-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 fcd1813377ef8083828af69248b7ec7ef255ca9f5431d8bdcbb4c59a5269a9b0
MD5 3af2670c9ceb2a56cc1fdea5d629ac84
BLAKE2b-256 1feb95119d83b24bc3ca49ceb007a23e99ba8da0fd95ac43fb516c4ff6d60b6e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page