PegaFlow Python Package

High-performance key-value storage engine with Python bindings, built with Rust and PyO3.

Features

  • PegaEngine: Fast Rust-based key-value storage with Python bindings
  • PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer

Installation

From Source

# Install maturin if you haven't already
pip install maturin

# Build and install in development mode
cd python
maturin develop

# Or build a wheel
maturin build --release

From PyPI

# Distribution name taken from the published wheel files; the import name is still pegaflow
pip install pegaflow-llm-cu13

Usage

Basic KV Storage

from pegaflow import PegaEngine

# Create a new engine
engine = PegaEngine()

# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")

# Retrieve values
name = engine.get("name")  # Returns "PegaFlow"
missing = engine.get("nonexistent")  # Returns None

# Remove keys
removed = engine.remove("name")  # Returns "PegaFlow"
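
The calls above suggest a small mapping-like interface in which get and remove return None for a missing key. As a minimal sketch under that assumption, a helper such as the hypothetical get_or_default below saves callers from handling None themselves. DictEngine is a stand-in written only for this example so that it runs without the compiled extension. It is not part of pegaflow, and the KVEngine protocol is inferred from the usage shown here, not from the library's source.

```python
from typing import Optional, Protocol


class KVEngine(Protocol):
    """Interface assumed from the usage above; not taken from pegaflow itself."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...
    def remove(self, key: str) -> Optional[str]: ...


class DictEngine:
    """Hypothetical in-memory stand-in that mirrors the documented behavior."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        # Missing keys yield None, matching the "Returns None" comment above.
        return self._data.get(key)

    def remove(self, key: str) -> Optional[str]:
        # Returns the removed value, or None if the key was absent.
        return self._data.pop(key, None)


def get_or_default(engine: KVEngine, key: str, default: str) -> str:
    """Return the stored value, or `default` when the key is missing."""
    value = engine.get(key)
    return default if value is None else value


if __name__ == "__main__":
    engine = DictEngine()
    engine.put("name", "PegaFlow")
    print(get_or_default(engine, "name", "unknown"))
    print(get_or_default(engine, "nonexistent", "unknown"))
```

If PegaEngine behaves as the comments above describe, it should be usable in place of DictEngine without changing the helper.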

SGLang Examples

Example 1:

python3 -m sglang.launch_server \
    --model-path Qwen/Qwen3-0.6B \
    --served-model-name Qwen/Qwen3-0.6B \
    --trust-remote-code \
    --enable-cache-report \
    --page-size 256 \
    --host "0.0.0.0" \
    --port 8000 \
    --mem-fraction-static 0.8 \
    --max-running-requests 32 \
    --enable-pegaflow

Example 2:

python3 -m sglang.launch_server \
    --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}" \
    --model-path deepseek-ai/DeepSeek-V3.2 \
    --served-model-name deepseek-ai/DeepSeek-V3.2 \
    --trust-remote-code \
    --page-size "64" \
    --reasoning-parser deepseek-v3 \
    --tool-call-parser deepseekv32 \
    --enable-cache-report \
    --host "0.0.0.0" \
    --port 8031 \
    --mem-fraction-static 0.83 \
    --max-running-requests 64 \
    --tp-size "8" \
    --enable-pegaflow
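
The escaped JSON in the second command is easy to get wrong by hand. One way to avoid manual quoting is to assemble the argument list in Python and let json.dumps and shlex.join do the escaping. This is a sketch only: build_launch_command is a hypothetical helper, and it uses only the flags that appear in the two commands above.

```python
import json
import shlex


def build_launch_command(model, *, port, page_size, mem_fraction,
                         max_running, extra_args=(), loader_config=None):
    """Assemble an argv list for sglang.launch_server (hypothetical helper)."""
    cmd = ["python3", "-m", "sglang.launch_server"]
    if loader_config is not None:
        # json.dumps yields valid JSON; shlex.join quotes it for the shell later.
        cmd += ["--model-loader-extra-config", json.dumps(loader_config)]
    cmd += [
        "--model-path", model,
        "--served-model-name", model,
        "--trust-remote-code",
        "--enable-cache-report",
        "--page-size", str(page_size),
        "--host", "0.0.0.0",
        "--port", str(port),
        "--mem-fraction-static", str(mem_fraction),
        "--max-running-requests", str(max_running),
        *extra_args,
        "--enable-pegaflow",
    ]
    return cmd


if __name__ == "__main__":
    cmd = build_launch_command(
        "Qwen/Qwen3-0.6B",
        port=8000, page_size=256, mem_fraction=0.8, max_running=32,
    )
    # Print a copy-pasteable shell command; pass `cmd` to subprocess to run it.
    print(shlex.join(cmd))
```

Passing the list directly to subprocess.run(cmd) skips the shell entirely, so no quoting is needed in that case.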

vLLM KV Connector

from vllm import LLM
from vllm.config import KVTransferConfig

# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)

# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)
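
For server deployments, the same settings can be written as a JSON string. The sketch below builds that string and a matching command line. It assumes that the installed vLLM version accepts a --kv-transfer-config option taking JSON on vllm serve, and that the JSON field names match the keyword arguments above; check both against your vLLM release before relying on it.

```python
import json
import shlex

# Same fields as the KVTransferConfig call above, expressed as plain data.
kv_transfer_config = {
    "kv_connector": "PegaKVConnector",
    "kv_role": "kv_both",
    "kv_connector_module_path": "pegaflow.connector",
}

# Assumption: `vllm serve` accepts --kv-transfer-config with a JSON value.
serve_cmd = [
    "vllm", "serve", "gpt2",
    "--kv-transfer-config", json.dumps(kv_transfer_config),
]

if __name__ == "__main__":
    print(shlex.join(serve_cmd))
```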

Development

See the examples directory for more usage examples.

Testing

Running Unit Tests

The test suite includes integration tests that verify the EngineRpcClient can correctly communicate with a running pegaflow-server instance.

Prerequisites

  1. Build the Rust extension:

    cd python
    maturin develop --release
    
  2. Build the server binary:

    cd ..
    cargo build --release --bin pegaflow-server
    
  3. Ensure CUDA is available (tests require GPU):

    python -c "import torch; assert torch.cuda.is_available()"
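
The three prerequisites can also be checked in one pass before invoking pytest. The following is a sketch, not part of the project: check_prerequisites is a hypothetical helper, and the binary location target/release/pegaflow-server is assumed from Cargo's default output directory for a release build at the repository root.

```python
import importlib.util
from pathlib import Path


def check_prerequisites(repo_root):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []

    # 1. The Rust extension must be importable as the `pegaflow` package.
    if importlib.util.find_spec("pegaflow") is None:
        problems.append(
            "pegaflow is not importable; run `maturin develop --release` in python/"
        )

    # 2. The server binary (path assumed from Cargo's default layout).
    server = Path(repo_root) / "target" / "release" / "pegaflow-server"
    if not server.is_file():
        problems.append(
            f"server binary not found at {server}; "
            "run `cargo build --release --bin pegaflow-server`"
        )

    # 3. CUDA must be visible to torch.
    if importlib.util.find_spec("torch") is None:
        problems.append("torch is not installed")
    else:
        import torch

        if not torch.cuda.is_available():
            problems.append("CUDA is not available to torch")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(".."):
        print("MISSING:", problem)
```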
    

Running Tests

cd python

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_engine_client.py -v

# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html

Test Structure

  • tests/conftest.py: Contains pytest fixtures for:

    • pega_server: Automatically starts/stops pegaflow-server for integration tests
    • engine_client: Creates an EngineRpcClient connected to the test server
    • client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
    • registered_instance: Provides a registered instance ID for query tests
  • tests/test_engine_client.py: Integration tests for:

    • Server connectivity
    • Query operations with various inputs
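
A fixture like pega_server typically wraps process start-up and teardown. The sketch below shows that pattern as a plain context manager. It is not the project's conftest.py: running_server is a hypothetical name, the fixed start-up delay stands in for a real readiness check, and the demo uses a sleeping Python process in place of pegaflow-server so that it runs anywhere.

```python
import subprocess
import sys
import time
from contextlib import contextmanager


@contextmanager
def running_server(cmd, startup_delay=0.5):
    """Start `cmd` as a child process and stop it when the block exits."""
    proc = subprocess.Popen(cmd)
    try:
        # Crude readiness check: wait briefly, then confirm it is still alive.
        time.sleep(startup_delay)
        if proc.poll() is not None:
            raise RuntimeError(f"server exited early with code {proc.returncode}")
        yield proc
    finally:
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                # Escalate if the process ignores the termination request.
                proc.kill()
                proc.wait()


if __name__ == "__main__":
    # Stand-in for the real server command, used only for this demo.
    stand_in = [sys.executable, "-c", "import time; time.sleep(60)"]
    with running_server(stand_in) as proc:
        print("running:", proc.poll() is None)
    print("stopped:", proc.poll() is not None)
```

A session-scoped pytest fixture can enter this context manager and yield inside the with block, so every test shares one server and teardown still runs if a test fails.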

Test Fixtures

The ClientContext class abstracts a vLLM instance and provides:

  • register_kv_caches(): Register GPU KV cache tensors with the server
  • query(block_hashes): Query available blocks
  • unregister_context(): Unregister context from server

Example test usage:

def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None
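
The three methods listed above imply a register, query, unregister lifecycle. As a sketch of one way to keep that order correct, a context manager can guarantee that unregister_context() runs even when a query raises. registered and RecordingContext are hypothetical names for this example. The stand-in records calls only, and its query return value is a placeholder because the real return type is not documented here.

```python
from contextlib import contextmanager


@contextmanager
def registered(ctx):
    """Register KV caches on entry and always unregister on exit."""
    ctx.register_kv_caches()
    try:
        yield ctx
    finally:
        ctx.unregister_context()


class RecordingContext:
    """Hypothetical stand-in for ClientContext that records method calls."""

    def __init__(self):
        self.calls = []

    def register_kv_caches(self):
        self.calls.append("register_kv_caches")

    def query(self, block_hashes):
        self.calls.append("query")
        return []  # placeholder result, not the real server response

    def unregister_context(self):
        self.calls.append("unregister_context")


def test_lifecycle_order():
    ctx = RecordingContext()
    with registered(ctx) as active:
        active.query([])
    assert ctx.calls == ["register_kv_caches", "query", "unregister_context"]


if __name__ == "__main__":
    test_lifecycle_order()
    print("lifecycle order ok")
```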

License

MIT
