Skip to main content

High-performance key-value storage engine with Python bindings

Project description

PegaFlow Python Package

High-performance key-value storage engine with Python bindings, built with Rust and PyO3.

Features

  • PegaEngine: Fast Rust-based key-value storage with Python bindings
  • PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer

Installation

From Source

# Install maturin if you haven't already
pip install maturin

# Build and install in development mode
cd python
maturin develop

# Or build a wheel
maturin build --release

From PyPI (coming soon)

pip install pegaflow

Usage

Basic KV Storage

from pegaflow import PegaEngine

# Create a new engine
engine = PegaEngine()

# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")

# Retrieve values
name = engine.get("name")  # Returns "PegaFlow"
missing = engine.get("nonexistent")  # Returns None

# Remove keys
removed = engine.remove("name")  # Returns "PegaFlow"

Sglang Examples:

example 1:

python3 -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --served-model-name Qwen/Qwen3-0.6B --trust-remote-code --enable-cache-report --page-size 256 --host "0.0.0.0" --port 8000 --mem-fraction-static 0.8 --max-running-requests 32 --enable-pegaflow

example 2:

python3 -m sglang.launch_server --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}"  --model-path deepseek-ai/DeepSeek-V3.2 --served-model-name deepseek-ai/DeepSeek-V3.2 --trust-remote-code --page-size "64" --reasoning-parser deepseek-v3 --tool-call-parser deepseekv32 --enable-cache-report --host "0.0.0.0" --port 8031 --mem-fraction-static 0.83 --max-running-requests 64 --tp-size "8" --enable-pegaflow

vLLM KV Connector

from vllm import LLM
from vllm.distributed.kv_transfer.kv_transfer_agent import KVTransferConfig

# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)

# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)

Development

See the examples directory for more usage examples.

Testing

Running Unit Tests

The test suite includes integration tests that verify the EngineRpcClient can correctly communicate with a running pegaflow-server instance.

Prerequisites

  1. Build the Rust extension:

    cd python
    maturin develop --release
    
  2. Build the server binary:

    cd ..
    cargo build --release --bin pegaflow-server
    
  3. Ensure CUDA is available (tests require GPU):

    python -c "import torch; assert torch.cuda.is_available()"
    

Running Tests

cd python

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_engine_client.py -v

# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html

Test Structure

  • tests/conftest.py: Contains pytest fixtures for:

    • pega_server: Automatically starts/stops pegaflow-server for integration tests
    • engine_client: Creates an EngineRpcClient connected to the test server
    • client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
    • registered_instance: Provides a registered instance ID for query tests
  • tests/test_engine_client.py: Integration tests for:

    • Server connectivity
    • Query operations with various inputs

Test Fixtures

The ClientContext class abstracts a vLLM instance and provides:

  • register_kv_caches(): Register GPU KV cache tensors with the server
  • query(block_hashes): Query available blocks
  • unregister_context(): Unregister context from server

Example test usage:

def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

pegaflow_llm-0.22.1-cp314-cp314-manylinux_2_34_x86_64.whl (8.6 MB view details)

Uploaded CPython 3.14manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.22.1-cp313-cp313-manylinux_2_34_x86_64.whl (8.5 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.22.1-cp312-cp312-manylinux_2_34_x86_64.whl (8.5 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.22.1-cp311-cp311-manylinux_2_34_x86_64.whl (8.6 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.34+ x86-64

pegaflow_llm-0.22.1-cp310-cp310-manylinux_2_34_x86_64.whl (8.6 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.34+ x86-64

File details

Details for the file pegaflow_llm-0.22.1-cp314-cp314-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.22.1-cp314-cp314-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 4a3da719da7360c521d85681966d6ea8c4059f476fafa015fe65d16a530a58f4
MD5 222fc976eed8ca185299c9523efa86c8
BLAKE2b-256 3b08ab7c0006076e6cc58f1b72f9ec6646784e7273c6d9b4cddac5a75e0007e6

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.22.1-cp313-cp313-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.22.1-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 a4aa914fbab618a73b17a1863d7c1730f3c0f088b2dda084f53d7f377df36788
MD5 98aa2438be7dbeacc528487e7465159d
BLAKE2b-256 00c41ad815b2850c3d3efdbd584ea5202ed096e4e1cec32b817133cde07ad64c

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.22.1-cp312-cp312-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.22.1-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 7163642ccb26b6c9428d4508fbddceb0319f169ae2922c535fe73613b4a3da1e
MD5 ac49f8a1ce7421f121129bf212c3ab80
BLAKE2b-256 6707d604146c867c1ab8ebf89cd7334c742180de3506d57ef093e40fb5692c19

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.22.1-cp311-cp311-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.22.1-cp311-cp311-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 e94df704c05355136a3f17a44837a33ab1a06eb14d0492989e7b0bddc8212f5e
MD5 780aabc8b7ddeda0ad8c6c6cdbdb2b6f
BLAKE2b-256 172a0f321cc54e745e6e1f9aecb99bec1c75eeb70f6503e21c9f2404e12d6a43

See more details on using hashes here.

File details

Details for the file pegaflow_llm-0.22.1-cp310-cp310-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for pegaflow_llm-0.22.1-cp310-cp310-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 72ef4070acfd1c7e9bb3e2ed21a22954e76dbe68098d546d15381b9dcb03f245
MD5 d8c4de39b237b838e689b7de3e82f7a7
BLAKE2b-256 00fa3970ef8c59476e0f4b0a336c22fd9018277000bd8c0a44812e218b7662f4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page