
High-performance key-value storage engine with Python bindings


PegaFlow Python Package

High-performance key-value storage engine with Python bindings, built with Rust and PyO3.

Features

  • PegaEngine: Fast Rust-based key-value storage with Python bindings
  • PegaKVConnector: vLLM KV connector for distributed inference with KV cache transfer

Installation

From Source

# Install maturin if you haven't already
pip install maturin

# Build and install in development mode
cd python
maturin develop

# Or build a wheel
maturin build --release

From PyPI

# The distribution is published as pegaflow-llm; the import name is still pegaflow
pip install pegaflow-llm

Usage

Basic KV Storage

from pegaflow import PegaEngine

# Create a new engine
engine = PegaEngine()

# Store key-value pairs
engine.put("name", "PegaFlow")
engine.put("version", "0.1.0")

# Retrieve values
name = engine.get("name")  # Returns "PegaFlow"
missing = engine.get("nonexistent")  # Returns None

# Remove keys
removed = engine.remove("name")  # Returns "PegaFlow"

SGLang Examples

Example 1:

python3 -m sglang.launch_server \
    --model-path Qwen/Qwen3-0.6B \
    --served-model-name Qwen/Qwen3-0.6B \
    --trust-remote-code \
    --enable-cache-report \
    --page-size 256 \
    --host "0.0.0.0" \
    --port 8000 \
    --mem-fraction-static 0.8 \
    --max-running-requests 32 \
    --enable-pegaflow

Example 2:

python3 -m sglang.launch_server \
    --model-loader-extra-config "{\"enable_multithread_load\": true, \"num_threads\": 64}" \
    --model-path deepseek-ai/DeepSeek-V3.2 \
    --served-model-name deepseek-ai/DeepSeek-V3.2 \
    --trust-remote-code \
    --page-size "64" \
    --reasoning-parser deepseek-v3 \
    --tool-call-parser deepseekv32 \
    --enable-cache-report \
    --host "0.0.0.0" \
    --port 8031 \
    --mem-fraction-static 0.83 \
    --max-running-requests 64 \
    --tp-size "8" \
    --enable-pegaflow

vLLM KV Connector

from vllm import LLM
from vllm.config import KVTransferConfig

# Configure vLLM to use PegaKVConnector
kv_transfer_config = KVTransferConfig(
    kv_connector="PegaKVConnector",
    kv_role="kv_both",
    kv_connector_module_path="pegaflow.connector",
)

# Create LLM with KV transfer enabled
llm = LLM(
    model="gpt2",
    kv_transfer_config=kv_transfer_config,
)

Development

See the examples directory for more usage examples.

Testing

Running Unit Tests

The test suite includes integration tests that verify that EngineRpcClient communicates correctly with a running pegaflow-server instance.

Prerequisites

  1. Build the Rust extension:

    cd python
    maturin develop --release
    
  2. Build the server binary:

    cd ..
    cargo build --release --bin pegaflow-server
    
  3. Ensure CUDA is available (tests require GPU):

    python -c "import torch; assert torch.cuda.is_available()"
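
The one-line CUDA check above fails with an ImportError if PyTorch is not installed. A minimal sketch of a more forgiving check follows; the helper name cuda_available is an assumption made for illustration and is not part of pegaflow.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a usable CUDA device."""
    # find_spec avoids an ImportError when torch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
```

A helper like this could be combined with pytest.mark.skipif so that GPU-only tests are skipped rather than failed on machines without CUDA.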
    

Running Tests

cd python

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_engine_client.py -v

# Run with coverage
pytest tests/ --cov=pegaflow --cov-report=html

Test Structure

  • tests/conftest.py: Contains pytest fixtures for:

    • pega_server: Automatically starts/stops pegaflow-server for integration tests
    • engine_client: Creates an EngineRpcClient connected to the test server
    • client_context: Provides a ClientContext representing a vLLM instance with GPU KV cache tensors
    • registered_instance: Provides a registered instance ID for query tests
  • tests/test_engine_client.py: Integration tests for:

    • Server connectivity
    • Query operations with various inputs
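
The start/stop behavior described for the pega_server fixture can be sketched with a small context manager. This is an illustration under stated assumptions, not the code in tests/conftest.py: the name managed_process and the shutdown timeout are hypothetical, and a sleeping Python process stands in for the pegaflow-server binary.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def managed_process(cmd, shutdown_timeout=5.0):
    """Start cmd as a child process and make sure it is stopped on exit."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        # Ask the process to exit, then force-kill it if it does not comply.
        proc.terminate()
        try:
            proc.wait(timeout=shutdown_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


if __name__ == "__main__":
    # Stand-in command; a real fixture would launch the server binary instead.
    stand_in = [sys.executable, "-c", "import time; time.sleep(60)"]
    with managed_process(stand_in) as proc:
        print("running:", proc.poll() is None)
    print("stopped:", proc.poll() is not None)
```

Wrapping the same logic in a pytest fixture that yields inside the with block would give each test session a server that is cleaned up even when a test fails.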

Test Fixtures

The ClientContext class abstracts a vLLM instance and provides:

  • register_kv_caches(): Register GPU KV cache tensors with the server
  • query(block_hashes): Query available blocks
  • unregister_context(): Unregister context from server
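
Because registration and unregistration come in pairs, one way to keep them balanced is a context manager. The sketch below is hedged: the wrapper name registered and the FakeClientContext class are hypothetical stand-ins, and only the three method names are taken from the list above. The real ClientContext may require arguments that are not shown here.

```python
import contextlib


@contextlib.contextmanager
def registered(client_context):
    """Register KV caches on entry and always unregister on exit."""
    client_context.register_kv_caches()
    try:
        yield client_context
    finally:
        client_context.unregister_context()


class FakeClientContext:
    """Hypothetical in-memory stand-in that only records the call order."""

    def __init__(self):
        self.calls = []

    def register_kv_caches(self):
        self.calls.append("register")

    def query(self, block_hashes):
        self.calls.append("query")
        return []

    def unregister_context(self):
        self.calls.append("unregister")


if __name__ == "__main__":
    fake = FakeClientContext()
    with registered(fake) as ctx:
        ctx.query([])
    print(fake.calls)
```

The finally clause means unregister_context() still runs if a query raises, which keeps a failing test from leaving a stale registration behind.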

Example test usage:

def test_query(client_context):
    """Test query operation."""
    result = client_context.query([])
    assert result is not None

License

MIT
