Python Stream Processing Framework

Python Stream Processing Framework (PSPF)

PSPF is a lightweight, high-performance stream processing framework for Python. It provides Kafka-like semantics (partitions, offsets, consumer groups, exactly-once processing) without requiring Kafka or JVM infrastructure.

It is designed for building event-driven applications, event sourcing systems, and data pipelines that need to be robust, replayable, and easy to deploy.
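
To make the terms partition, offset, and consumer group concrete, here is a minimal in-memory sketch. It is an illustration only and not PSPF's implementation: the `PartitionedLog` class and all of its method names are hypothetical, chosen for this example.

```python
import zlib
from collections import defaultdict


class PartitionedLog:
    """Toy in-memory log (hypothetical; not part of PSPF)."""

    def __init__(self, num_partitions: int = 4) -> None:
        self.partitions = [[] for _ in range(num_partitions)]
        # (group, partition) -> next offset that group should read
        self.committed = defaultdict(int)

    def partition_for(self, key: str) -> int:
        # Stable hash, so a given key always lands on the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key: str, value: dict) -> tuple:
        """Append a record and return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1

    def poll(self, group: str, partition: int, max_records: int = 10) -> list:
        """Return (offset, record) pairs starting at the group's committed offset."""
        start = self.committed[(group, partition)]
        records = self.partitions[partition][start:start + max_records]
        return [(start + i, r) for i, r in enumerate(records)]

    def commit(self, group: str, partition: int, offset: int) -> None:
        # Store the next offset to read: last processed offset + 1.
        self.committed[(group, partition)] = offset + 1
```

Each group tracks its own committed offset, so two groups can read the same partition independently, and resetting a group's offset replays the records.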

Note: The Quick Start below uses Valkey (a Redis-compatible server), but PSPF is backend-agnostic. You can also use Kafka, or the local-only Memory and File backends for testing without Docker.

  • Auto-Instantiation: Simply provide a topic and group; PSPF handles backend setup automatically (Valkey with Memory fallback).
  • Decorator API: Simple @stream.subscribe and @stream.window handlers for rapid development.
  • Durable Retries: Message retry state is persisted in a StateStore, surviving worker restarts.
  • Idempotent Sinks: Built-in BaseSink for external side-effects (APIs, DBs) with automatic idempotency tokens.
  • Exactly-Once Semantics: Atomic transactions where state and offsets are committed together.
  • Reliability & DLQ: Built-in retries and Dead Letter Queues (DLQ) for failed or late events.
  • Zero-Downtime Scaling: Automatic partition rebalancing across worker clusters.
  • Cloud Native: Built-in Helm charts for Kubernetes and Prometheus monitoring.
  • Powerful CLI: Inspect logs, manage consumer groups, and handle DLQs directly.
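
The exactly-once bullet above says state and offsets are committed together. The sketch below shows that general idea with the standard-library `sqlite3` module. It is an assumption-laden illustration, not PSPF's StateStore: the table names, `open_store`, `next_offset`, and `process` are all hypothetical.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the two tables used by this sketch (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS counts (key TEXT PRIMARY KEY, n INTEGER NOT NULL)")
    conn.execute("CREATE TABLE IF NOT EXISTS offsets (part INTEGER PRIMARY KEY, next_offset INTEGER NOT NULL)")
    conn.commit()
    return conn


def next_offset(conn: sqlite3.Connection, part: int) -> int:
    row = conn.execute("SELECT next_offset FROM offsets WHERE part = ?", (part,)).fetchone()
    return row[0] if row else 0


def process(conn: sqlite3.Connection, part: int, offset: int, key: str, crash: bool = False) -> bool:
    """Count one event for `key`; return False if this offset was already applied."""
    if offset < next_offset(conn, part):
        return False  # redelivery: the stored state already reflects this offset
    with conn:  # one transaction: commit on success, roll back on exception
        conn.execute("INSERT OR IGNORE INTO counts (key, n) VALUES (?, 0)", (key,))
        conn.execute("UPDATE counts SET n = n + 1 WHERE key = ?", (key,))
        if crash:
            # Simulated failure between the state write and the offset write.
            raise RuntimeError("simulated crash")
        conn.execute(
            "INSERT OR REPLACE INTO offsets (part, next_offset) VALUES (?, ?)",
            (part, offset + 1),
        )
    return True
```

Because both writes share one transaction, a crash between them rolls back the counter as well, and a redelivered offset is skipped rather than counted twice.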

Installation

pip install pspf

Quick Start: User Signups

The example below subscribes an async handler to the signups topic and prints a welcome message for each event.

import asyncio

from pspf import Stream

# Auto-instantiates Valkey (falls back to Memory if Valkey is unavailable)
stream = Stream(topic="signups", group="group1")

# Register an async handler for events on the "signups" topic
@stream.subscribe("signups")
async def handle_signup(event):
    print(f"Welcome {event['email']}!")

if __name__ == "__main__":
    # Start the consumer loop
    asyncio.run(stream.run_forever())

1. Requirements

Ensure you have Valkey (or Redis) running:

docker run -d -p 6379:6379 valkey/valkey:latest
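
Before running the demo, it can help to confirm that something is listening on port 6379. The following is a small standard-library sketch of such a check. The function names and the "valkey"/"memory" labels are hypothetical, and PSPF's own fallback logic may work differently.

```python
import socket


def is_reachable(host: str = "localhost", port: int = 6379, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_backend(host: str = "localhost", port: int = 6379) -> str:
    # Hypothetical labels that mirror the "Valkey with Memory fallback" idea.
    return "valkey" if is_reachable(host, port) else "memory"


if __name__ == "__main__":
    print(pick_backend())
```

This only tests TCP reachability; it does not verify that the listener is actually Valkey or Redis.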

2. Run the Demo

The included valkey_demo.py example demonstrates a producer and consumer running together.

python examples/valkey_demo.py

3. Verification

You will see logs showing events being produced and then consumed with their respective offsets.

How It Works

  1. Producers emit events to a Stream using a specific Backend (e.g., ValkeyStreamBackend).
  2. Workers consume from the stream in Consumer Groups, sharing the load across multiple instances.
  3. Processors handle message batches, providing built-in retries, deduplication, and dead-letter routing.
  4. Observability is baked in; every processed message updates metrics and traces.
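
Step 3 above can be sketched as a plain function that applies deduplication, bounded retries, and dead-letter routing to one batch. This is a simplified illustration and not PSPF's BatchProcessor: `process_batch`, its parameters, and the assumption that each message carries an `id` field are all hypothetical.

```python
from typing import Callable, Iterable


def process_batch(
    batch: Iterable[dict],
    handler: Callable[[dict], None],
    seen_ids: set,
    dead_letters: list,
    max_attempts: int = 3,
) -> int:
    """Run `handler` over a batch and return how many messages were handled."""
    handled = 0
    for msg in batch:
        if msg["id"] in seen_ids:
            continue  # deduplication: skip messages that were already handled
        for attempt in range(1, max_attempts + 1):
            try:
                handler(msg)
            except Exception as exc:
                if attempt == max_attempts:
                    # Retries exhausted: park the message so the stream keeps moving.
                    dead_letters.append({"message": msg, "error": repr(exc)})
            else:
                seen_ids.add(msg["id"])
                handled += 1
                break
    return handled
```

In this sketch the retry counter lives in memory only. The Durable Retries feature described earlier persists that state, which a real implementation would need in order to survive a worker restart.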

Project Structure

pspf/
├── connectors/   # Backend implementations (Valkey, Kafka, Memory)
├── processing/   # Core logic (BatchProcessor, DLQ, Retries)
├── state/        # Stateful processing backends (SQLite, etc.)
└── stream.py     # Main Stream facade

Check out the Tutorial for a deeper dive.

Documentation

About the Author

PSPF™ was designed and developed by Joseph Hall.

I built this framework to solve specific challenges in stream processing: low-latency data handling, simplified deployments without JVM overhead, and ease of use. While my current professional background is in high-volume customer success and sales, my passion lies in engineering robust, scalable Python tools.

🚀 Let's Connect

I am currently looking for my next challenge in Software Engineering, Data Engineering, or DevOps. If you’re looking for a developer who understands both clean code and how to communicate with stakeholders, let’s talk!

Download files

Source Distribution

pspf-0.1.0b1.tar.gz (47.9 kB)

Uploaded Source

Built Distribution

pspf-0.1.0b1-py3-none-any.whl (61.2 kB)

Uploaded Python 3
