Python library for building stream processing applications with Apache Kafka
100% Python Stream Processing for Apache Kafka
Quix Streams is a cloud-native library for processing data in Kafka using pure Python. It is designed to give you the power of a distributed system in a lightweight library by combining Kafka's low-level scalability and resiliency features with an easy-to-use Python interface that eases newcomers into stream processing.
It has the following benefits:
- Streaming DataFrame API (similar to pandas DataFrame) for tabular data transformations.
- Custom stateful operations via a state object.
- Custom reducing and aggregating over tumbling and hopping time windows.
- Exactly-once processing semantics via Kafka transactions.
- Pure Python with no need for a server-side engine.
Use Quix Streams to build simple Kafka producer/consumer applications or leverage stream processing to build complex event-driven systems, real-time data pipelines and AI/ML products.
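To make the windowing idea above concrete, here is a plain-Python sketch (not the Quix Streams API) of what a tumbling window does: it buckets events into fixed, non-overlapping time intervals and aggregates within each bucket. The function name and the event format are illustrative assumptions.

```python
from collections import defaultdict

def tumbling_window_sum(events, window_ms):
    """Assign each (timestamp_ms, value) event to a fixed, non-overlapping
    window and sum the values per window. Illustrative sketch only."""
    windows = defaultdict(float)
    for ts, value in events:
        window_start = (ts // window_ms) * window_ms  # floor to window boundary
        windows[window_start] += value
    return dict(windows)

events = [(0, 1.0), (500, 2.0), (1000, 3.0), (1700, 4.0), (2100, 5.0)]
print(tumbling_window_sum(events, 1000))
# {0: 3.0, 1000: 7.0, 2000: 5.0}
```

A hopping window works the same way except that windows overlap, so one event can land in several buckets; Quix Streams manages the windowing, state and checkpointing for you on top of Kafka.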
Getting Started 🏄
Install Quix Streams
```shell
# PyPI
python -m pip install quixstreams

# or conda
conda install -c conda-forge quixio::quixstreams
```
Requirements
Python 3.9+, Apache Kafka 0.10+.
See requirements.txt for the full list of dependencies.
Documentation
Example
Here's an example of how to process data from a Kafka Topic with Quix Streams:
```python
from quixstreams import Application

# A minimal application reading temperature data in Celsius from a Kafka topic,
# converting it to Fahrenheit and producing alerts to another topic.

# Define an application that will connect to Kafka
app = Application(
    broker_address="localhost:9092",  # Kafka broker address
)

# Define the Kafka topics
temperature_topic = app.topic("temperature-celsius", value_deserializer="json")
alerts_topic = app.topic("temperature-alerts", value_serializer="json")

# Create a Streaming DataFrame connected to the input Kafka topic
sdf = app.dataframe(topic=temperature_topic)

# Convert temperature to Fahrenheit by transforming the input message
# (with an anonymous or user-defined function)
sdf = sdf.apply(lambda value: {"temperature_F": (value["temperature"] * 9 / 5) + 32})

# Filter values above the threshold
sdf = sdf[sdf["temperature_F"] > 150]

# Produce alerts to the output topic
sdf = sdf.to_topic(alerts_topic)

# Run the streaming application (the app automatically tracks the sdf!)
app.run()
```
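Because the transformation and filtering steps are plain Python callables, the logic can be reasoned about (and unit-tested) without a running Kafka broker. A standalone sketch of the same convert-and-filter step, with hypothetical sample messages:

```python
# Same logic as in the example above, applied to in-memory sample data
def to_fahrenheit(value):
    return {"temperature_F": (value["temperature"] * 9 / 5) + 32}

def is_alert(value):
    return value["temperature_F"] > 150

messages = [{"temperature": 20}, {"temperature": 80}, {"temperature": 100}]
alerts = [v for v in map(to_fahrenheit, messages) if is_alert(v)]
print(alerts)  # [{'temperature_F': 176.0}, {'temperature_F': 212.0}]
```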
Tutorials
To see Quix Streams in action, check out the Quickstart and Tutorials in the docs.
Key Concepts
There are two primary objects:

- `StreamingDataFrame` - a predefined declarative pipeline to process and transform incoming messages.
- `Application` - manages the Kafka-related setup, teardown and message lifecycle (consuming, committing). It processes each message with the dataframe you provide for it to run.

Under the hood, the `Application` will:

- Consume and deserialize messages.
- Process them with your `StreamingDataFrame`.
- Produce the results to the output topic.
- Automatically checkpoint processed messages and state for resiliency.
- Scale using Kafka's built-in consumer groups mechanism.
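The steps above can be sketched as a plain-Python loop. The consumer, producer and commit interfaces here are hypothetical stand-ins (simple lists and callables, not library APIs); the real `Application` additionally handles serialization, checkpointing and consumer-group rebalancing:

```python
def run_pipeline(consumer, process, produce, commit):
    """Sketch of the Application loop: consume, process with the
    dataframe's logic, produce, then checkpoint the offset.
    All four interfaces are illustrative stand-ins."""
    for offset, message in consumer:
        result = process(message)
        if result is not None:   # filtered-out messages produce nothing
            produce(result)
        commit(offset)           # checkpoint after processing

consumed = [(0, {"temperature": 100}), (1, {"temperature": 20})]
produced, committed = [], []

def process(msg):
    # same convert-and-filter logic as the temperature example
    value = {"temperature_F": (msg["temperature"] * 9 / 5) + 32}
    return value if value["temperature_F"] > 150 else None

run_pipeline(consumed, process, produced.append, committed.append)
print(produced)   # [{'temperature_F': 212.0}]
print(committed)  # [0, 1]
```

Note that even the filtered-out message has its offset committed, so it is never reprocessed after a restart.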
Deployment
You can run Quix Streams pipelines anywhere Python is installed.
Deploy to your own infrastructure or to Quix Cloud on AWS, Azure, GCP or on-premise for a fully managed platform.
You'll get self-service DevOps, CI/CD and monitoring, all built with best-in-class engineering practices learned from Formula 1 racing.
Please see the Connecting to Quix Cloud page to learn how to use Quix Streams and Quix Cloud together.
Roadmap 📍
This library is being actively developed by a full-time team.
Here are some of the planned improvements:
- Windowed aggregations over Tumbling & Hopping windows
- Stateful operations and recovery based on Kafka changelog topics
- Group-by operation
- Exactly-once delivery guarantees for Kafka message processing (a.k.a. Kafka transactions)
- Support for Avro and Protobuf formats
- Schema Registry support
- Joins
- Windowed aggregations over Sliding windows
For a more detailed overview of the planned features, please look at the Roadmap Board.
Get Involved 🤝
- Please use GitHub issues to report bugs and suggest new features.
- Join the Quix Community on Slack, a vibrant group of Kafka Python developers, data engineers and newcomers to Apache Kafka, who are learning and leveraging Quix Streams for real-time data processing.
- Watch and subscribe to @QuixStreams on YouTube for code-along tutorials from scratch and interesting community highlights.
- Follow us on X and LinkedIn where we share our latest tutorials, forthcoming community events and the occasional meme.
- If you have any questions or feedback - write to us at support@quix.io!
License 📗
Quix Streams is licensed under the Apache 2.0 license.
View a copy of the License file here.
Hashes for quixstreams-3.0.0-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | f3d3bf384982920cd81b1dd06fe302c421f8be24305802b7f782d800e38e34c5 |
| MD5 | f33e4f82d45db8f838d5bb98258b8743 |
| BLAKE2b-256 | 40237b6390ef736770211d6d30def3add74c5793ee114e5570aed154f6e83de1 |