Skip to main content

Rust based, Arrow powered, python kafka client

Project description

kafkars

PyPI Downloads Documentation Python License GitHub stars GitHub repo size GitHub issues PRs Welcome Rust Apache Arrow Ruff

Rust-based, Arrow-powered Python Kafka client for high-throughput data pipelines.

Motivation

Python's Global Interpreter Lock (GIL) and memory management create bottlenecks when consuming high-volume Kafka streams. Traditional Python Kafka clients process messages one at a time, requiring serialization/deserialization overhead for each message and limiting throughput.

kafkars solves this by:

  • Rust core: All Kafka operations (polling, buffering, ordering) happen in Rust, bypassing the GIL
  • Batch processing: Messages are accumulated and returned as Apache Arrow RecordBatches, not individual Python objects
  • Zero-copy where possible: Arrow's columnar format enables efficient data transfer between Rust and Python
  • Vectorized operations: Process thousands of messages at once with pandas, polars, or any Arrow-compatible library

This architecture is ideal for:

  • Real-time analytics pipelines
  • ML feature stores consuming from Kafka
  • High-volume event processing
  • Data lake ingestion

Important: Analytics-Focused Design

kafkars does not commit offsets. It is designed for analytics and high-throughput batch processing, not transactional workloads.

  • No exactly-once semantics: Messages may be reprocessed if your application restarts
  • No offset tracking: You control where to start reading via offset policies
  • Stateless consumers: Each consumer instance starts fresh based on the configured policy

If you need exactly-once processing, transactional guarantees, or automatic offset management, use a traditional Kafka client like confluent-kafka-python.

Features

  • Ordered delivery: Messages released in timestamp order across all partitions
  • Flexible offset policies: Start from earliest, latest, or any timestamp
  • Backpressure management: Automatically pauses fast partitions to prevent memory overflow
  • Arrow-native output: Returns PyArrow RecordBatch for efficient downstream processing

Installation

pip install kafkars

Quick Start

import time
from kafkars import ConsumerManager, SourceTopic

# Define source topics with offset policies
topics = [
    SourceTopic.from_earliest("events"),
    SourceTopic.from_relative_time("metrics", 3600_000),  # 1 hour ago
]

# Create consumer
manager = ConsumerManager(
    config={
        "bootstrap.servers": "localhost:9092",
        "group.id": "my-consumer-group",
    },
    topics=topics,
    cutoff_ms=int(time.time() * 1000),
    batch_size=10_000,
)

# Poll returns PyArrow RecordBatch
while True:
    batch = manager.poll(timeout_ms=1000)
    if batch.num_rows > 0:
        # Convert to pandas/polars for processing
        df = batch.to_pandas()
        print(f"Received {len(df)} messages")

    if manager.is_live():
        print("Caught up to real-time")
        break

Message Schema

Each poll returns a RecordBatch with the following schema:

Column Type Description
key binary Message key (nullable)
value binary Message payload (nullable)
topic utf8 Source topic name
partition int32 Partition number
offset int64 Message offset
timestamp timestamp[ms, UTC] Message timestamp

Offset Policies

Policy Description
from_earliest(topic) Start from the beginning
from_latest(topic) Start from the end (new messages only)
from_relative_time(topic, ms) Start from N milliseconds ago
from_absolute_time(topic, ms) Start from specific Unix timestamp

Documentation

Full documentation is available at kafkars.readthedocs.io.

License

Apache 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kafkars-0.0.3.tar.gz (114.0 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

kafkars-0.0.3-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view details)

Uploaded CPython 3.13tmanylinux: glibc 2.17+ x86-64

kafkars-0.0.3-cp313-cp313t-macosx_11_0_arm64.whl (1.1 MB view details)

Uploaded CPython 3.13tmacOS 11.0+ ARM64

kafkars-0.0.3-cp313-cp313t-macosx_10_12_x86_64.whl (545.2 kB view details)

Uploaded CPython 3.13tmacOS 10.12+ x86-64

kafkars-0.0.3-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.17+ x86-64

kafkars-0.0.3-cp310-abi3-macosx_11_0_arm64.whl (1.1 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

kafkars-0.0.3-cp310-abi3-macosx_10_12_x86_64.whl (551.6 kB view details)

Uploaded CPython 3.10+macOS 10.12+ x86-64

File details

Details for the file kafkars-0.0.3.tar.gz.

File metadata

  • Download URL: kafkars-0.0.3.tar.gz
  • Upload date:
  • Size: 114.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kafkars-0.0.3.tar.gz
Algorithm Hash digest
SHA256 f9c42f2c933d48d80994e7c247424ea0ea9698ed07e4417143b5ae2a6bae348c
MD5 33f91dd3eb2dbd01763bde1afc3e3a66
BLAKE2b-256 50bf534fb98774b7b914a7b20d53ec2adda621df90b4fc3dbdc1677a1e166b1b

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.3.tar.gz:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.3-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.3-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 719c313defee7fcd14727c1fd92de21274b9dc2a148e447bcdf59f8e7f83a4a2
MD5 f5c9b0de6ee5ff683736db7ef315f1f1
BLAKE2b-256 76e0461583be40efdbe875ddc2fd5d5da98a64ae1cf97cbc0f0493d1794fe3d4

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.3-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.3-cp313-cp313t-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.3-cp313-cp313t-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 7794ae1e0a6fb66c7de3a4a0f3ae72a39161930cf89033101f7709c334c11ba6
MD5 3e8f9bc187431bbb6739566b3d019b7c
BLAKE2b-256 a30aeecac3767fc4dedd4db7094b2c788fb86807cb4f4e025f5d45d1cc8fcc1b

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.3-cp313-cp313t-macosx_11_0_arm64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.3-cp313-cp313t-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.3-cp313-cp313t-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 e2b45143d254d053285fc4934beb34d53a882910f0159abe829e01ad864f9867
MD5 ffac7f7a534e8d2b975162e024f29d7f
BLAKE2b-256 da0fade80e81851c849a6d66555f3523044ea3521ba33f2cce88ad38c51b9145

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.3-cp313-cp313t-macosx_10_12_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.3-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.3-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 56deacc3e346a08ca80f372a0596aa78bea3f14e342d9f291dcaf9a5009fb317
MD5 fac5496189443c69a2b4cd36c31819ca
BLAKE2b-256 dbd2ffbfa01da944d41b3033d3e69e020ca267898385f498ea6bedc5ee7bf27f

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.3-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.3-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.3-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 89c2f7a06d9d9faf371786dbd4a3fddfc11aa50d604f4baddfad8edca2e6b5b2
MD5 7a92b048d918b2d71d21b5cc0b8b2383
BLAKE2b-256 7ae6ae1dffb55a7f00590bbd22a3909074a17798f1906c5a83344553fe245160

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.3-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.3-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.3-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 9a6ff6b087550ecb1f020800c7664540d7d15c732045a42498234ad7b3caa46f
MD5 128e5c5c9e8e9426e9fcd17c73f66d6c
BLAKE2b-256 2c7c5fa82c2a32f48de800fcd86ce1797850646dc0453e4f9b36a63b67da9a5c

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.3-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page