Skip to main content

Rust based, Arrow powered, python kafka client

Project description

kafkars

PyPI Downloads Documentation Python License GitHub stars GitHub repo size GitHub issues PRs Welcome Rust Apache Arrow Ruff

Rust-based, Arrow-powered Python Kafka client for high-throughput data pipelines.

Motivation

Python's Global Interpreter Lock (GIL) and memory management create bottlenecks when consuming high-volume Kafka streams. Traditional Python Kafka clients process messages one at a time, requiring serialization/deserialization overhead for each message and limiting throughput.

kafkars solves this by:

  • Rust core: All Kafka operations (polling, buffering, ordering) happen in Rust, bypassing the GIL
  • Batch processing: Messages are accumulated and returned as Apache Arrow RecordBatches, not individual Python objects
  • Zero-copy where possible: Arrow's columnar format enables efficient data transfer between Rust and Python
  • Vectorized operations: Process thousands of messages at once with pandas, polars, or any Arrow-compatible library

This architecture is ideal for:

  • Real-time analytics pipelines
  • ML feature stores consuming from Kafka
  • High-volume event processing
  • Data lake ingestion

Important: Analytics-Focused Design

kafkars does not commit offsets. It is designed for analytics and high-throughput batch processing, not transactional workloads.

  • No exactly-once semantics: Messages may be reprocessed if your application restarts
  • No offset tracking: You control where to start reading via offset policies
  • Stateless consumers: Each consumer instance starts fresh based on the configured policy

If you need exactly-once processing, transactional guarantees, or automatic offset management, use a traditional Kafka client like confluent-kafka-python.

Features

  • Ordered delivery: Messages released in timestamp order across all partitions
  • Flexible offset policies: Start from earliest, latest, or any timestamp
  • Backpressure management: Automatically pauses fast partitions to prevent memory overflow
  • Arrow-native output: Returns PyArrow RecordBatch for efficient downstream processing

Installation

pip install kafkars

Quick Start

import time
from kafkars import ConsumerManager, SourceTopic

# Define source topics with offset policies
topics = [
    SourceTopic.from_earliest("events"),
    SourceTopic.from_relative_time("metrics", 3600_000),  # 1 hour ago
]

# Create consumer
manager = ConsumerManager(
    config={
        "bootstrap.servers": "localhost:9092",
        "group.id": "my-consumer-group",
    },
    topics=topics,
    cutoff_ms=int(time.time() * 1000),
    batch_size=10_000,
)

# Poll returns PyArrow RecordBatch
while True:
    batch = manager.poll(timeout_ms=1000)
    if batch.num_rows > 0:
        # Convert to pandas/polars for processing
        df = batch.to_pandas()
        print(f"Received {len(df)} messages")

    if manager.is_live():
        print("Caught up to real-time")
        break

Message Schema

Each poll returns a RecordBatch with the following schema:

Column Type Description
key binary Message key (nullable)
value binary Message payload (nullable)
topic utf8 Source topic name
partition int32 Partition number
offset int64 Message offset
timestamp timestamp[ms, UTC] Message timestamp

Offset Policies

Policy Description
from_earliest(topic) Start from the beginning
from_latest(topic) Start from the end (new messages only)
from_relative_time(topic, ms) Start from N milliseconds ago
from_absolute_time(topic, ms) Start from specific Unix timestamp

Documentation

Full documentation is available at kafkars.readthedocs.io.

License

Apache 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kafkars-0.0.4.tar.gz (114.1 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

kafkars-0.0.4-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view details)

Uploaded CPython 3.13tmanylinux: glibc 2.17+ x86-64

kafkars-0.0.4-cp313-cp313t-macosx_11_0_arm64.whl (1.1 MB view details)

Uploaded CPython 3.13tmacOS 11.0+ ARM64

kafkars-0.0.4-cp313-cp313t-macosx_10_12_x86_64.whl (545.0 kB view details)

Uploaded CPython 3.13tmacOS 10.12+ x86-64

kafkars-0.0.4-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.17+ x86-64

kafkars-0.0.4-cp310-abi3-macosx_11_0_arm64.whl (1.1 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

kafkars-0.0.4-cp310-abi3-macosx_10_12_x86_64.whl (551.1 kB view details)

Uploaded CPython 3.10+macOS 10.12+ x86-64

File details

Details for the file kafkars-0.0.4.tar.gz.

File metadata

  • Download URL: kafkars-0.0.4.tar.gz
  • Upload date:
  • Size: 114.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kafkars-0.0.4.tar.gz
Algorithm Hash digest
SHA256 75b98c24ef23088eaa89de16f40a11dabee0ccea96a654bfa1c9b08bb03cc334
MD5 2bae87b71b7005055aa96ee0519841a0
BLAKE2b-256 3458c950071c280c34445d9d4367ed22506686bb46ccce3a43055e237554cd70

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.4.tar.gz:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.4-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.4-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7332ebf710ad0e4b3c4679a9c0f17820290c82e94f8b98422bdd8067debe4dc5
MD5 ea8fc4ca92a6ec28833ecb0217fb68dd
BLAKE2b-256 b199aebca0ce8a6aace966d637c6689ff1290fa6d3fae5359b46f64b09cffc93

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.4-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.4-cp313-cp313t-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.4-cp313-cp313t-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8bd8711f4a3a974f5d752988aad1912394f2bd7960f1831e0f36714cd5eb2cf3
MD5 d512cfbcb69d50b63d559622e9933efc
BLAKE2b-256 337ddb14e0b30c9adef19ab2cdb1912b7325388f8a40a35c1344bdc533cf0c04

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.4-cp313-cp313t-macosx_11_0_arm64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.4-cp313-cp313t-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.4-cp313-cp313t-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 76c138db593570fc3b5fc4e3637e1e47927ee4867e5d978d2997aad7a671e498
MD5 011b496c8273922a66f40b229072675b
BLAKE2b-256 9d8426fcd36265755134354387d99e8839864fd6914ba3706cd1574bc3202b7e

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.4-cp313-cp313t-macosx_10_12_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.4-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.4-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8329bf7591dc57efb7ff79176c45c56a568f9b1c735f5ce5e50e87bb9665fea1
MD5 e2d881a54aceff2d4c7b2205417629fb
BLAKE2b-256 98440a71d18b1d4bde34260bf4983cb13391696574100b823efcf7fec3d339e8

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.4-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.4-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.4-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 cd9e0d1fda7bfefa1eb629fdd2cc823719db5bde8b696a3a86330a1dd732eab7
MD5 82596b6c9f4aff7633b74f0731357ce2
BLAKE2b-256 39a3684b317587ddb12198fa3e494fb292e25291e66b261e5ead582dc3bef748

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.4-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kafkars-0.0.4-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for kafkars-0.0.4-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 81e8b1b910b85ece07bacd006279c4b14452009296f60122733b15f64f731747
MD5 5bf43affd2f04fcd9413d34200547024
BLAKE2b-256 f0fc3c85e48dbd65d68a2a6e515bd43578bb73c7bcea809e49c4163829a4c701

See more details on using hashes here.

Provenance

The following attestation bundles were made for kafkars-0.0.4-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: ci.yaml on 0x26res/kafkars

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page