Kafka client library based on confluent-kafka with retry functionality
Project description
Retriable Kafka Client
This is an opinionated wrapper for confluent_kafka Python library.
It only uses specific parts of the underlying library while introducing
additional functionalities to these selected parts.
Features
The aim is to provide a fault-tolerant platform for parallel message processing.
Parallel processing
Each consumer requires an executor pool, which will be used for message processing. Each consumer can consume from multiple topics and processes the messages from these topics by a single callable. The callable must be specified by the user of this library.
The library also ensures exactly-once processing when used correctly. To ensure this, the tasks should take short enough time that all of them finish before the cluster forces rebalancing. The library tries to finish tasks from revoked partitions before the rebalance while stopping additional non-started tasks. The default timeout for each task to finish is 30 seconds, but can be changed. Kafka cluster behavior change may also be needed with longer tasks. This behavior only appears during rebalancing and graceful stopping of the consumer.
Fault-tolerance
Each consumer accepts configuration with retry topics. A retry topic is a Kafka topic used for asynchronous retrying of message processing. If the specified target callable fails, the consumer will commit the original message and resends the same message to a retry topic with special headers. The headers include information about the next timestamp at which the message should be processed again (to give some time to the error to disappear if the processing depends on some outside infrastructure).
The retry topic is polled alongside the original topic. If a message contains the special timestamp header, its Kafka partition of origin will be paused and the message will be stored locally. The processing will resume only after the specified timestamp passes. The message will not be processed before the timestamp, it can only gather delay (depending on the occupation of the worker pool). Once the message is sent to the pool for re-processing, the consumption of the blocked partition is resumed.
This whole mechanism does not ensure message ordering. When a message is sent to be retried, another message processing from the same topic is still unblocked.
Message splitting
This library also supports sending large messages to topics which don't have the capacity to process these messages as whole. To do this, producers are configurable to automatically split messages according to message size.
This feature is custom and is therefore turned off by default. The only place this feature is always enabled are retry-topics, which are meant to be consumed only by clients using this library.
The chunked messages have 3 additional headers:
- Group ID (uuid4 value)
- Chunk ID (serial number of the chunk within group, starting with 0)
- Number of chunks (is always +1 from the last chunk ID)
Message is deserialized and processed only if all expected chunks have been found.
Contributing guidelines
To check contributing guidelines, please check CONTRIBUTING.md in the
GitHub repository.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file retriable_kafka_client-0.6.1.tar.gz.
File metadata
- Download URL: retriable_kafka_client-0.6.1.tar.gz
- Upload date:
- Size: 26.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.6 {"installer":{"name":"uv","version":"0.11.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c43ace7df7a9b0db4d769fba5813eb601699439494116597d54621e501352c15
|
|
| MD5 |
2f50302b39550d92b0d47b3c11e53996
|
|
| BLAKE2b-256 |
89931607362366a60b895d82ca5c57a213d3c0f6e5aea8ad5fcf1ed988ae1161
|
File details
Details for the file retriable_kafka_client-0.6.1-py3-none-any.whl.
File metadata
- Download URL: retriable_kafka_client-0.6.1-py3-none-any.whl
- Upload date:
- Size: 31.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.6 {"installer":{"name":"uv","version":"0.11.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
76fb1b1923e317a3963c40d90b28a53b75e6417a9eb483c7df4357dee22b464a
|
|
| MD5 |
4760d23173ade5e3cd0305b8db059eb4
|
|
| BLAKE2b-256 |
3e8226e45a95a8e1db0d4f01ff7db8299191f46c0e9e9c55e7221e6119c76daa
|