Spark stream consumption commit in kafka consumer group
Project description
freeza-offset
What is it?
freeza-offset is a Python package that provides a simple way to commit the offset consumed by Spark Streaming in Kafka's ConsumerGroup.
Main Features
Here are just a few of the things that freeza-offset does well:
- Commits the offset consumed in kafka
- Tracking Spark consumption lag at Kafka
- The offset is not just in control of the spark
Where to get it
The source code is currently hosted on GitHub at: https://github.com/HashLoad/freeza-offset
Binary installers for the latest released version are available at the Python package index and on conda.
# conda
conda install freeza-offset
# PyPI
pip install freeza-offset
# Databricks
dbutils.library.installPyPI("freeza-offset")
Dependencies
Installation from sources
In the freeza-offset directory (same one where you found this file after
cloning the git repo), execute:
python setup.py install
Example:
pip install freeza-offset
from pyspark.sql import SparkSession
spark = SparkSession \
.builder \
.appName("FreezaCommitTest") \
.getOrCreate()
df = spark \
.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", "kafka1:9092,kafka2:9092,kafka3:9092") \
.option("subscribe", "topic-name") \
.option("startingOffsets", "earliest") \
.option("kafka.group.id", "spark-freeza-runner") \
.load()
df.selectExpr("key", "value")
qry = df.writeStream \
.format("console") \
.option("truncate","false") \
.start()
import freeza
tr = freeza.start_commiter_thread(
query=qry,
bootstrap_servers=bootstrap_servers,
group_id="spark-freeza-commiter"
)
tr.isAlive()
Getting Help
For usage questions, the best place to go to is open new issue
Contributing to freeza-offset
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.
License
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file freeza-offset-1.0.10.tar.gz.
File metadata
- Download URL: freeza-offset-1.0.10.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.1.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
131d5e2ed99db2dd5e758cd5c4cbd7289f10099754d15dbc0808a073f19ff052
|
|
| MD5 |
e67bf2edf079ad0ae7c3b2fbcdd63f65
|
|
| BLAKE2b-256 |
3f1d2a121a1c202b13967eb555a0543bd77609bee3207f19d3269977c72eb842
|
File details
Details for the file freeza_offset-1.0.10-py3-none-any.whl.
File metadata
- Download URL: freeza_offset-1.0.10-py3-none-any.whl
- Upload date:
- Size: 4.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.1.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ae65a9001299992c5fd5438ef0d6723e37b6df12a438fce8c2c002cdfb89c31b
|
|
| MD5 |
8da9ef393df4987c166445a1427a1620
|
|
| BLAKE2b-256 |
5dd333c4c38c2b36354e48134976cb0f62b02e8dca4175d11e0b25ffdeae6c2b
|