Skip to main content

Spark stream consumption commit in kafka consumer group

Project description

freeza-offset

freeza-offset

What is it?

freeza-offset is a Python package that provides a simple way to commit the offset consumed by Spark Streaming in Kafka's ConsumerGroup.

Main Features

Here are just a few of the things that freeza-offset does well:

  • Commits the offset consumed in kafka
  • Tracking Spark consumption lag at Kafka
  • The offset is not just in control of the spark

Where to get it

The source code is currently hosted on GitHub at: https://github.com/HashLoad/freeza-offset

Binary installers for the latest released version are available at the Python package index and on conda.

# conda
conda install freeza-offset
# PyPI
pip install freeza-offset
# Databricks
dbutils.library.installPyPI("freeza-offset")

Dependencies

Installation from sources

In the freeza-offset directory (same one where you found this file after cloning the git repo), execute:

python setup.py install

Example:

pip install freeza-offset
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("FreezaCommitTest") \
    .getOrCreate()
df = spark \
  .readStream \
  .format("kafka") \
  .option("kafka.bootstrap.servers", "kafka1:9092,kafka2:9092,kafka3:9092") \
  .option("subscribe", "topic-name") \
  .option("startingOffsets", "earliest") \
  .option("kafka.group.id", "spark-freeza-runner") \
  .load()
df.selectExpr("key", "value")
qry = df.writeStream \
    .format("console") \
    .option("truncate","false") \
    .start()
import freeza
tr = freeza.start_commiter_thread(
    query=qry,
    bootstrap_servers=bootstrap_servers,
    group_id="spark-freeza-commiter"
)
tr.isAlive()

Getting Help

For usage questions, the best place to go to is open new issue

Contributing to freeza-offset

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

freeza-offset-1.0.10.tar.gz (3.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

freeza_offset-1.0.10-py3-none-any.whl (4.4 kB view details)

Uploaded Python 3

File details

Details for the file freeza-offset-1.0.10.tar.gz.

File metadata

  • Download URL: freeza-offset-1.0.10.tar.gz
  • Upload date:
  • Size: 3.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.1.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.2

File hashes

Hashes for freeza-offset-1.0.10.tar.gz
Algorithm Hash digest
SHA256 131d5e2ed99db2dd5e758cd5c4cbd7289f10099754d15dbc0808a073f19ff052
MD5 e67bf2edf079ad0ae7c3b2fbcdd63f65
BLAKE2b-256 3f1d2a121a1c202b13967eb555a0543bd77609bee3207f19d3269977c72eb842

See more details on using hashes here.

File details

Details for the file freeza_offset-1.0.10-py3-none-any.whl.

File metadata

  • Download URL: freeza_offset-1.0.10-py3-none-any.whl
  • Upload date:
  • Size: 4.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.1.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.2

File hashes

Hashes for freeza_offset-1.0.10-py3-none-any.whl
Algorithm Hash digest
SHA256 ae65a9001299992c5fd5438ef0d6723e37b6df12a438fce8c2c002cdfb89c31b
MD5 8da9ef393df4987c166445a1427a1620
BLAKE2b-256 5dd333c4c38c2b36354e48134976cb0f62b02e8dca4175d11e0b25ffdeae6c2b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page