PDP Kafka package

PDP Kafka Reader

Requirements

  • git
  • Python 3.6+
  • pip

You also need access to the git repository: generate an SSH key and register it in Skyway Bitbucket.

Install

pip install pdp_kafka_reader

Usage

CLI

You can use the kafka-reader CLI tool to extract data from a specific topic into a file. Example usage:

kafka-reader export-avro -k kafka-options.json -s schema.json -t my_kafka_topic -o out.parquet

Check all options with kafka-reader -h.
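
The command above expects two JSON files, kafka-options.json and schema.json, whose exact format is not described here. As a rough sketch, assuming the options file holds the same Spark Kafka options used in the Python example and the schema file holds an Avro record schema, both could be generated like this. The helper write_cli_inputs and the record and field names are hypothetical and only serve as illustration.

```python
import json
import tempfile
from pathlib import Path

# Assumption: the -k file mirrors the kafka_options dict from the Python
# example. The topic is passed separately with -t, so it is left out here.
kafka_options = {
    "kafka.bootstrap.servers": "my-kafka-server:9092",
}

# Illustrative Avro record schema; record and field names are made up.
avro_schema = {
    "type": "record",
    "name": "ExampleEvent",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "value", "type": "long"},
    ],
}


def write_cli_inputs(directory):
    """Write kafka-options.json and schema.json into `directory`."""
    directory = Path(directory)
    options_path = directory / "kafka-options.json"
    schema_path = directory / "schema.json"
    options_path.write_text(json.dumps(kafka_options, indent=2))
    schema_path.write_text(json.dumps(avro_schema, indent=2))
    return options_path, schema_path


# Write both files into a scratch directory and show where they landed.
work_dir = tempfile.mkdtemp()
options_path, schema_path = write_cli_inputs(work_dir)
print(options_path)
print(schema_path)
```

The resulting paths can then be passed to -k and -s. Check kafka-reader -h to confirm which options the tool actually reads.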

Python KafkaReader

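from pyspark.sql import SparkSession

from pdp_kafka_reader.kafka_reader import KafkaAvroReader

# KafkaAvroReader takes an existing SparkSession; reuse or create one here.
spark = SparkSession.builder.getOrCreate()

# Options for Spark's Kafka source.
kafka_options = {
    "kafka.bootstrap.servers": "my-kafka-server:9092",
    "subscribe": "test_avro"
}

# Read the Avro schema as a JSON string.
with open("schema.json") as schema_file:
    avro_schema = schema_file.read()

reader = KafkaAvroReader(spark)
df = reader.read_avro(kafka_options, avro_schema, "my_kafka_topic")
df.show()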
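
The example above reads schema.json as a raw string and hands it to read_avro. A malformed schema would likely surface only as a Spark error at read time, so it may help to check the text first. The following is a minimal sketch using only the standard library. It assumes the topic's values are Avro records, and check_avro_record_schema is a hypothetical helper, not part of pdp_kafka_reader.

```python
import json


def check_avro_record_schema(schema_text):
    """Return schema_text unchanged if it parses as an Avro record schema.

    Raises ValueError otherwise. This only checks the top-level shape,
    not the full Avro specification.
    """
    try:
        schema = json.loads(schema_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"schema is not valid JSON: {exc}") from exc
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("expected a top-level Avro record schema")
    if not isinstance(schema.get("fields"), list):
        raise ValueError("an Avro record schema needs a 'fields' list")
    return schema_text


# Illustrative schema; record and field names are made up.
example_schema = json.dumps({
    "type": "record",
    "name": "ExampleEvent",
    "fields": [{"name": "id", "type": "string"}],
})

checked = check_avro_record_schema(example_schema)
print(checked == example_schema)
```

The returned string can be passed to read_avro in place of the raw file contents.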

Testing

The testing environment is defined in docker-compose.yml. Start the Docker containers, then run tox.
