strefi
Stream each new row of a file and write it to Kafka.
Installation
PyPi
pip install strefi
Git
git clone https://github.com/VictorMeyer77/strefi.git
cd strefi
make virtualenv # if you want to create a new environment
source .venv/bin/activate # if you want to activate the new environment
make install
Usage
Read the complete example for more details.
usage: strefi [-h] [-c CONFIG] [-i JOBID] [-l LOG] command
Stream each new rows of a file and write in kafka
positional arguments:
command "start" to launch stream or "stop" to kill stream
options:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
configuration file path
-i JOBID, --jobid JOBID
stream id
-l LOG, --log LOG log configuration file path (configparser file format)
Launch job
strefi start -c config.json
Stop a job
strefi stop -i {job_id}
Stop all jobs
strefi stop -i all
List job statuses
strefi ls
Configuration
Strefi configuration is stored in a simple JSON file.
{
"producer":{
"bootstrap_servers":"localhost:9092",
"acks":0,
"retries":0
},
"headers":{
"version":"0.1",
"type":"json"
},
"defaults":{
"key_one":"value_one",
"key_two":"value_two"
},
"files":{
"/path/to/file_1":"target_topic",
"/path/to/file_2":"target_topic"
}
}
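As a quick sanity check, the configuration file can be loaded with the standard json module (this snippet is illustrative, not part of strefi itself):

```python
import json

# Inline copy of the example configuration above.
config_text = """
{
  "producer": {"bootstrap_servers": "localhost:9092", "acks": 0, "retries": 0},
  "headers": {"version": "0.1", "type": "json"},
  "defaults": {"key_one": "value_one", "key_two": "value_two"},
  "files": {"/path/to/file_1": "target_topic", "/path/to/file_2": "target_topic"}
}
"""

config = json.loads(config_text)

# The four expected top-level sections.
print(sorted(config))
```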
files
Specify in the "files" objects the paths of all files you want stream. The field key is file path and the field value is the topic.
"files":{
"/path/to/file_1":"target_topic",
"/path/to/file_2":"target_topic"
}
producer
Producer configuration must have at least the field bootstrap_servers. All fields will be passed as parameters to the KafkaProducer. One producer is created for each file to stream, but all producers share the same configuration.
"producer":{
"bootstrap_servers":"localhost:9092",
"acks":0,
"retries":0
}
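A sketch of how this presumably maps onto producers, assuming the "producer" fields are unpacked as KafkaProducer keyword arguments (the KafkaProducer call is commented out so the snippet runs without a broker):

```python
# Example configuration fragments from above.
producer_conf = {"bootstrap_servers": "localhost:9092", "acks": 0, "retries": 0}
files = {"/path/to/file_1": "target_topic", "/path/to/file_2": "target_topic"}

# One producer per streamed file, all built from the same configuration:
# from kafka import KafkaProducer              # requires kafka-python
# producers = {path: KafkaProducer(**producer_conf) for path in files}

# Stand-in showing the same per-file fan-out without a Kafka dependency.
producers = {path: dict(producer_conf) for path in files}
print(len(producers))
```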
defaults
This field can be empty. In that case, the record sent to Kafka contains only the streamed file path and the file row.
{"file": "/path/to/file_1", "row": "last file row"}
You can enhance records sent to the topic with the "defaults" object.
"defaults":{
"key_one":"value_one",
"key_two":"value_two"
}
With this configuration, the record sent to Kafka also includes these values.
{"file":"/path/to/file_1", "row":"last file row", "key_one":"value_one","key_two":"value_two"}
headers
You can join headers with the record with the "headers" field. It can be empty if you don't want headers.
"headers":{
"version":"0.1",
"type":"json"
}
These headers will be converted into the following list of tuples. Header keys must be strings, and values will be encoded as bytes.
[("version", b"0.1"), ("type", b"json")]
License
strefi is released under GPL-3.0 license. See LICENSE.