Extract and load your data reliably from API clients with a native fault-tolerance and checkpointing mechanism.

bizon ⚡️

Extract and load your largest data streams with a framework you can trust with billions of records.

Features

  • Natively fault-tolerant: Bizon uses a checkpointing mechanism to track progress, so an interrupted pipeline resumes from the last checkpoint.
  • Queue system agnostic: Bizon does not depend on a specific queuing system. You can use Python Queue, Kafka, or Redpanda, and adapters for other systems can be written against the bizon.queue.Queue interface.
  • Pipeline metrics: Bizon provides exhaustive pipeline metrics and implements OpenTelemetry for tracing. You can monitor:
    • ETAs for completion
    • Number of records processed
    • Completion percentage
    • Latency Source <> Destination
  • Lightweight & lean: Bizon has a minimal codebase and uses only a few dependencies:
    • requests for HTTP requests
    • pyyaml for configuration
    • sqlalchemy for database / warehouse connections
    • pyarrow for Parquet file format
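
To illustrate the checkpointing idea from the feature list, here is a minimal sketch in plain Python. It is not Bizon's implementation: the function names, the JSON checkpoint file, and the integer offset are all assumptions made for this example. It only shows how persisting a cursor after each record lets a restarted run continue where the previous one stopped.

```python
import json
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> int:
    """Return the last committed offset, or 0 if no checkpoint exists yet."""
    if path.exists():
        return json.loads(path.read_text())["offset"]
    return 0


def save_checkpoint(path: Path, offset: int) -> None:
    """Persist the offset of the next record to process."""
    path.write_text(json.dumps({"offset": offset}))


def run_sync(records, sink, path: Path, fail_at=None) -> None:
    """Copy records to sink, committing a checkpoint after each record.

    fail_at (hypothetical, for the demo only) simulates a crash just before
    the record at that index is processed.
    """
    start = load_checkpoint(path)
    for offset in range(start, len(records)):
        if fail_at is not None and offset == fail_at:
            raise RuntimeError("simulated crash")
        sink.append(records[offset])
        save_checkpoint(path, offset + 1)


records = [f"creature-{i}" for i in range(5)]
sink = []
checkpoint_path = Path(tempfile.mkdtemp()) / "checkpoint.json"

try:
    run_sync(records, sink, checkpoint_path, fail_at=3)
except RuntimeError:
    pass  # the first run stops after three records

# The second run reads the checkpoint and resumes at offset 3.
run_sync(records, sink, checkpoint_path)
```

After the second run, every record has been delivered once and the stored offset equals the number of records. A real pipeline would commit checkpoints to a durable backend rather than a temporary file.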

Installation

pip install bizon

Usage

from yaml import safe_load
from bizon.engine.runner import RunnerFactory

yaml_config = """
source:
  source_name: dummy
  stream_name: creatures
  authentication:
    type: api_key
    params:
      token: dummy_key

destination:
  name: logger
  config:
    dummy: dummy
"""

config = safe_load(yaml_config)
runner = RunnerFactory.create_from_config_dict(config=config)
runner.run()

Backend configuration

The backend is the interface Bizon uses to store its state. It is configured in the backend section of the configuration file. The following backends are supported:

  • sqlite: In-memory SQLite database, useful for testing and development.
  • bigquery: Google BigQuery backend, suited to lightweight setups and production.
  • postgres: PostgreSQL backend, for production use and frequent cursor updates.

Queue configuration

The queue is the interface Bizon uses to exchange data between Source and Destination. It is configured in the queue section of the configuration file. The following queues are supported:

  • python_queue: Python Queue, useful for testing and development.
  • kafka: Apache Kafka, for production use and high throughput.
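
The adapter idea behind the queue options can be sketched as follows. This is an illustration only: RecordQueue, PythonQueueAdapter, transfer, and their method names are hypothetical and are not the actual bizon.queue.Queue interface, whose methods are not described here. The sketch shows how a source and a destination can exchange records through an abstract queue, so that a Kafka-backed adapter could replace the in-process one without changing the calling code.

```python
import queue
from abc import ABC, abstractmethod
from typing import Any, Iterable, List, Optional


class RecordQueue(ABC):
    """Hypothetical minimal queue interface (not Bizon's real one)."""

    @abstractmethod
    def put(self, record: Any) -> None:
        """Hand a record over to the queue."""

    @abstractmethod
    def get(self, timeout: Optional[float] = None) -> Any:
        """Fetch the next record, waiting up to timeout seconds."""


class PythonQueueAdapter(RecordQueue):
    """Adapter backed by the standard library's thread-safe queue.Queue."""

    def __init__(self, maxsize: int = 0) -> None:
        # maxsize=0 means the queue is unbounded.
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=maxsize)

    def put(self, record: Any) -> None:
        self._queue.put(record)

    def get(self, timeout: Optional[float] = None) -> Any:
        return self._queue.get(timeout=timeout)


def transfer(source_records: Iterable[Any], q: RecordQueue, destination: List[Any]) -> None:
    """Push all source records through the queue, then drain them into destination."""
    records = list(source_records)
    for record in records:
        q.put(record)
    for _ in records:
        destination.append(q.get(timeout=1))


destination: List[str] = []
transfer(["dragon", "griffin", "kraken"], PythonQueueAdapter(), destination)
```

Because transfer depends only on the abstract class, the choice of queuing system stays a configuration concern rather than a code change.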

Start syncing your data 🚀

Quick setup without any dependencies ✌️

Set the queue type to python_queue and the backend type to sqlite to test the pipeline without any external dependencies.
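
A dependency-free configuration might be assembled like the sketch below. The queue layout (engine, queue, type) follows the Kafka examples in this README; placing backend next to queue under engine with a type key is an assumption, and either section may require additional config keys that are not documented here. The helper with_local_engine is hypothetical.

```python
from copy import deepcopy

# Source and destination taken from the Usage example.
base_config = {
    "source": {
        "source_name": "dummy",
        "stream_name": "creatures",
        "authentication": {"type": "api_key", "params": {"token": "dummy_key"}},
    },
    "destination": {"name": "logger", "config": {"dummy": "dummy"}},
}


def with_local_engine(config: dict) -> dict:
    """Return a copy of config with an in-process queue and a SQLite backend.

    The 'backend' key and its position under 'engine' are assumptions.
    """
    local = deepcopy(config)
    local["engine"] = {
        "queue": {"type": "python_queue"},
        "backend": {"type": "sqlite"},
    }
    return local


config = with_local_engine(base_config)

# With bizon installed, the dict could then be passed to the runner:
# from bizon.engine.runner import RunnerFactory
# RunnerFactory.create_from_config_dict(config=config).run()
```

The original dictionary is left untouched, so the same base configuration can be reused with a Kafka engine section later.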

Local Kafka setup

To test the pipeline with Kafka, you can use docker compose to set up Kafka or Redpanda locally.

Kafka

docker compose --file ./scripts/kafka-compose.yml up

In your YAML configuration, set the queue configuration to Kafka under engine:

engine:
  queue:
    type: kafka
    config:
      bootstrap_servers: localhost:9092

Redpanda

docker compose --file ./scripts/redpanda-compose.yml up

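Redpanda is compatible with the Kafka API, so the queue type stays kafka. In your YAML configuration, set the queue configuration under engine and point bootstrap_servers at the Redpanda port: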

engine:
  queue:
    type: kafka
    config:
      bootstrap_servers: localhost:19092
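
Before starting a pipeline against a local broker, it can help to confirm that the container is actually listening. The following is a small sketch using only the standard library; broker_reachable is a hypothetical helper, not part of Bizon, and it only checks that a TCP connection can be opened, not that the broker is healthy.

```python
import socket


def broker_reachable(bootstrap_server: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to 'host:port' can be opened."""
    host, _, port = bootstrap_server.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    # Ports taken from the Kafka and Redpanda examples above.
    for server in ("localhost:9092", "localhost:19092"):
        print(server, "reachable" if broker_reachable(server) else "not reachable")
```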
