Extract and load your data reliably from API clients with a native fault-tolerant checkpointing mechanism.
bizon ⚡️
Extract and load your largest data streams with a framework you can trust for billions of records.
Features
- Natively fault-tolerant: Bizon uses a checkpointing mechanism to track progress and recover from the last checkpoint.
- Queue system agnostic: Bizon is agnostic of the queuing system; you can use any queuing system such as Python Queue, Kafka or Redpanda. Thanks to the bizon.queue.Queue interface, adapters can be written for any queuing system.
- Pipeline metrics: Bizon provides exhaustive pipeline metrics and implements OpenTelemetry for tracing. You can monitor:
  - ETAs for completion
  - Number of records processed
  - Completion percentage
  - Latency Source <> Destination
- Lightweight & lean: Bizon has a minimal codebase and only a few dependencies:
  - requests for HTTP requests
  - pyyaml for configuration
  - sqlalchemy for database / warehouse connections
  - pyarrow for Parquet file format
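The checkpoint-and-recover idea from the first feature can be illustrated with a minimal, generic sketch. This is not Bizon's implementation: `fetch_page`, `run_sync`, the `state` dictionary and the `fail_at` parameter are all hypothetical names invented for this example. It only shows the principle of saving a cursor after each loaded page so that a restarted run continues from the last checkpoint.

```python
from typing import Dict, List, Optional, Tuple


def fetch_page(cursor: int, page_size: int = 3, total: int = 10) -> Tuple[List[int], Optional[int]]:
    """Hypothetical paginated source: return one page and the next cursor (None when done)."""
    records = list(range(cursor, min(cursor + page_size, total)))
    next_cursor = cursor + page_size if cursor + page_size < total else None
    return records, next_cursor


def run_sync(state: Dict[str, Optional[int]], sink: List[int], fail_at: Optional[int] = None) -> None:
    """Resume from the saved cursor and checkpoint after every loaded page."""
    cursor = state.get("cursor", 0)
    while cursor is not None:
        if fail_at is not None and cursor >= fail_at:
            # Simulate a crash before this page is extracted.
            raise RuntimeError("simulated crash")
        records, cursor = fetch_page(cursor)
        sink.extend(records)      # load the page into the (in-memory) destination
        state["cursor"] = cursor  # checkpoint only after the page has been loaded


if __name__ == "__main__":
    state: Dict[str, Optional[int]] = {}
    sink: List[int] = []
    try:
        run_sync(state, sink, fail_at=6)
    except RuntimeError:
        pass
    run_sync(state, sink)  # picks up at the saved cursor, no page is loaded twice
    print(sink)
```

In a real pipeline the `state` dictionary would live in a durable store (the role the backend plays in Bizon), so the cursor survives a process restart.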
Installation
pip install bizon
Usage
from yaml import safe_load
from bizon.engine.runner import RunnerFactory

yaml_config = """
source:
  source_name: dummy
  stream_name: creatures
  authentication:
    type: api_key
    params:
      token: dummy_key
destination:
  name: logger
  config:
    dummy: dummy
"""

config = safe_load(yaml_config)
runner = RunnerFactory.create_from_config_dict(config=config)
runner.run()
Backend configuration
Backend is the interface used by Bizon to store its state. It can be configured in the backend section of the configuration file. The following backends are supported:
- sqlite: In-memory SQLite database, useful for testing and development.
- bigquery: Google BigQuery backend, perfect for light setup & production.
- postgres: PostgreSQL backend, for production use and frequent cursor updates.
Queue configuration
Queue is the interface used by Bizon to exchange data between Source and Destination. It can be configured in the queue section of the configuration file. The following queues are supported:
- python_queue: Python Queue, useful for testing and development.
- kafka: Apache Kafka, for production use and high throughput.
Start syncing your data 🚀
Quick setup without any dependencies ✌️
Set the queue configuration to python_queue and the backend configuration to sqlite. This lets you test the pipeline without any external dependencies.
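As a rough sketch, the quick setup can be applied to an existing configuration dictionary in Python. The `engine.queue.type` layout follows the queue examples in this README; placing the backend under `engine.backend.type` is an assumption that this README does not confirm, and `with_local_engine` is a hypothetical helper, not part of Bizon.

```python
from copy import deepcopy
from typing import Any, Dict


def with_local_engine(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` that uses python_queue and sqlite."""
    local = deepcopy(config)  # leave the caller's config untouched
    engine = local.setdefault("engine", {})
    engine["queue"] = {"type": "python_queue"}
    # ASSUMPTION: the backend section sits under `engine`, next to `queue`.
    engine["backend"] = {"type": "sqlite"}
    return local


if __name__ == "__main__":
    base = {
        "source": {"source_name": "dummy", "stream_name": "creatures"},
        "destination": {"name": "logger", "config": {"dummy": "dummy"}},
    }
    print(with_local_engine(base)["engine"])
```

The resulting dictionary could then be passed to `RunnerFactory.create_from_config_dict(config=...)` as in the Usage section, provided the assumed key layout matches your Bizon version.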
Local Kafka setup
To test the pipeline with Kafka, you can use docker compose to set up Kafka or Redpanda locally.
Kafka
docker compose --file ./scripts/kafka-compose.yml up # Kafka
docker compose --file ./scripts/redpanda-compose.yml up # Redpanda
In your YAML configuration, set the queue configuration to Kafka under engine:
engine:
  queue:
    type: kafka
    config:
      queue:
        bootstrap_server: localhost:9092 # Kafka: 9092 & Redpanda: 19092
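Before starting a run against the local broker, it can help to confirm that the port is actually accepting connections. The following is a small standard-library sketch; `broker_is_reachable` is a hypothetical helper, not part of Bizon, and it only checks that a TCP connection can be opened, not that the broker is healthy.

```python
import socket


def broker_is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from the configuration comment above: Kafka 9092, Redpanda 19092.
    for name, port in (("Kafka", 9092), ("Redpanda", 19092)):
        print(name, "reachable:", broker_is_reachable("localhost", port))
```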
RabbitMQ
docker compose --file ./scripts/rabbitmq-compose.yml up
In your YAML configuration, set the queue configuration to RabbitMQ under engine:
engine:
  queue:
    type: rabbitmq
    config:
      queue:
        host: localhost
        queue_name: bizon