An Apache Airflow provider for Bacalhau.
Apache Airflow Provider for Bacalhau
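This is bacalhau-airflow, a Python package that includes an Apache Airflow provider for Bacalhau.
What's a provider? In Airflow, a provider is a separately installed package that adds integrations, such as operators, hooks, and sensors, for an external service.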
Features
- Create Airflow tasks that run on Bacalhau (via a custom operator)
- Support for sharded jobs: output shards can be passed downstream (via XComs)
- Coming soon:
  - Lineage (OpenLineage)
  - Various working code examples
  - Hosting instructions
Requirements
- Python 3.8+
- bacalhau-sdk
- apache-airflow 2.4+
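The requirements above can be checked from Python before installing the provider. The following is a minimal sketch, not part of this package: the helper names `version_tuple` and `check_requirements` are hypothetical, and it assumes the distributions are published under the names listed above (`bacalhau-sdk`, `apache-airflow`).

```python
import re
import sys
from importlib import metadata  # standard library since Python 3.8


def version_tuple(version, parts=2):
    """Return the leading numeric components of a version string as a tuple."""
    numbers = []
    for piece in version.split(".")[:parts]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    # Pad short versions such as "2" so comparisons stay well defined.
    while len(numbers) < parts:
        numbers.append(0)
    return tuple(numbers)


def check_requirements():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    # (distribution name, minimum version or None when any version is accepted)
    for dist, minimum in (("bacalhau-sdk", None), ("apache-airflow", (2, 4))):
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            problems.append(f"{dist} is not installed")
            continue
        if minimum is not None and version_tuple(installed) < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{dist} {installed} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Comparing only the major and minor components keeps the sketch dependency-free; a stricter check could use the `packaging` library instead.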
Installation
From PyPI
$ pip install bacalhau-airflow
From source
Clone the public repository:
$ git clone https://github.com/bacalhau-project/bacalhau/
Once you have a copy of the source, you can install it with:
$ cd bacalhau/integration/airflow/
$ pip install .
Usage (work in progress)
AIRFLOW_VERSION=2.4.1
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
export AIRFLOW_HOME=~/airflow
airflow db init
export AIRFLOW_HOME=~/airflow
airflow standalone
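The install commands above pin Airflow's dependencies through a constraints file whose URL depends on both the Airflow version and the local Python version. As an illustration of how that URL is assembled, here is a small Python sketch; the function name `constraint_url` is hypothetical and is not part of Airflow or of this package.

```python
import sys
from typing import Optional


def constraint_url(airflow_version: str, python_version: Optional[str] = None) -> str:
    """Build the Airflow constraints URL for the given Airflow and Python versions."""
    if python_version is None:
        # Same "major.minor" value the shell pipeline extracts from `python --version`.
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


if __name__ == "__main__":
    airflow_version = "2.4.1"
    url = constraint_url(airflow_version)
    # Print the equivalent pip command rather than running it.
    print(f'pip install "apache-airflow=={airflow_version}" --constraint "{url}"')
```

Printing the command instead of executing it makes it easy to inspect the resolved URL before installing anything.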
Development
$ pip install -r dev-requirements.txt
Unit tests
$ tox
You can also skip tox and run pytest directly in your own dev environment.