Create asynchronous data pipelines and deploy to cloud or Airflow
Elegant YAML DAGS for Data Pipelines
Deploy to your existing Airflow.
Why Typhoon (+ Airflow)?
Airflow is great!
Typhoon lets you write Airflow DAGS faster :rocket::
**Workflow**: Typhoon YAML DAG --> Typhoon build --> Airflow DAG
Simplicity and re-usability; a toolkit designed to be loved by Data Engineers :heart:
Key features
- Elegant: YAML; low-code and easy to learn.
- Code-completion: fast to compose (VS Code recommended).
- Data sharing: data flows between tasks, which makes pipelines intuitive to follow.
- Composability: functions and connections combine like Lego.
- Components: reduce complex tasks to 1 re-usable task. Packaged examples are provided.
- UI: share pre-built components (data pipelines) with your team :raised_hands:
- Rich CLI & Shell: inspired by others; instantly familiar.
- Testable Tasks: automate DAG task tests.
- Testable Python: test functions or full DAGs with PyTest.
Example YAML DAG
```yaml linenums="1"
name: favorite_authors
schedule_interval: rate(1 day)
tasks:
  choose_favorites:
    function: typhoon.flow_control.branch
    args:
      branches:
        - J. K. Rowling
        - George R. R. Martin
        - James Clavell
  get_author:
    input: choose_favorites
    function: functions.open_library_api.get_author
    args:
      author: !Py $BATCH
  write_author_json:
    input: get_author
    function: typhoon.filesystem.write_data
    args:
      hook: !Hook data_lake
      data: !MultiStep
        - !Py $BATCH['docs']
        - !Py typhoon.data.json_array_to_json_records($1)
      path: !MultiStep
        - !Py $BATCH['docs'][0]['key']
        - !Py f'/authors/{$1}.json'
      create_intermediate_dirs: True
```
Getting the works of my favorite authors from Open Library API
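To make the data flow in the DAG above easier to follow, here is a rough plain-Python sketch of what the three tasks do to each batch. It is not how Typhoon runs the DAG internally. It assumes that `branch` yields each listed author as a separate batch, that `$BATCH` is the value handed over by the upstream task, and that `$1` inside a `!MultiStep` list refers to the result of the first step. `fake_get_author`, `run_pipeline` and the stand-in `json_array_to_json_records` are hypothetical helpers written for this illustration, and nothing is written to a real hook.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List


def branch(branches: Iterable[Any]) -> Iterator[Any]:
    """Yield each branch as its own batch (stand-in for typhoon.flow_control.branch)."""
    yield from branches


def json_array_to_json_records(docs: List[Dict[str, Any]]) -> str:
    """Serialize a list of dicts as newline-delimited JSON (assumed behaviour)."""
    return "\n".join(json.dumps(doc) for doc in docs)


def fake_get_author(author: str) -> Dict[str, Any]:
    """Hypothetical stub that replaces the Open Library call with canned data."""
    key = "".join(ch for ch in author if ch.isalpha()).upper()
    return {"docs": [{"key": key, "name": author}]}


def run_pipeline(
    authors: Iterable[str],
    get_author: Callable[[str], Dict[str, Any]],
) -> Dict[str, str]:
    """Return {path: data} per author instead of writing through a hook."""
    written: Dict[str, str] = {}
    for batch in branch(authors):              # choose_favorites
        response = get_author(batch)           # get_author: author = $BATCH
        # write_author_json: `response` plays the role of $BATCH here.
        data_step_1 = response["docs"]                      # data, step 1
        data = json_array_to_json_records(data_step_1)      # data, step 2 uses $1
        path_step_1 = response["docs"][0]["key"]            # path, step 1
        path = f"/authors/{path_step_1}.json"               # path, step 2 uses $1
        written[path] = data
    return written


if __name__ == "__main__":
    for path in run_pipeline(["J. K. Rowling", "James Clavell"], fake_get_author):
        print(path)
```

Each author flows through the whole chain independently, which is why the downstream tasks only ever refer to `$BATCH` rather than to the full list.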
Installation
See documentation for detailed guidance on installation and walkthroughs.
With pip (Typhoon standalone)
Install Typhoon:
pip install "typhoon-orchestrator[dev]"
Optionally, create and activate a virtualenv before installing.
Then:
typhoon init hello_world
cd hello_world
typhoon status
This creates a directory named hello_world that serves as an example project. As with git, when you cd into the directory Typhoon detects that it is a Typhoon project and treats that directory as its base directory (TYPHOON_HOME).
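One way git-style project detection can work is to walk up from the current directory until a marker file is found. The sketch below only illustrates that idea and is not Typhoon's actual implementation. The marker name `typhoon.cfg` and the function `find_project_home` are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker name; the file Typhoon actually looks for may differ.
MARKER = "typhoon.cfg"


def find_project_home(start: Path, marker: str = MARKER) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `marker`."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    return None  # not inside a project


if __name__ == "__main__":
    home = find_project_home(Path.cwd())
    print(home if home is not None else "not inside a project")
```

With this approach a command run from any subdirectory of the project still resolves the same base directory.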
Adding connections
You can add a default connection from the CLI as follows:
typhoon connection add --conn-id data_lake --conn-env local
# Check that it was added
typhoon connection ls -l
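The `--conn-id` and `--conn-env` pair suggests that one connection name can point at different settings per environment, so a DAG can keep referring to `data_lake` while the backing store changes. The sketch below shows that lookup idea in plain Python. The `DEFINITIONS` mapping, its field names and `resolve_connection` are hypothetical and do not reflect Typhoon's documented schema.

```python
from typing import Any, Dict

# Hypothetical connection definitions: conn_id -> environment -> settings.
# Field names and values are illustrative only.
DEFINITIONS: Dict[str, Dict[str, Dict[str, Any]]] = {
    "data_lake": {
        "local": {"conn_type": "local_storage", "base_path": "/tmp/data_lake"},
        "prod": {"conn_type": "s3", "bucket": "example-bucket"},
    },
}


def resolve_connection(
    conn_id: str,
    conn_env: str,
    definitions: Dict[str, Dict[str, Dict[str, Any]]] = DEFINITIONS,
) -> Dict[str, Any]:
    """Pick the settings for one connection id in one environment."""
    if conn_id not in definitions:
        raise KeyError(f"unknown connection id: {conn_id!r}")
    environments = definitions[conn_id]
    if conn_env not in environments:
        raise KeyError(f"connection {conn_id!r} has no environment {conn_env!r}")
    return environments[conn_env]


if __name__ == "__main__":
    print(resolve_connection("data_lake", "local"))
```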
With Docker and Airflow
To deploy Typhoon with Airflow you need:
- Docker / Docker Desktop (on Windows you must use WSL2)
- The [docker-compose.yaml][1] file (or fetch it with curl as shown below)
- A directory for your TYPHOON_PROJECTS_HOME
The following sets up your project directory and downloads docker-compose-af.yml:
TYPHOON_PROJECTS_HOME="/tmp/typhoon_projects" # Or any other path you prefer
mkdir -p $TYPHOON_PROJECTS_HOME/typhoon_airflow_test
cd $TYPHOON_PROJECTS_HOME/typhoon_airflow_test
mkdir src
curl -LfO https://raw.githubusercontent.com/typhoon-data-org/typhoon-orchestrator/master/docker-compose-af.yml
docker compose -f docker-compose-af.yml up -d
docker exec -it typhoon-af bash # Then you're in the typhoon home.
airflow initdb # !! To initiate Airflow DB !!
typhoon status # To see status of dags & connections
typhoon dag build --all # Build the example DAGS
exit # exits docker
docker restart typhoon-af # Wait while docker restarts
This runs a container with a single service, typhoon-af, which has both Airflow and Typhoon installed and ready to use.
You should then be able to check typhoon status and open the Airflow UI at http://localhost:8088
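Since the container takes a while to come back after the restart, a small polling script can tell you when the UI is reachable. This is a minimal sketch using only the Python standard library. It assumes the UI is served at the http://localhost:8088 address mentioned above, and `is_up` and `wait_until_up` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if `url` answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timed out, or similar


def wait_until_up(
    url: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], bool] = is_up,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` up to `attempts` times, pausing `delay` seconds between tries."""
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False


if __name__ == "__main__":
    ready = wait_until_up("http://localhost:8088")
    print("Airflow UI is up" if ready else "Airflow UI did not respond")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running container.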
Development hints are in the docs.
Hashes for typhoon-orchestrator-0.0.40.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 6582bd433dde44e7a691f7d146ba03c75dfe2fa8a1ab45a5fac67574d46cc070
MD5 | 69d05c6d53739054f4c8e882f9a3dbdf
BLAKE2b-256 | 1b41bfa5c76e4472d7cfd0d8f9b04be15f14d63da6cdc03db70355e01438c48d

Hashes for typhoon_orchestrator-0.0.40-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ab14a378fbc677087e9d7b810c662e4c5a6737012d8459323d16d6eb17d7a4ba
MD5 | f77a00307b99371fbede3b611347364f
BLAKE2b-256 | 61e337105a7416a005b8efa0cd0c51f86d394869a76feac617b96aae77ba2ee6
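
If you download one of these files by hand, you can compare its SHA256 digest against the value listed above. The sketch below uses Python's standard `hashlib`. The helper names `sha256_of` and `matches` are made up for this example, and the file path in the usage section assumes the archive sits in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location of the downloaded source distribution.
    archive = Path("typhoon-orchestrator-0.0.40.tar.gz")
    expected = "6582bd433dde44e7a691f7d146ba03c75dfe2fa8a1ab45a5fac67574d46cc070"
    if archive.exists():
        print("OK" if matches(archive, expected) else "MISMATCH")
    else:
        print(f"{archive} not found")
```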