
Create asynchronous data pipelines and deploy to cloud or airflow

Website :loudspeaker: | Discord :sunglasses: | Forum :wave: | Installation :floppy_disk: | Documentation :notebook:

Why Typhoon?

Our vision is a new generation of cloud native, asynchronous orchestrators that can handle highly dynamic workflows with ease. We crafted Typhoon from the ground up to work towards this vision. It's designed to feel familiar while still making very different design decisions where it matters.

Typhoon overview montage

Typhoon + AWS Lambda

A serverless orchestrator has the potential to scale almost without limit while remaining extremely cost efficient. We think AWS Lambda is ideal for this:

  • CloudWatch Events can trigger a Lambda on a schedule, so we get scheduling for free! A scheduler is the most complex piece of an orchestrator. We can do away with it completely and still be sure that our DAGs will always run on time.
  • Lambda is cheap. You get 1 million invocations for free every month.
  • Workflows can be parallelized by running tasks on different instances of the Lambda at the same time. Typhoon DAGs use batching to take full advantage of this.
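
The batching idea in the last bullet can be pictured locally. The sketch below is not Typhoon's Lambda code: it uses a thread pool as a stand-in for separate Lambda invocations, and the names `fan_out` and `process_batch` are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def fan_out(task: Callable[[T], R], batches: Iterable[T], max_workers: int = 4) -> List[R]:
    """Run `task` once per batch, concurrently, and return results in input order.

    Each worker thread stands in for one Lambda invocation (an assumption
    made for illustration only).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, batches))


def process_batch(author: str) -> str:
    # Hypothetical task: a real one would call an API or write a file.
    return author.upper()


if __name__ == "__main__":
    print(fan_out(process_batch, ["J. K. Rowling", "James Clavell"]))
```

Because every batch is handled independently, the same task can be spread over as many workers as there are batches, which is the property the Lambda deployment relies on.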

Typhoon + Airflow

Airflow is great!

It's also the industry standard and will be around for a while. However, we think it can be improved, without even migrating your existing production code.

Typhoon lets you write Airflow DAGs faster :rocket:

**Workflow**: Typhoon YAML DAG --> Typhoon build --> Airflow DAG 

Simplicity and re-usability; a toolkit designed to be loved by Data Engineers :heart:

Key features

  • Pure Python - Easily extend with pure Python. Frameworkless, with no dependencies.
  • Testable Python - Write tests for your tasks in PyTest. Automate DAG testing.
  • Composability - Functions and connections combine like Lego. Very easy to extend.
  • Data sharing - Data flows between tasks, making it intuitive to build tasks.
  • Elegant YAML - Low-code and easy to learn.
  • Code-completion - Fast to compose. (VS Code recommended).
  • Components - Reduce complex tasks (e.g. CSV → S3 → Snowflake) to 1 re-usable task.
  • Components UI - Share your pre-built automation with your team. :raised_hands:
  • Rich CLI & Shell - Inspired by other great command line interfaces and instantly familiar. Intelligent bash/zsh completion.
  • Flexible deployment - Deploy to Airflow. Large reduction in effort, without breaking existing production.
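
To make the "Testable Python" point concrete, here is a minimal sketch of a task written as a plain function together with a PyTest-style test. The function `split_into_batches` is hypothetical and not part of Typhoon; the point is only that a task with no framework dependency can be tested like any other function.

```python
from typing import Dict, Iterator, List


def split_into_batches(rows: List[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Hypothetical task function: yield `rows` in chunks of `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def test_split_into_batches():
    # PyTest collects any function whose name starts with `test_`.
    rows = [{"id": i} for i in range(5)]
    batches = list(split_into_batches(rows, batch_size=2))
    assert [len(b) for b in batches] == [2, 2, 1]
    assert batches[0][0] == {"id": 0}
```

Saved in a file such as `test_batches.py`, this would run with a plain `pytest` call.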

Example YAML DAG

name: favorite_authors
schedule_interval: rate(1 day)

tasks:
  choose_favorites:
    function: typhoon.flow_control.branch
    args:
      branches:
        - J. K. Rowling
        - George R. R. Martin
        - James Clavell

  get_author:
    input: choose_favorites
    function: functions.open_library_api.get_author
    args:
      author: !Py $BATCH

  write_author_json:
    input: get_author
    function: typhoon.filesystem.write_data    
    args:
      hook: !Hook data_lake
      data:  !MultiStep
        - !Py $BATCH['docs']
        - !Py typhoon.data.json_array_to_json_records($1)
      path: !MultiStep 
        - !Py $BATCH['docs'][0]['key']
        - !Py f'/authors/{$1}.json'
      create_intermediate_dirs: True

Favorite Authors: getting the works of my favorite authors from the Open Library API

⚡ Installation

See documentation for more extensive installation instructions and walkthroughs.

With pip (Typhoon standalone)

Install typhoon:

pip install typhoon-orchestrator[dev]

# Create a project
typhoon init hello_world

# Try the CLI
cd hello_world
typhoon status

# Add your connection
typhoon connection add --conn-id data_lake --conn-env local
typhoon connection ls -l

Docs: Detailed local installation instructions. | Hello world.

With Docker and Airflow

To deploy Typhoon with Airflow you need:

  • Docker / Docker Desktop (for now, you must use Git Bash on Windows; there is currently an open issue on WSL2)
  • The docker-compose-af.yml file (download it manually or use curl below)
  • A directory for your TYPHOON_PROJECTS_HOME

The following sets up your project directory and downloads docker-compose-af.yml:

TYPHOON_PROJECTS_HOME="/tmp/typhoon_projects" # Or any other path you prefer
mkdir -p $TYPHOON_PROJECTS_HOME/typhoon_airflow_test
cd $TYPHOON_PROJECTS_HOME/typhoon_airflow_test

# For Windows WSL2 users (optional in other environments)
sudo chown -R $USER: $TYPHOON_PROJECTS_HOME/typhoon_airflow_test
mkdir airflow
mkdir data_lake
mkdir src

curl -LfO https://raw.githubusercontent.com/typhoon-data-org/typhoon-orchestrator/master/docker-compose-af.yml

Important: on Windows Git Bash, run each docker-compose command one at a time. They are quick.

docker-compose -f docker-compose-af.yml run --rm typhoon-af airflow initdb
docker-compose -f docker-compose-af.yml run --rm typhoon-af typhoon status
docker-compose -f docker-compose-af.yml run --rm typhoon-af typhoon connection add --conn-id data_lake --conn-env local  # Adding our first connection!
docker-compose -f docker-compose-af.yml run --rm typhoon-af typhoon dag build --all
docker-compose -f docker-compose-af.yml up -d

This runs a single service, typhoon-af, in a container that has both Airflow and Typhoon installed and ready to use.

You should then be able to check typhoon status and open the Airflow UI at http://localhost:8088
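
If you prefer to check from a script rather than a browser, a small standard-library probe is enough. This is a minimal sketch, not part of Typhoon: the function name `airflow_ui_is_up` is hypothetical, and it only tests whether an HTTP server answers at the address, not whether Airflow itself is healthy.

```python
import urllib.error
import urllib.request


def airflow_ui_is_up(url: str = "http://localhost:8088", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` without a 5xx error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 404).
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: nothing is listening.
        return False


if __name__ == "__main__":
    print("Airflow UI reachable:", airflow_ui_is_up())
```

The container may need a short while after `up -d` before the web server starts answering, so a first negative result is not necessarily a failure.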

Docs: Detailed docker installation instructions. | Development hints.


![Airflow UI](docs/img/airflow_ui_list_after_install.png) *Typhoon DAGs listed in the Airflow UI*

Favorite Authors DAG, as displayed in the Airflow UI

