
Build status badges: **Travis CI** | **CWL conformance tests**

# cwl-airflow

### About
Python package to extend **Apache-Airflow 1.9.0**
functionality with **CWL v1.0** support.

### Check it out
1. Install *cwl-airflow*

       $ pip3 install cwl-airflow --user --find-links

2. Run the *demo*

       $ cwl-airflow demo --auto

3. Open your [web browser](http://localhost:8080/admin/) to see the progress

### Read this if you have trouble with installation
1. Check the requirements
   - Ubuntu 16.04.4
   - python 3.5.2
   - pip3

         python3 --user

   - setuptools

         pip3 install setuptools --user

   - docker

         sudo apt-get update
         sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
         curl -fsSL | sudo apt-key add -
         sudo add-apt-repository "deb [arch=amd64] $(lsb_release -cs) stable"
         sudo apt-get update
         sudo apt-get install docker-ce
         sudo groupadd docker
         sudo usermod -aG docker $USER

     Log out and log back in so that your group membership is re-evaluated.
   - python3-dev

         sudo apt-get install python3-dev
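The requirements above can be verified before installing; this is a minimal sketch (the `check_prerequisites` helper is illustrative and not part of cwl-airflow):

```python
import shutil
import sys

def check_prerequisites():
    """Report whether the interpreter and tools from the list above are present."""
    checks = {
        # the list above targets python 3.5.2, so require at least 3.5
        "python >= 3.5": sys.version_info >= (3, 5),
        "pip3 on PATH": shutil.which("pip3") is not None,
        "docker on PATH": shutil.which("docker") is not None,
    }
    for name, ok in checks.items():
        print("{}: {}".format(name, "ok" if ok else "MISSING"))
    return all(checks.values())

if __name__ == "__main__":
    check_prerequisites()
```

If `docker` is reported missing after installation, remember that group membership is only re-evaluated after logging out and back in.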

### Configuration
1. Initialize `cwl-airflow` with the following command

       $ cwl-airflow init  # consider the --refresh=5 --workers=4 options if you want the webserver to react faster

2. If you already have **Apache-Airflow v1.9.0**
   installed and configured, you may skip this step

       $ airflow initdb

### Running
#### Batch mode
To automatically monitor and process all the job files present in a specific folder:
1. Make sure your job files include the following mandatory fields:
   - `uid` - unique ID, string
   - `output_folder` - absolute path to the folder to save results, string
   - `workflow` - absolute path to the workflow to be run, string

   Additionally, job files may include the optional `tmp_folder` parameter
   pointing to the absolute path of a temporary folder.
2. Put your JSON/YAML job files into the directory
   set as `jobs` in the `cwl` section of the `airflow.cfg` file
   (by default `~/airflow/cwl/jobs`)
3. Run the Airflow scheduler:

       $ airflow scheduler
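Job files with the mandatory fields listed above can also be generated programmatically. A minimal sketch, where `write_job_file` is a hypothetical helper (not part of cwl-airflow) and the jobs directory is whatever `jobs` is set to in `airflow.cfg`:

```python
import json
import os
import uuid

def write_job_file(jobs_dir, workflow_path, output_folder, inputs=None):
    """Write a JSON job file containing the mandatory fields into jobs_dir."""
    job = dict(inputs or {})                            # workflow-specific inputs, if any
    job["uid"] = str(uuid.uuid4())                      # unique ID, string
    job["output_folder"] = os.path.abspath(output_folder)  # absolute path, string
    job["workflow"] = os.path.abspath(workflow_path)       # absolute path, string
    path = os.path.join(jobs_dir, job["uid"] + ".json")
    with open(path, "w") as f:
        json.dump(job, f, indent=2)
    return path
```

Dropping such a file into the jobs directory is enough for the scheduler to pick it up on its next pass.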

#### Manual mode
To perform a single run of a specific CWL workflow with a job file:

    $ cwl-airflow run WORKFLOW_FILE JOB_FILE

If the `uid`, `output_folder`, `workflow` and `tmp_folder` fields are not present
in the job file, you may set them with the following arguments:

    -o, --outdir   Output directory, default: current directory
    -t, --tmp      Folder to store temporary data, default: /tmp
    -u, --uid      Unique ID, default: random uuid
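The defaults listed for these flags can be mirrored in code; a minimal sketch (the `resolve_job_defaults` helper is illustrative, not part of cwl-airflow):

```python
import os
import uuid

def resolve_job_defaults(job):
    """Fill in missing job fields with the same defaults as the flags above."""
    resolved = dict(job)
    resolved.setdefault("output_folder", os.getcwd())  # -o default: current directory
    resolved.setdefault("tmp_folder", "/tmp")          # -t default: /tmp
    resolved.setdefault("uid", str(uuid.uuid4()))      # -u default: random uuid
    return resolved
```

Fields already present in the job file are left untouched; only missing ones receive defaults.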
#### Demo mode
1. Get the list of available demo workflows

       $ cwl-airflow demo

2. Run a demo workflow from the list (if running on macOS, consider adding the directory where you
   installed the cwl-airflow package to the _**Docker / Preferences / File sharing**_ options)

       $ cwl-airflow demo super-enhancer.cwl

3. Optionally, run `airflow webserver` to check workflow status (default [webserver link](http://localhost:8080/))

       $ airflow webserver
