# Airlift

A CLI for creating a flexible Apache Airflow local development environment.

## Introduction
Airlift is a Command Line Interface (CLI) tool designed to provide a local development environment for Apache Airflow with a simple but flexible interface. It is built on top of the official Airflow Helm Chart.
## Requirements

Airlift requires the following software to be installed on your system:

- Helm
- Docker
- Kind

Below are the installation instructions for each of these tools on macOS and Linux distributions.

Note: It is also recommended to allocate at least 4GB of RAM to Docker to run this service.
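The `check` subcommand described under Usage verifies these prerequisites for you. A minimal sketch of an equivalent check in Python, using only the standard library (the function name is illustrative, not part of Airlift):

```python
import shutil

def check_prerequisites(tools=("helm", "docker", "kind")):
    """Return a dict mapping each required tool to whether it is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}

if __name__ == "__main__":
    for tool, found in check_prerequisites().items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
```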
### Option 1: Nix Installation

`airlift` is packaged and available via the nix package manager.

#### Install Nix

Install Nix via the recommended multi-user installation:

```sh
sh <(curl -L https://nixos.org/nix/install) --daemon
```

#### Install airlift

```sh
nix-shell -p airlift
```

This will install `airlift` in a disposable shell. Once you exit the shell, it will no longer be available for use.

Note: `airlift` is only available on the `23.11` or `unstable` nix channels.

You can also run a single `airlift` command using this shell:

```sh
nix-shell -p airlift --command "airlift -h"
```

To install `airlift` permanently in your environment, you can use `nix-env` instead.
### Option 2: Homebrew Installation

Homebrew is a package manager that we will use to install the necessary software. If you don't have Homebrew installed, you can install it by following these instructions:

```sh
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

#### Install Software

With Homebrew installed, you can now install Helm, Docker, and Kind.

Helm:

```sh
brew install helm
```

Docker:

```sh
brew install --cask docker
```

or use a tool such as Docker Desktop or Rancher Desktop.

Kind:

```sh
brew install kind
```

Note: This software was tested and validated working with kind v0.17. There are known issues with kind v0.20 and Rancher. If you are experiencing issues, please downgrade your kind installation by installing from the source/release binaries.
### Option 3: pip Installation

Airlift can be installed from PyPI using pip:

```sh
pip install airlift
```
## Usage

The general syntax for using the Airlift CLI tool is:

```sh
airlift [subcommand] [options]
```

### Subcommands and Options

#### 1. `start`

Starts the Airflow service.

```sh
airlift start -d /path/to/dags -p /path/to/plugins -r /path/to/requirements.txt
```
Note: The DAG and plugins folders are mounted directly into the Airflow service for hot-reloading: when you make a change locally, it should automatically appear in the Airflow UI.

Note: Airflow can take upwards of 5 minutes to start, due to the bootstrapping, the installation of required PyPI packages, and the creation of the Postgres database. This all depends on your local machine's power and the complexity of your Airflow setup.
#### 2. `check`

Checks if all prerequisite software is installed.

```sh
airlift check
```

#### 3. `pause`

Pauses the Airflow service.

```sh
airlift pause
```

#### 4. `unpause`

Unpauses the Airflow service.

```sh
airlift unpause
```

#### 5. `remove`

Removes all containers/clusters related to the `airlift` service.

```sh
airlift remove
```

#### 6. `status`

Checks the status of the service and whether or not it is reachable.

```sh
airlift status -P 8080
```
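A reachability check of this kind amounts to probing the webserver. A minimal sketch in Python, assuming Airflow's standard `/health` endpoint is exposed on the given port (the function name is illustrative, not Airlift's implementation):

```python
import urllib.error
import urllib.request

def airflow_is_healthy(port=8080, host="localhost", timeout=5):
    """Return True if the Airflow webserver's /health endpoint answers HTTP 200."""
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```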
#### 7. `import_variables`

Imports a `variables.json` file to a running Airflow instance.

```sh
airlift import_variables -P 8080 -V /path/to/variables.json
```
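A `variables.json` file in Airflow's standard export format is a flat JSON object of key/value pairs. For example (the variable names and values here are purely illustrative):

```json
{
  "environment": "local",
  "s3_bucket": "my-dev-bucket",
  "retry_count": "3"
}
```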
#### 8. `run_dag`

Runs a DAG given an ID.

```sh
airlift run_dag -P 8080 -D example_dag_id
```
## Configuration files

### Helm `values.yaml`

This file provides the configuration for the Airflow Helm chart. It can be used for things such as:

- Setting the secrets backend to AWS Secrets Manager
- Adding custom environment variables (such as connections)
- Changing the executor
- Modifying the memory allocation for the webserver/scheduler/workers
- Updating any `airflow.cfg` value

Here's an example:
```yaml
executor: "CeleryExecutor"
config:
  core:
    load_examples: 'False'
    executor: CeleryExecutor
    colored_console_log: 'False'

# Airflow scheduler settings
scheduler:
  # hostAliases for the scheduler pod
  hostAliases: []
  #  - ip: "127.0.0.1"
  #    hostnames:
  #      - "foo.local"
  #  - ip: "10.1.2.3"
  #    hostnames:
  #      - "foo.remote"

  # If the scheduler stops heartbeating for 5 minutes (5*60s) kill the
  # scheduler and let Kubernetes restart it
  livenessProbe:
    initialDelaySeconds: 10
    timeoutSeconds: 20
    failureThreshold: 5
    periodSeconds: 60
    command: ~
  # Airflow 2.0 allows users to run multiple schedulers,
  # however this feature is only recommended for MySQL 8+ and Postgres
  replicas: 1
  # Max number of old replicasets to retain
  revisionHistoryLimit: ~
  # Command to use when running the Airflow scheduler (templated).
  command: ~
  # Args to use when running the Airflow scheduler (templated).
  args: ["bash", "-c", "exec airflow scheduler"]
```
You can find all the possible configuration overrides here: https://artifacthub.io/packages/helm/apache-airflow/airflow?modal=values
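As another example, one of the overrides listed above, adding a connection via an environment variable, can be expressed with the chart's `env` key. Airflow picks up any variable named `AIRFLOW_CONN_<CONN_ID>`; the connection name and URI below are illustrative:

```yaml
env:
  - name: AIRFLOW_CONN_MY_POSTGRES
    value: "postgresql://user:password@host:5432/mydb"
```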
Note: By default, we disable the `livenessProbe` checks for the scheduler & triggerer due to conflicts with Kind. See `./src/airlift/config/helm/values.yaml` for the exact config values.
### Airlift Configuration

The Airlift configuration file overrides all flag values to simplify starting the service.

For example, `$HOME/.config/airlift/config.yaml`:

```yaml
# config.yaml
dag_path: /path/to/dags
plugin_path: /path/to/plugins
requirements_file: /path/to/requirements.txt
helm_values_file: /path/to/values.yaml
extra_volume_mounts:
  - hostPath=/my/cool/path,containerPath=/my/mounted/path,name=a_unique_name
cluster_config_file: /path/to/cluster/config.yaml
image: 'apache/airflow:2.6.0'
helm_chart_version: '1.0.0'
port: 8080
post_start_dag_id: 'example_dag_id'
```
In this example, `dag_path` in the YAML file overrides the `-d` flag, `plugin_path` overrides the `-p` flag, and so forth.
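Each `extra_volume_mounts` entry packs several fields into one comma-separated string of `key=value` pairs. A small sketch of how such an entry could be parsed, assuming exactly the format shown above (the function name is illustrative, not Airlift's internal API):

```python
def parse_volume_mount(entry):
    """Split 'hostPath=...,containerPath=...,name=...' into a dict."""
    pairs = (field.split("=", 1) for field in entry.split(","))
    return {key: value for key, value in pairs}

mount = parse_volume_mount(
    "hostPath=/my/cool/path,containerPath=/my/mounted/path,name=a_unique_name"
)
```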
Using this configuration, you can now start the service using:

```sh
airlift start -c $HOME/.config/airlift/config.yaml
```
## Examples

See here for examples with common configuration modifications.

## FAQ

See here for Frequently Asked Questions.
## Motivation

The motivation behind Airlift is to simplify the process of setting up a local development environment for Apache Airflow. It aims to be a flexible tool that allows developers to easily configure and manage their local Airflow instances.
## Support and Contribution

If you encounter any issues or have suggestions for improvements, feel free to open an issue on the GitHub repository. Contributions to the project are also welcome.

## Contact

If you have questions or feedback about Airlift, please reach out by opening an issue on the GitHub repository.
## File details

Details for the file `airlift-0.4.0.tar.gz`.

File metadata:

- Download URL: airlift-0.4.0.tar.gz
- Upload date:
- Size: 20.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.2 CPython/3.9.16 Linux/6.5.0-1018-azure

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | 25c5b615797e4ab76f791986e5b0f5b6d7fa1772f0bc66501786424e90e93daf |
| MD5 | c15c4297049ec323755b543d364acf5c |
| BLAKE2b-256 | e76b02b1d3c3559ada8ae54f056f73afbcaffc3c2c273be38d8469827662d4ca |
Details for the file `airlift-0.4.0-py3-none-any.whl`.

File metadata:

- Download URL: airlift-0.4.0-py3-none-any.whl
- Upload date:
- Size: 26.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.2 CPython/3.9.16 Linux/6.5.0-1018-azure

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | d57000f993a13c80480f0fcdabe5e544bd28c17c8321d92266f85d82034811eb |
| MD5 | 088edd1ff32476829864f277f33acb21 |
| BLAKE2b-256 | d5af044fc902561133b11ba8b539fad1450472e7becb24f42238caca40f81e98 |
BLAKE2b-256 | d5af044fc902561133b11ba8b539fad1450472e7becb24f42238caca40f81e98 |