An Airflow plugin to launch and monitor Spark applications on the Data Mechanics platform
Data Mechanics Airflow integration
Spin up Airflow
If you don't already have just on your machine, install it with
brew install just
Then run Airflow with
just serve
The script will ask you to install docker-compose if it isn't already on your machine. You can find it here.
The first run takes a while because the Docker images have to be downloaded.
Shut down Airflow with Ctrl+C.
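Because the first start can take several minutes, it can help to wait until the web server answers before opening the browser. The following is a minimal sketch, not part of this project: the function name wait_for_airflow and its parameters are made up for illustration, and it only assumes that Airflow is served at http://localhost:8080 as described in this README.

```python
import time
import urllib.request


def wait_for_airflow(url="http://localhost:8080", timeout_s=600, interval_s=5,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll `url` until it answers or `timeout_s` seconds have passed.

    Returns True as soon as a request succeeds, False on timeout.
    `opener` and `sleep` are injectable so the loop can be exercised
    without a running server.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # Any successful response means the web server is up.
            with opener(url, timeout=10):
                return True
        except OSError:
            # Covers connection refused and urllib.error.URLError
            # (a subclass of OSError) while the containers are starting.
            pass
        sleep(interval_s)
    return False
```

Calling wait_for_airflow() in a second terminal after just serve blocks until the page is reachable, or gives up after ten minutes with the default settings.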
Before the demo: one-time operations
Open Airflow at http://localhost:8080.
These are some of the gotchas you might run into when you're not used to Airflow.
- Activate the DAGs you want to run by toggling their state from Off to On (on the left in the DAGs page).
- Create a Data Mechanics connection. To do this, click on Admin in the navbar, then Connections in the dropdown menu, then go to the Create tab. The connection should have datamechanics_default as connection name, https://demo.datamechanics.co/ as host, and our usual API key for the demo cluster as password. Leave the rest blank.
As long as you don't delete the anonymous Docker Compose volume on which the Airflow database is persisted, you shouldn't have to repeat the operations above, even if you restart Airflow.
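The connection described above can also be created from Python instead of the UI. This is a hedged sketch rather than something shipped with the project: the helper names connection_fields and create_connection are hypothetical, and it assumes the code runs where Airflow is installed and configured (for example inside the Airflow container), using Airflow's own Connection model and settings.Session.

```python
# Values taken from the instructions in this README.
CONN_ID = "datamechanics_default"
HOST = "https://demo.datamechanics.co/"


def connection_fields(api_key):
    """Return the fields to set; everything not listed stays blank."""
    if not api_key:
        raise ValueError("an API key is required")
    return {"conn_id": CONN_ID, "host": HOST, "password": api_key}


def create_connection(api_key):
    """Store the connection in the Airflow metadata database.

    The Airflow imports are local so that this module can be imported
    on a machine where Airflow is not installed.
    """
    from airflow import settings
    from airflow.models import Connection

    session = settings.Session()
    try:
        session.add(Connection(**connection_fields(api_key)))
        session.commit()
    finally:
        session.close()
```

A call such as create_connection(os.environ["DATAMECHANICS_API_KEY"]) would keep the key out of the source code; that environment variable name is an assumption, not something the project defines. The sketch does not check whether a connection with the same name already exists.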
Do the demo
- Open Airflow at http://localhost:8080.
- Open the DAG full-example.
- Switch to Graph view.
- Trigger the DAG (Trigger DAG).
- Explain that the first two tasks run in parallel (you can show the dashboard at this point).
- The last two tasks are meant to fail. Click on the failed execution of the failed-app task, click on View logs, and show that the URL to the dashboard is provided in the logs.
Turn this demo into a library
The code that should be turned into an Airflow plugin library is in the plugins/ folder.
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
Date format is YYYY-MM-DD
1.0.0 2020-09-11
Changed
- Converted the existing plugin into a Python package
File details
Details for the file datamechanics_airflow_plugin-1.0.5.tar.gz.
File metadata
- Download URL: datamechanics_airflow_plugin-1.0.5.tar.gz
- Upload date:
- Size: 6.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | f89cae3f8348196f793818dec9b96c9dd954050da629ed248570af72b43d0182
MD5 | 47eb6cff261d87855dbd5dec19ebf2bb
BLAKE2b-256 | 158126ba7d661a24f23cec754454f75bb9f27176dd730fe7233954702a5ee506
File details
Details for the file datamechanics_airflow_plugin-1.0.5-py2.py3-none-any.whl.
File metadata
- Download URL: datamechanics_airflow_plugin-1.0.5-py2.py3-none-any.whl
- Upload date:
- Size: 8.1 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5f29825f30b3ab567fe091287a3481ee0f941368befea76b3422a9e8d3fcd520
MD5 | 8773a7ed865359e502884d5a0f069845
BLAKE2b-256 | 9cd3e4e7ecc8382a1ae866538bd2aabe396bb4b9083249af17fcc7548d430a09
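A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only Python's standard hashlib module; the function names are illustrative, and the example path in the usage note assumes the source distribution was saved to the current directory.

```python
import hashlib

# SHA256 digest of the source distribution, copied from the table above.
SDIST_SHA256 = "f89cae3f8348196f793818dec9b96c9dd954050da629ed248570af72b43d0182"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_sha256("datamechanics_airflow_plugin-1.0.5.tar.gz", SDIST_SHA256) should return True for an intact download.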