An Airflow plugin to launch and monitor Spark applications on the Data Mechanics platform
# Data Mechanics Airflow integration
## Spin up Airflow
If you haven’t got `just` on your machine already, install it with `brew install just`.
Then run Airflow with `just serve`.
> The script will ask you to install docker-compose if you haven’t got it on your machine already.
> You can find it [here](https://docs.docker.com/install/).
> The first run will be long because Docker images are downloaded.
Shut down Airflow with Ctrl+C.
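
Because the first run can take a while, it may help to wait for the webserver before opening the browser. The snippet below is a minimal sketch, not part of this project: `wait_for_airflow` is a hypothetical helper, and it assumes the Airflow webserver exposes a `/health` endpoint on port 8080 (if your Airflow version does not, point it at the root URL instead).

```python
import time
import urllib.error
import urllib.request


def wait_for_airflow(url="http://localhost:8080/health", timeout_s=600.0, interval_s=5.0):
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses.

    Returns True if the webserver answered, False on timeout.
    The default URL is an assumption about where Airflow is served.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Webserver not reachable yet (or returned an error status); retry.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_airflow()` in a second terminal after `just serve` blocks until the UI is reachable or ten minutes have passed.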
## Before the demo: one-time operations
Open Airflow at [http://localhost:8080](http://localhost:8080).
These are some of the gotchas you might run into when you’re not used to Airflow.
- Activate the DAGs you want to run by toggling their state from Off to On (on the left in the DAGs page).
- Create a Data Mechanics connection: click Admin in the navbar, then Connections in the dropdown menu, then open the Create tab. Set the connection name to `datamechanics_default`, the host to `https://demo.datamechanics.co/`, and the password to our usual API key for the demo cluster. Leave the rest blank.
> As long as you don’t delete the anonymous Docker Compose volume on which the Airflow database is persisted, you shouldn’t have to repeat the operations above, even if you restart Airflow.
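
If you would rather script the connection than click through the UI, something like the following might work. It is a sketch under assumptions: `connection_fields` and `create_connection` are hypothetical helpers, the code has to run where Airflow is installed and configured (for example inside the webserver container), and newer Airflow versions may also require a `conn_type`, which the steps above leave blank.

```python
DATAMECHANICS_CONN_ID = "datamechanics_default"
DATAMECHANICS_HOST = "https://demo.datamechanics.co/"


def connection_fields(api_key):
    """Return the fields set in the UI steps; everything else stays unset."""
    return {
        "conn_id": DATAMECHANICS_CONN_ID,
        "host": DATAMECHANICS_HOST,
        "password": api_key,
    }


def create_connection(api_key):
    """Add the connection to the Airflow metadata database if it is missing.

    Returns True if a connection was created, False if one already existed.
    """
    # Imported lazily so this module can be loaded outside an Airflow environment.
    from airflow import settings
    from airflow.models import Connection

    session = settings.Session()
    try:
        existing = (
            session.query(Connection)
            .filter(Connection.conn_id == DATAMECHANICS_CONN_ID)
            .first()
        )
        if existing is not None:
            return False
        session.add(Connection(**connection_fields(api_key)))
        session.commit()
        return True
    finally:
        session.close()
```

Pass the API key in from the environment or a prompt rather than hard-coding it in the script.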
## Do the demo
- Open Airflow at [http://localhost:8080](http://localhost:8080).
- Open the DAG `full-example`.
- Switch to Graph view.
- Trigger the DAG (Trigger DAG).
- Explain that the first two tasks run in parallel (you can show the dashboard at this point).
- The last two tasks are meant to fail. Click on the failed execution of the `failed-app` task, click on View logs, and show that the URL to the dashboard is provided in the logs.
## Turn this demo into a library
The code that should be turned into an Airflow plugin library is in the `plugins/` folder.
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
Date format is YYYY-MM-DD
## [1.0.0] 2020-09-11
### Changed
- Converted the existing plugin into a Python package.
[unreleased]: https://github.com/datamechanics/datamechanics_airflow_plugin/compare/v1.0.0...master
[1.0.0]: https://github.com/datamechanics/datamechanics_airflow_plugin/compare/...v1.0.0