Jupyter Notebook operator for Apache Airflow.
Project description
airflow-notebook
implements an Apache Airflow operator NotebookOp
that supports running of notebooks and Python scripts in DAGs.
To use the operator, install this package on the host(s) where the Apache Airflow webserver, scheduler, and workers are running.
Installing the airflow-notebook package
You can install the airflow-notebook
package from PyPI or source code.
Installing from PyPI
To install airflow-notebook
from PyPI:
pip install airflow-notebook
Installing from source code
To build airflow-notebook
from source, Python 3.6 (or later) must be installed.
git clone https://github.com/elyra-ai/airflow-notebook.git
cd airflow-notebook
make clean install
Test coverage
The operator was tested with Apache Airflow v1.10.12.
Usage
Example below on how to use the airflow operator. This particular DAG was generated with a jinja template in Elyra's pipeline editor.
from airflow import DAG
from airflow_notebook.pipeline import NotebookOp
from airflow.utils.dates import days_ago
# Setup default args with older date to automatically trigger when uploaded
args = {
'project_id': 'untitled-0105163134',
}
dag = DAG(
'untitled-0105163134',
default_args=args,
schedule_interval=None,
start_date=days_ago(1),
description='Created with Elyra 2.0.0.dev0 pipeline editor using untitled.pipeline.',
is_paused_upon_creation=False,
)
notebook_op_6055fdfb_908c_43c1_a536_637205009c79 = NotebookOp(name='notebookA',
namespace='default',
task_id='notebookA',
notebook='notebookA.ipynb',
cos_endpoint='http://endpoint.com:31671',
cos_bucket='test',
cos_directory='untitled-0105163134',
cos_dependencies_archive='notebookA-6055fdfb-908c-43c1-a536-637205009c79.tar.gz',
pipeline_outputs=[
'subdir/A.txt'],
pipeline_inputs=[],
image='tensorflow/tensorflow:2.3.0',
in_cluster=True,
env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
config_file="None",
dag=dag,
)
notebook_op_074355ce_2119_4190_8cde_892a4bc57bab = NotebookOp(name='notebookB',
namespace='default',
task_id='notebookB',
notebook='notebookB.ipynb',
cos_endpoint='http://endpoint.com:31671',
cos_bucket='test',
cos_directory='untitled-0105163134',
cos_dependencies_archive='notebookB-074355ce-2119-4190-8cde-892a4bc57bab.tar.gz',
pipeline_outputs=[
'B.txt'],
pipeline_inputs=[
'subdir/A.txt'],
image='elyra/tensorflow:1.15.2-py3',
in_cluster=True,
env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
config_file="None",
dag=dag,
)
notebook_op_074355ce_2119_4190_8cde_892a4bc57bab << notebook_op_6055fdfb_908c_43c1_a536_637205009c79
notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 = NotebookOp(name='notebookC',
namespace='default',
task_id='notebookC',
notebook='notebookC.ipynb',
cos_endpoint='http://endpoint.com:31671',
cos_bucket='test',
cos_directory='untitled-0105163134',
cos_dependencies_archive='notebookC-68120415-86c9-4dd9-8bd6-b2f33443fcc7.tar.gz',
pipeline_outputs=[
'C.txt', 'C2.txt'],
pipeline_inputs=[
'subdir/A.txt'],
image='elyra/tensorflow:1.15.2-py3',
in_cluster=True,
env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
config_file="None",
dag=dag,
)
notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 << notebook_op_6055fdfb_908c_43c1_a536_637205009c79
Generated Airflow DAG
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for airflow_notebook-0.0.6-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | d7e31b086c7c5bb84a7bbcc88270450ef2d039435c39649d729e30734d4b305b |
|
MD5 | 3d278871c07863ce5ae8663057fecb85 |
|
BLAKE2b-256 | db8884bd29863dec98bb088b1fab16b93da3a153173e426c613752400bb72a0f |