Jupyter Notebook operator for Apache Airflow.
Project description
airflow-notebook
implements an Apache Airflow operator NotebookOp
that supports running of notebooks and Python scripts in DAGs.
To use the operator, configure Airflow to use the Elyra-enabled container image or install this package on the host(s) where the Apache Airflow webserver, scheduler, and workers are running.
Using the Elyra-enabled airflow container image
Follow the instructions in this document.
Installing the airflow-notebook package
You can install the airflow-notebook
package from PyPI or source code.
Installing from PyPI
To install airflow-notebook
from PyPI:
pip install airflow-notebook
Installing from source code
To build airflow-notebook
from source, Python 3.6 (or later) must be installed.
git clone https://github.com/elyra-ai/airflow-notebook.git
cd airflow-notebook
make clean install
Test coverage
The operator was tested with Apache Airflow v1.10.12.
Usage
Example below on how to use the airflow operator. This particular DAG was generated with a jinja template in Elyra's pipeline editor.
from airflow import DAG
from airflow_notebook.pipeline import NotebookOp
from airflow.utils.dates import days_ago
# Setup default args with older date to automatically trigger when uploaded
args = {
'project_id': 'untitled-0105163134',
}
dag = DAG(
'untitled-0105163134',
default_args=args,
schedule_interval=None,
start_date=days_ago(1),
description='Created with Elyra 2.0.0.dev0 pipeline editor using untitled.pipeline.',
is_paused_upon_creation=False,
)
notebook_op_6055fdfb_908c_43c1_a536_637205009c79 = NotebookOp(name='notebookA',
namespace='default',
task_id='notebookA',
notebook='notebookA.ipynb',
cos_endpoint='http://endpoint.com:31671',
cos_bucket='test',
cos_directory='untitled-0105163134',
cos_dependencies_archive='notebookA-6055fdfb-908c-43c1-a536-637205009c79.tar.gz',
pipeline_outputs=[
'subdir/A.txt'],
pipeline_inputs=[],
image='tensorflow/tensorflow:2.3.0',
in_cluster=True,
env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
config_file="None",
dag=dag,
)
notebook_op_074355ce_2119_4190_8cde_892a4bc57bab = NotebookOp(name='notebookB',
namespace='default',
task_id='notebookB',
notebook='notebookB.ipynb',
cos_endpoint='http://endpoint.com:31671',
cos_bucket='test',
cos_directory='untitled-0105163134',
cos_dependencies_archive='notebookB-074355ce-2119-4190-8cde-892a4bc57bab.tar.gz',
pipeline_outputs=[
'B.txt'],
pipeline_inputs=[
'subdir/A.txt'],
image='elyra/tensorflow:1.15.2-py3',
in_cluster=True,
env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
config_file="None",
dag=dag,
)
notebook_op_074355ce_2119_4190_8cde_892a4bc57bab << notebook_op_6055fdfb_908c_43c1_a536_637205009c79
notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 = NotebookOp(name='notebookC',
namespace='default',
task_id='notebookC',
notebook='notebookC.ipynb',
cos_endpoint='http://endpoint.com:31671',
cos_bucket='test',
cos_directory='untitled-0105163134',
cos_dependencies_archive='notebookC-68120415-86c9-4dd9-8bd6-b2f33443fcc7.tar.gz',
pipeline_outputs=[
'C.txt', 'C2.txt'],
pipeline_inputs=[
'subdir/A.txt'],
image='elyra/tensorflow:1.15.2-py3',
in_cluster=True,
env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
config_file="None",
dag=dag,
)
notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 << notebook_op_6055fdfb_908c_43c1_a536_637205009c79
Generated Airflow DAG
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for airflow_notebook-0.0.7-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 216b2e2e84625b32554261a11e2f8ea5b1beeb673fb1e7b906f0cb8e87a3265c |
|
MD5 | 3f6a609c5077ff5e55794cfad19b1b00 |
|
BLAKE2b-256 | de6693715fac2a49da3927ee335b6415abc04f8748be970fee4c8969bb3649f9 |