A tool for orchestrating and executing Jupyter notebooks, enabling seamless parameter passing between notebooks.
Project description
notebook-orchestration-and-execution-manager
Orchestrate Jupyter notebooks by passing parameters dynamically between them. This solution enables seamless execution, where the output of one notebook becomes the input for the next. Includes automated execution, parameter injection, logging, and output management for streamlined workflows.
Notebook Execution and Variable Extraction
This project provides a Python class and workflow to manage the execution of Jupyter notebooks with parameters, extract variables and their values from executed notebooks, and display the results in a structured format.
Features
- Execute Jupyter Notebooks: Run Jupyter notebooks with specified parameters using papermill.
- Dynamic Parameter Passing: Pass custom parameters to notebooks during execution.
- Variable Extraction: Extract variable data (name, operation, and value) from executed notebook cells.
- Logging: Track execution steps with detailed logs.
- Directory Management: Automatically manage output directories for processed notebooks.
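The first two features rest on papermill's `execute_notebook` call. A minimal sketch of that call follows; the wrapper name `run_with_papermill` is hypothetical and is not part of this package, which may wrap papermill differently.

```python
def run_with_papermill(input_path, output_path, parameters):
    """Execute a notebook with papermill and return the executed notebook object."""
    # Imported lazily so this module can be loaded even where papermill is absent.
    import papermill as pm

    return pm.execute_notebook(
        input_path,             # notebook to run
        output_path,            # where the executed copy is written
        parameters=parameters,  # values injected into the notebook
    )
```

For example, `run_with_papermill("./sample_notebooks/2_Subtract.ipynb", "./processed_notebook/subtract_executed.ipynb", {"x": 10, "y": 3})` would run the notebook with `x` and `y` injected, provided papermill and a Jupyter kernel are installed.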
Requirements
- Python 3.6+
- Libraries: os, papermill, logging, ast, IPython
Install the package via pip:
pip install notebook-orchestration-and-execution-manager
Usage
1. Initialize the NotebookOrchestrationExecutionManager
Create an instance of NotebookOrchestrationExecutionManager, specifying the directory for processed notebooks.
from notebook_orchestration_execution_manager import NotebookOrchestrationExecutionManager
processor = NotebookOrchestrationExecutionManager(processed_directory="./processed_notebook")
1.1 Parameters Definition
How to Configure a Cell in JupyterLab to Receive Parameters
- Select the cell you want to configure to receive parameters.
- Click on the gear icon located on the right panel in JupyterLab. This will open the cell metadata editor.
- Add a new tag:
  - In the metadata editor, locate or create a field named tags.
  - Add a new tag called parameters.
- Save the changes and ensure the parameters tag has been added correctly.
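The steps above end up as a `tags` entry in the cell's metadata inside the `.ipynb` JSON. The sketch below shows the same result applied programmatically to a notebook dictionary in the nbformat 4 layout; the helper name `tag_cell_as_parameters` and the sample notebook are illustrative assumptions, not part of this package.

```python
import json


def tag_cell_as_parameters(notebook, cell_index=0):
    """Add the 'parameters' tag to one cell of a notebook dict (nbformat 4 layout)."""
    cell = notebook["cells"][cell_index]
    # Create the metadata and tags containers if the cell does not have them yet.
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if "parameters" not in tags:
        tags.append("parameters")
    return notebook


# A tiny in-memory notebook with a single code cell holding default values.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "source": "x = 1\ny = 2",
            "outputs": [],
            "execution_count": None,
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

tag_cell_as_parameters(notebook)
print(json.dumps(notebook["cells"][0]["metadata"]))  # {"tags": ["parameters"]}
```

Calling the helper a second time leaves the tag list unchanged, so it is safe to apply to notebooks that are already tagged.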
The recommended practice is to define parameters in the first cell of the notebook. This ensures a clear structure, makes them easy to locate, and provides a centralized configuration that can be used throughout the notebook's execution.
Parameters can be defined in a Markdown, Raw, or Code cell, or even without explicitly defining a cell for this purpose. Parameter injection will automatically take place above the first code cell in the notebook. This provides greater flexibility when working with parameterization tools like Papermill or automating notebook execution in configurable environments.
In Markdown Cell
In Code Cell
No Definition Cell
1.2 Parameters Injection
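As a rough illustration of what injection does to a notebook, the sketch below inserts a new code cell that assigns each parameter. It places the cell directly after the cell tagged parameters (as papermill does, tagging the new cell injected-parameters) and, following the description above, falls back to the position above the first code cell when no cell is tagged. The function `inject_parameters` is a hypothetical helper written for this example; it is not the code this package or papermill actually runs.

```python
def inject_parameters(notebook, parameters):
    """Insert a code cell that assigns each parameter; return the insert index."""
    source = "\n".join(f"{name} = {value!r}" for name, value in parameters.items())
    new_cell = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": source,
        "outputs": [],
        "execution_count": None,
    }
    cells = notebook["cells"]
    # Preferred position: right after the cell tagged "parameters".
    for index, cell in enumerate(cells):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            cells.insert(index + 1, new_cell)
            return index + 1
    # Fallback: above the first code cell.
    for index, cell in enumerate(cells):
        if cell.get("cell_type") == "code":
            cells.insert(index, new_cell)
            return index
    # No code cell at all: append at the end.
    cells.append(new_cell)
    return len(cells) - 1


tagged = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Subtract"},
        {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "x = 1\ny = 1"},
        {"cell_type": "code", "metadata": {}, "source": "result = x - y"},
    ]
}
untagged = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# No parameters"},
        {"cell_type": "code", "metadata": {}, "source": "result = x * y"},
    ]
}

tagged_position = inject_parameters(tagged, {"x": 10, "y": 3})
untagged_position = inject_parameters(untagged, {"x": 2, "y": 4})
print(tagged_position, untagged_position)  # 2 1
```

Because the injected cell runs after the defaults in the tagged cell, the injected values override them for the rest of the notebook.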
2. Define Notebooks and Parameters
Provide a list of notebooks with input paths, output paths, and parameter dictionaries.
original_notebooks_path = './sample_notebooks'
processed_notebook_file_path = './processed_notebook'
# Each entry: (input notebook path, output notebook path, parameters to inject)
notebooks_with_parameters = [
    (f"{original_notebooks_path}/1_Add.ipynb", f"{processed_notebook_file_path}/add_executed.ipynb", {"params": [10, 5, 7]}),
    (f"{original_notebooks_path}/2_Subtract.ipynb", f"{processed_notebook_file_path}/subtract_executed.ipynb", {"x": 10, "y": 3}),
    (f"{original_notebooks_path}/3_Divide.ipynb", f"{processed_notebook_file_path}/divide_executed.ipynb", {"x": 20, "y": 0}),
    (f"{original_notebooks_path}/4_No_parameters.ipynb", f"{processed_notebook_file_path}/no_parameters_executed.ipynb", {"inject_values": {"x": [2, 3], "y": [4, 5]}}),
    (f"{original_notebooks_path}/5_Multiply.ipynb", f"{processed_notebook_file_path}/multiply_executed.ipynb", {"inject_values": {"x": [2, 3], "y": [4, 5]}}),
]
3. Execute Notebooks
Run each notebook with parameters and save the results.
notebook_execution_results = []
for input_path, output_path, params in notebooks_with_parameters:
    notebook_results = processor.run_notebook_with_parameters(input_path, output_path, params)
    notebook_execution_results.append(notebook_results)
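One of the sample entries divides by zero, and step 4 skips falsy results, which suggests that a failed run yields an empty result. How the class handles failures internally is not stated here, so the sketch below only illustrates the general pattern: a loop that records None for a failed job and carries on. Both `run_all` and the stand-in `fake_run` are hypothetical names for this example.

```python
import logging

logger = logging.getLogger("notebook_runner_sketch")


def run_all(jobs, run):
    """Run every (input_path, output_path, params) job; failed jobs yield None."""
    results = []
    for input_path, output_path, params in jobs:
        try:
            results.append(run(input_path, output_path, params))
        except Exception:
            # Log the traceback and keep going with the remaining notebooks.
            logger.exception("Execution failed for %s", input_path)
            results.append(None)
    return results


def fake_run(input_path, output_path, params):
    """Stand-in for a real notebook runner (assumption for this example)."""
    if params.get("y") == 0:
        raise ZeroDivisionError("division by zero")
    return {"output": output_path}


jobs = [
    ("2_Subtract.ipynb", "subtract_executed.ipynb", {"x": 10, "y": 3}),
    ("3_Divide.ipynb", "divide_executed.ipynb", {"x": 20, "y": 0}),
]
results = run_all(jobs, fake_run)
print(results)  # [{'output': 'subtract_executed.ipynb'}, None]
```

Keeping one result slot per job, even for failures, preserves the alignment between the job list and the results list.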
4. Extract Variables from Notebooks
Extract variable data and display it in a structured format.
variable_list = []
for notebook_result in notebook_execution_results:
    if notebook_result:
        extracted_data = processor.extract_variable_data_from_notebook_cells(notebook_result)
        variable_list.append(processor.display_notebook_variables_and_values_extracted_from_notebook(extracted_data))
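To give a feel for how name, operation, and value can be read from cell source with the `ast` module, here is a minimal sketch. It handles only simple top-level assignments and resolves values only for literals; `extract_assignments` is a hypothetical helper, and the package's own extraction may work differently (for instance, by reading cell outputs). `ast.get_source_segment` requires Python 3.8 or later.

```python
import ast


def extract_assignments(source):
    """Return {name, operation, value} dicts for simple top-level assignments."""
    extracted = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        # The right-hand side exactly as written in the cell.
        operation = ast.get_source_segment(source, node.value)
        try:
            # Only literal expressions can be evaluated safely without running code.
            value = ast.literal_eval(node.value)
        except (ValueError, TypeError):
            value = None
        for target in node.targets:
            if isinstance(target, ast.Name):
                extracted.append(
                    {"name": target.id, "operation": operation, "value": value}
                )
    return extracted


cell_source = "x = 10\ny = 3\nresult = x - y"
for row in extract_assignments(cell_source):
    print(row)
# {'name': 'x', 'operation': '10', 'value': 10}
# {'name': 'y', 'operation': '3', 'value': 3}
# {'name': 'result', 'operation': 'x - y', 'value': None}
```

The value of `result` stays None here because static parsing does not execute the expression; obtaining computed values requires the executed notebook.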
5. Retrieve the Variable Values from Every Notebook
variable_list
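The project description mentions that the output of one notebook can become the input for the next. The sketch below illustrates that idea in isolation: each step's outputs are merged into the parameters passed to the following step. The names `run_chain` and `fake_run`, and the step names, are assumptions made for this example; with the real class, the outputs would come from the extracted variables.

```python
def run_chain(steps, run, initial_params=None):
    """Run steps in order, feeding each step's outputs into the next step's parameters."""
    params = dict(initial_params or {})
    history = []
    for name in steps:
        # Pass a copy so a step cannot mutate the shared parameter dict.
        outputs = run(name, dict(params))
        history.append((name, outputs))
        params.update(outputs)
    return history


def fake_run(name, params):
    """Stand-in for executing a notebook and collecting its variables (assumption)."""
    if name == "add":
        return {"total": sum(params["params"])}
    if name == "double":
        return {"doubled": params["total"] * 2}
    raise ValueError(f"unknown step: {name}")


history = run_chain(["add", "double"], fake_run, {"params": [10, 5, 7]})
print(history)  # [('add', {'total': 22}), ('double', {'doubled': 44})]
```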
Code Breakdown
1. NotebookOrchestrationExecutionManager Class
Handles the execution of notebooks, directory creation, and variable extraction.
Methods
- create_directory_if_not_exists(directory: str): Ensures the specified directory exists.
- run_notebook_with_parameters(notebook_input_path: str, notebook_output_path: str, params: dict): Executes a Jupyter notebook with parameters.
- extract_variable_data_from_notebook_cells(notebook_data: dict): Extracts variable data from notebook cells.
- display_notebook_variables_and_values_extracted_from_notebook(extracted_variables_data_from_notebook: dict): Displays extracted variable data in logs.
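Directory creation of the kind described for the first method is commonly done with `os.makedirs` and `exist_ok=True`. The standalone sketch below shows that pattern; `ensure_directory` is a hypothetical function and not necessarily how the class implements it.

```python
import os
import tempfile


def ensure_directory(directory):
    """Create the directory (and any missing parents) if it does not exist yet."""
    os.makedirs(directory, exist_ok=True)
    return directory


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "processed_notebook")
    ensure_directory(target)
    ensure_directory(target)  # second call is a no-op rather than an error
    created = os.path.isdir(target)

print(created)  # True
```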
Example Workflow
Input Notebook
- File: 1_Add.ipynb
- Parameters: {"params": [10, 5, 7]}
Output
- File: ./processed_notebook/add_executed.ipynb
- Logs: Execution details and extracted variables.
Logging
Logs include:
- Notebook execution status.
- Variable extraction details.
- Metadata from executed notebooks.
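The exact log format used by the package is not shown here. As a rough sketch of how such messages can be produced with the standard `logging` module, the example below configures a logger and writes two entries to an in-memory buffer; the logger name, format string, and messages are all assumptions for illustration.

```python
import io
import logging


def build_logger(name="notebook_orchestration_sketch", stream=None):
    """Return a logger that writes 'LEVEL | message' lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate lines if called more than once
    handler = logging.StreamHandler(stream)  # defaults to stderr when stream is None
    handler.setFormatter(logging.Formatter("%(levelname)s | %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = build_logger(stream=buffer)
log.info("Executed %s", "1_Add.ipynb")
log.info("Extracted %d variables", 3)
print(buffer.getvalue(), end="")
# INFO | Executed 1_Add.ipynb
# INFO | Extracted 3 variables
```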
License
This project is licensed under the MIT License.