Neptune.ai integration with Kedro

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Kedro-Neptune plugin

Main docs page for Kedro-Neptune plugin

See this example in Neptune Kedro pipeline metadata in custom dashboard in the Neptune UI

What will you get with this integration?

Kedro is a popular open-source project that helps standardize ML workflows. It gives you a clean and powerful pipeline abstraction where you put all your ML code logic.

Kedro-Neptune plugin lets you have all the benefits of a nicely organized kedro pipeline with a powerful user interface built for ML metadata management that lets you:

browse, filter, and sort your model training runs
compare nodes and pipelines on metrics, visual node outputs, and more
display all pipeline metadata including learning curves for metrics, plots, and images, rich media like video and audio or interactive visualizations from Plotly, Altair, or Bokeh
and do whatever else you would expect from a modern ML metadata store

Installation

Before you start, make sure that:

You have Python 3.6+ in your system,
You are already a registered user so that you can log metadata to your private projects.
You have your Neptune API token set to the NEPTUNE_API_TOKENenvironment variable.

Install neptune-client, kedro, and kedro-neptune

Depending on your operating system open a terminal or CMD and run this command. All required libraries are available via pip and conda:

pip install neptune-client kedro kedro-neptune

For more, see installing neptune-client.

Quickstart

See code examples on GitHub

See runs logged to Neptune

This quickstart will show you how to:

Connect Neptune to your Kedro project
Log pipeline and dataset metadata to Neptune
Add explicit metadata logging to a node in your pipeline
Explore logged metadata in the Neptune UI.

Before you start

Step 1: Create a Kedro project from "pandas-iris" starter

Go to your console and create a Kedro starter project "pandas-iris"

kedro new --starter=pandas-iris

Follow instructions and choose a name for your Kedro project. For example, "Great-Kedro-Project"
Go to your new Kedro project directory

If everything was set up correctly you should see the following directory structure:

Great-Kedro-Project # Parent directory of the template
├── conf            # Project configuration files
├── data            # Local project data (not committed to version control)
├── docs            # Project documentation
├── logs            # Project output logs (not committed to version control)
├── notebooks       # Project related Jupyter notebooks (can be used for experimental code before moving the code to src)
├── README.md       # Project README
├── setup.cfg       # Configuration options for `pytest` when doing `kedro test` and for the `isort` utility when doing `kedro lint`
├── src             # Project source code
    ├── pipelines   
        ├── data_science
            ├── nodes.py
            ├── pipelines.py
            └── ...

You will use nodes.py and pipelines.py files in this quickstart.

Step 2: Initialize kedro-neptune plugin

Go to your Kedro project directory and run

kedro neptune init

The command line will ask for your Neptune API token

Input your Neptune API token:
- Press enter if it was set to the NEPTUNE_API_TOKEN environment variable
- Pass a different environment variable to which you set your Neptune API token. For example MY_SPECIAL_NEPTUNE_TOKEN_VARIABLE
- Pass your Neptune API token as a string

The command line will ask for your Neptune project name

Input your Neptune project name:
- Press enter if it was set to the NEPTUNE_PROJECT environment variable
- Pass a different environment variable to which you set your Neptune project name. For example MY_SPECIAL_NEPTUNE_PROJECT_VARIABLE
- Pass your project name as a string in a format WORKSPACE/PROJECT

If everything was set up correctly you should:

see the message: "kedro-neptune plugin successfully configured"
see three new files in your kedro project:
- Credentials file:YOUR_KEDRO_PROJECT/conf/local/credentials_neptune.yml
- Config file:YOUR_KEDRO_PROJECT/conf/base/neptune.yml
- Catalog file:YOUR_KEDRO_PROJECT/conf/base/neptune_catalog.yml

You can always go to those files and change the initial configuration.

Step 3: Add Neptune logging to a Kedro node

Go to a pipeline node src/KEDRO_PROJECT/pipelines/data_science/nodes.py
Import Neptune client toward the top of the nodes.py

import neptune.new as neptune

Add neptune_run argument of type neptune.run.Handler to the report_accuracy function

def report_accuracy(predictions: np.ndarray, test_y: pd.DataFrame, 
                    neptune_run: neptune.run.Handler) -> None:
...

You can treat neptune_run like a normal Neptune Run and log any ML metadata to it.

Important
You have to use a special string "neptune_run" to use the Neptune Run handler in Kedro pipelines.

Log metrics like accuracy to neptune_run

def report_accuracy(predictions: np.ndarray, test_y: pd.DataFrame, 
                    neptune_run: neptune.run.Handler) -> None:
    target = np.argmax(test_y.to_numpy(), axis=1)
    accuracy = np.sum(predictions == target) / target.shape[0]
    
    neptune_run['nodes/report/accuracy'] = accuracy * 100

You can log metadata from any node to any Neptune namespace you want.

Log images like a confusion matrix to neptune_run

def report_accuracy(predictions: np.ndarray, test_y: pd.DataFrame, 
                    neptune_run: neptune.run.Handler) -> None:
    target = np.argmax(test_y.to_numpy(), axis=1)
    accuracy = np.sum(predictions == target) / target.shape[0]
    
    fig, ax = plt.subplots()
    plot_confusion_matrix(target, predictions, ax=ax)
    neptune_run['nodes/report/confusion_matrix'].upload(fig)

Note
You can log metrics, text, images, video, interactive visualizations, and more.
See a full list of What you can log and display in Neptune.

Step 4: Add Neptune Run handler to the Kedro pipeline

Go to a pipeline definition, src/KEDRO_PROJECT/pipelines/data_science/pipelines.py
Add neptune_run Run handler as an input to the report node

node(
    report_accuracy,
    ["example_predictions", "example_test_y", "neptune_run"],
    None,
    name="report"),

Step 5: Run Kedro pipeline

Go to your console and execute your Kedro pipeline

kedro run

A link to the Neptune Run associated with the Kedro pipeline execution will be printed to the console.

Step 6: Explore results in the Neptune UI

Click on the Neptune Run link in your console or use an example link

https://app.neptune.ai/common/kedro-integration/e/KED-632

Go to the kedro namespace where metadata about Kedro pipelines are logged (see how to change the default logging location)

Default Kedro namespace in Neptune UI

See pipeline and node parameters in kedro/catalog/parameters

Pipeline parameters logged from Kedro to Neptune UI

See execution parameters in kedro/run_params

Execution parameters logged from Kedro to Neptune UI

See metadata about the datasets in kedro/catalog/datasets/example_iris_data

Dataset metadata logged from Kedro to Neptune UI

See the metrics (accuracy) you logged explicitly in the kedro/nodes/report/accuracy

Metrics logged from Kedro to Neptune UI

See charts (confusion matrix) you logged explicitly in the kedro/nodes/report/confusion_matrix

Confusion matrix logged from Kedro to Neptune UI

Project details

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.4.0

Feb 16, 2024

0.3.0

Nov 10, 2023

0.2.0

Jun 12, 2023

0.1.6

Mar 31, 2023

0.1.5

Mar 10, 2023

0.1.4

Nov 7, 2022

0.1.3

Sep 12, 2022

0.1.2

Aug 25, 2022

0.1.1

May 25, 2022

0.1.0

Apr 11, 2022

0.0.8

Mar 16, 2022

0.0.7

Dec 10, 2021

This version

0.0.6

Nov 8, 2021

0.0.5

Sep 8, 2021

0.0.4

Aug 5, 2021

0.0.3

Aug 5, 2021

0.0.1

Aug 4, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kedro-neptune-0.0.6.tar.gz (38.5 kB view hashes)

Uploaded Nov 8, 2021 Source

Hashes for kedro-neptune-0.0.6.tar.gz

Hashes for kedro-neptune-0.0.6.tar.gz
Algorithm	Hash digest
SHA256	`a3f886e49469c5e489791f93faed96cb0064f24f04228dcea0849198584b68a2`
MD5	`efe57347f3818474b215092303ab0058`
BLAKE2b-256	`7629a2485efcae62ed71f2b2526f8158ed38404eae7793835df2cc1f616227b5`