Identify the bottleneck of your Kedro Pipeline quickly
kedro-profile
Identify the bottleneck of your Kedro Pipeline quickly with kedro-profile
Example
Running the plugin on the spaceflights example project produces output similar to this:
==========Node Summary==========
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Node Name ┃ Loading Time(s) ┃ Node Compute Time(s) ┃ Saving Time(s) ┃ Total Time(s) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ preprocess_shuttles_node │ 1.65 │ 0.01 │ 0.01 │ 1.68 │
│ create_model_input_table_node │ 0.01 │ 0.03 │ 0.02 │ 0.06 │
│ preprocess_companies_node │ 0.01 │ 0.01 │ 0.02 │ 0.03 │
└───────────────────────────────┴─────────────────┴──────────────────────┴────────────────┴───────────────┘
==========Dataset Summary==========
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Dataset Name ┃ Loading Time(s) ┃ Load Count ┃ Saving Time(s) ┃ Save Count ┃ Total Time(s) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ preprocessed_shuttles │ 0.02 │ 1.0 │ 0.01 │ 1.0 │ 0.03 │
│ preprocessed_companies │ 0.0 │ 1.0 │ 0.02 │ 1.0 │ 0.02 │
│ companies │ 0.01 │ 1.0 │ nan │ nan │ nan │
│ shuttles │ 1.65 │ 1.0 │ nan │ nan │ nan │
│ reviews │ 0.01 │ 1.0 │ nan │ nan │ nan │
│ model_input_table │ nan │ nan │ 0.02 │ 1.0 │ nan │
└────────────────────────┴─────────────────┴────────────┴────────────────┴────────────┴───────────────┘
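One way to read the node summary is to ask which phase (loading, computing, or saving) dominates the slowest node. The sketch below copies the timings from the example table above; the dictionary layout and the `dominant_phase` helper are illustrative and not part of kedro-profile.

```python
# Timings copied from the example Node Summary above (seconds).
node_timings = {
    "preprocess_shuttles_node": {"load": 1.65, "compute": 0.01, "save": 0.01},
    "create_model_input_table_node": {"load": 0.01, "compute": 0.03, "save": 0.02},
    "preprocess_companies_node": {"load": 0.01, "compute": 0.01, "save": 0.02},
}


def dominant_phase(phases):
    """Return (phase_name, share_of_total) for the slowest phase of one node."""
    total = sum(phases.values())
    name = max(phases, key=phases.get)
    return name, phases[name] / total


# Pick the node with the largest summed time, then find its dominant phase.
slowest_node = max(node_timings, key=lambda n: sum(node_timings[n].values()))
phase, share = dominant_phase(node_timings[slowest_node])
print(f"{slowest_node}: {phase} takes {share:.0%} of the node's time")
```

In this example almost all of the time in preprocess_shuttles_node is spent loading, which matches the 1.65 s loading time reported for the shuttles dataset in the dataset summary.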
Requirements
kedro>=0.18.5 # Minimal version for hook specifications
pandas>=1.0.0
Get Started
If you do not have kedro installed already, install kedro with:
pip install kedro
Then create an example project with this command:
kedro new --example=yes --tools=none --name kedro-profile-example
This creates a new directory, kedro-profile-example, in your current directory.
If you cloned the repository instead, the example project is already included.
Enable the Profiling Hook
Find the HOOKS setting in settings.py and update it as follows:
from kedro_profile.hook import ProfileHook

HOOKS: tuple[ProfileHook] = (
    ProfileHook(
        save_file=True,  # Enable CSV file saving
        node_profile_path="data/08_reporting/profiling/node_profile.csv",
        dataset_profile_path="data/08_reporting/profiling/dataset_profile.csv",
    ),
)
Configuration Options
- save_file: Boolean to enable/disable CSV file saving (default: False)
- node_profile_path: Path for node performance CSV file (default: "node_profile.csv")
- dataset_profile_path: Path for dataset performance CSV file (default: "dataset_profile.csv")
- env: Environment filter (default: "local")
Example Configurations
Save to custom directory:
HOOKS: tuple[ProfileHook] = (
    ProfileHook(
        save_file=True,
        node_profile_path="reports/node_performance.csv",
        dataset_profile_path="reports/dataset_performance.csv",
    ),
)
Disable CSV saving (console output only):
HOOKS: tuple[ProfileHook] = (
    ProfileHook(save_file=False),
)
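When saving to a custom directory, it is not stated here whether the plugin creates missing parent directories. As a precaution, the directories can be created up front in settings.py. The sketch below is a minimal helper for that; `ensure_parent_dirs` is a hypothetical name and not part of kedro-profile.

```python
from pathlib import Path


def ensure_parent_dirs(*file_paths):
    """Create the parent directory of each path if missing; return the paths as strings."""
    for file_path in file_paths:
        Path(file_path).parent.mkdir(parents=True, exist_ok=True)
    return tuple(str(p) for p in file_paths)


# Hypothetical usage inside settings.py (paths taken from the example above):
# node_path, dataset_path = ensure_parent_dirs(
#     "reports/node_performance.csv",
#     "reports/dataset_performance.csv",
# )
# HOOKS = (
#     ProfileHook(
#         save_file=True,
#         node_profile_path=node_path,
#         dataset_profile_path=dataset_path,
#     ),
# )
```

Relative paths are resolved against the working directory of the process, so this assumes Kedro is run from the project root.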
Output
The plugin generates two CSV files when save_file=True:
- Node Profile: Contains node execution times and performance metrics
- Dataset Profile: Contains dataset loading/saving times and access counts
Both files include:
- Load/Save counts
- Loading/Saving times
- Total time calculations
- Sorted by total time (descending)
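The saved CSV files can be inspected with a few lines of standard-library Python. The sketch below assumes the CSV headers match the console table headers (for example "Dataset Name" and "Loading Time(s)"); that is an assumption, so check the header row of your own files first. The `slowest_rows` helper is illustrative and not part of the plugin, and the sample data is a stand-in built from the example output above.

```python
import csv
import io


def slowest_rows(csv_file, time_column, top_n=3):
    """Return the top_n rows with the largest numeric value in time_column."""
    rows = []
    for row in csv.DictReader(csv_file):
        try:
            value = float(row[time_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable number
        if value == value:  # NaN is the only float not equal to itself
            rows.append((value, row))
    rows.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in rows[:top_n]]


# Stand-in for a dataset profile file; header names are assumed, not confirmed.
sample = io.StringIO(
    "Dataset Name,Loading Time(s),Total Time(s)\n"
    "preprocessed_shuttles,0.02,0.03\n"
    "shuttles,1.65,nan\n"
    "companies,0.01,nan\n"
)
for row in slowest_rows(sample, "Loading Time(s)", top_n=2):
    print(row["Dataset Name"], row["Loading Time(s)"])
```

To use it on a real run, open the saved file instead of the sample, for example with open("data/08_reporting/profiling/dataset_profile.csv", newline="") and pass the file object to the helper. Rows whose value is nan are skipped, so datasets that were only loaded or only saved drop out when ranking by total time.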
Environment Variables
- KEDRO_PROFILE_DISABLE=1: Disable profiling
- KEDRO_PROFILE_RICH=0: Disable rich console output
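These variables can also be set from Python when launching a run programmatically. The sketch below builds an environment mapping with the two variables listed above; the `profiling_env` helper is a hypothetical name, and the commented-out `kedro run` invocation assumes the command is run from the project root.

```python
import os


def profiling_env(disable=False, rich=True, base=None):
    """Return a copy of the environment with the kedro-profile variables set.

    Only adds variables; values already present in the base environment
    are left untouched when the corresponding flag is not requested.
    """
    env = dict(os.environ if base is None else base)
    if disable:
        env["KEDRO_PROFILE_DISABLE"] = "1"
    if not rich:
        env["KEDRO_PROFILE_RICH"] = "0"
    return env


# Hypothetical usage: run the pipeline once without profiling.
# import subprocess
# subprocess.run(["kedro", "run"], env=profiling_env(disable=True), check=True)
```

Passing a copy of the environment to the child process keeps the current Python process unaffected, which is useful when profiled and unprofiled runs are compared in the same script.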