
Kedro plugin with Azure ML Pipelines support


Kedro AzureML Pipeline


Note: This project is a fork of kedro-azureml, originally created by Marcin Zablocki at GetInData. It has been forked to continue active development and add new features.

What is Kedro AzureML Pipeline?

Kedro AzureML Pipeline is a plugin that enables running Kedro pipelines on Azure ML Pipelines. It translates your Kedro pipeline into an Azure ML pipeline job where each Kedro node becomes a separate step.
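As a conceptual illustration of the node-to-step mapping (not the plugin's actual implementation), the sketch below models a pipeline as a list of nodes and derives one step per node, inferring step dependencies from shared dataset names. The `Node` class and the `to_steps` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Kedro node: a name plus dataset names."""

    name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def to_steps(nodes: list[Node]) -> dict[str, list[str]]:
    """Map each node to one step and list the steps it depends on."""
    # Record which node produces each dataset.
    producer = {out: n.name for n in nodes for out in n.outputs}
    steps = {}
    for n in nodes:
        # A step depends on every step that produces one of its inputs.
        steps[n.name] = sorted({producer[i] for i in n.inputs if i in producer})
    return steps


nodes = [
    Node("preprocess", ("raw",), ("clean",)),
    Node("train", ("clean",), ("model",)),
]
steps = to_steps(nodes)
```

Here `train` ends up depending on `preprocess` because it consumes the `clean` dataset that `preprocess` produces, which is the kind of ordering a step-per-node translation has to preserve.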

Two deployment workflows are supported, both backed by Azure ML Environments:

  • Code upload: only dependencies live in the Docker image; source code is uploaded at runtime (fast iteration for data scientists)
  • Docker image: code is baked into the image (stable, repeatable workflows for MLOps)

Key features

| Feature | Description |
| --- | --- |
| Pipeline translation | Automatic Kedro node → Azure ML step mapping via the compile, run, and schedule CLI commands |
| Named jobs | Define multiple jobs in azureml.yml, each targeting a different pipeline, compute, or workspace |
| Scheduling | Attach cron or recurrence schedules to jobs for recurring Azure ML pipeline runs |
| Data assets | AzureMLAssetDataset for reading/writing Azure ML uri_file and uri_folder data assets |
| Distributed training | @distributed_job decorator with PyTorch, TensorFlow, and MPI backends |
| MLflow integration | Optional hook that wires Kedro-MLFlow to log under the correct Azure ML experiment |
| Multiple workspaces | Named workspace definitions with a __default__ fallback |

Installation

pip install kedro-azureml-pipeline

or with uv:

uv add kedro-azureml-pipeline

Quick start

1. Initialize configuration

kedro azureml init

This creates conf/base/azureml.yml with placeholder values and an .amlignore file.

2. Review the generated configuration

Open conf/base/azureml.yml and fill in your Azure details:

workspace:
  __default__:
    subscription_id: "<subscription_id>"
    resource_group: "<resource_group>"
    name: "<workspace_name>"

compute:
  __default__:
    cluster_name: "<cluster_name>"

execution:
  environment: "<environment>"
  code_directory: "."
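One way to read the `__default__` keys: a named entry is used when it exists, and the `__default__` entry applies otherwise. The sketch below illustrates that lookup on plain dictionaries. The `resolve_entry` function, the `gpu` entry, and the cluster names are hypothetical, and the plugin's real resolution logic may differ.

```python
from typing import Optional

DEFAULT_KEY = "__default__"


def resolve_entry(section: dict, name: Optional[str] = None) -> dict:
    """Return the named entry, falling back to the __default__ entry."""
    if name is not None and name in section:
        return section[name]
    return section[DEFAULT_KEY]


# Mirrors the shape of the `compute` section, with one extra named entry.
compute = {
    "__default__": {"cluster_name": "cpu-cluster"},
    "gpu": {"cluster_name": "gpu-cluster"},  # hypothetical named entry
}

default_cluster = resolve_entry(compute)["cluster_name"]
gpu_cluster = resolve_entry(compute, "gpu")["cluster_name"]
missing_cluster = resolve_entry(compute, "unknown")["cluster_name"]
```

Falling back silently for an unknown name is a design choice made for this sketch; raising an error instead would surface typos in job definitions earlier.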

3. Define a job and submit

Add a job to azureml.yml:

jobs:
  training:
    pipeline:
      pipeline_name: "__default__"
    experiment_name: "my-experiment"

Then submit it:

kedro azureml submit -j training

Use --dry-run to preview without submitting, or --wait-for-completion to block until the run finishes.
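For scripted use, for example in CI, the command line can be assembled programmatically. The helper below is a hypothetical sketch that only builds the argument list from the flags named above; it does not invoke the CLI itself.

```python
def build_submit_command(
    job: str, dry_run: bool = False, wait: bool = False
) -> list[str]:
    """Build the argument list for `kedro azureml submit` (hypothetical helper)."""
    cmd = ["kedro", "azureml", "submit", "-j", job]
    if dry_run:
        # Preview the submission without sending it.
        cmd.append("--dry-run")
    if wait:
        # Block until the run finishes.
        cmd.append("--wait-for-completion")
    return cmd


preview = build_submit_command("training", dry_run=True)
blocking = build_submit_command("training", wait=True)
```

The resulting list can be passed to `subprocess.run(preview, check=True)` from the Kedro project root, assuming the plugin is installed in the active environment.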

4. Compile to YAML (optional)

Export the Azure ML pipeline definition for inspection or CI:

kedro azureml compile -j training -o pipeline.yaml

Documentation

Full documentation is available at https://kedro-azureml-pipeline.readthedocs.io/.

Contributing

We welcome contributions, feedback, and questions.

License

This project is licensed under the terms of the Apache-2.0 License.

Acknowledgements

This project is a fork of kedro-azureml, originally developed by GetInData. We are grateful for their work in creating the initial plugin that bridges Kedro and Azure ML Pipelines. We have continued development to add new features, improve documentation, and maintain the project under the kedro-azureml-pipeline package name.

We would also like to thank Evolta Technologies for their support of the project.


This project is maintained by stateful-y, an ML consultancy specializing in MLOps and data science & engineering. If you're interested in collaborating or learning more about our services, please visit our website.

