Kedro plugin with Azure ML Pipelines support

Kedro AzureML Pipeline

Note: This project is a fork of kedro-azureml, originally created by Marcin Zablocki at GetInData. It has been forked to continue active development and add new features.

What is Kedro AzureML Pipeline?

Kedro AzureML Pipeline is a plugin that enables running Kedro pipelines on Azure ML Pipelines. It translates your Kedro pipeline into an Azure ML pipeline job where each Kedro node becomes a separate step.
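To make the node-to-step idea concrete, here is a minimal conceptual sketch in plain Python. It is not the plugin's implementation: the Node and Step classes and the to_steps function are hypothetical stand-ins, and it assumes that a step depends on whichever steps produce its input datasets.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a Kedro node: a name plus dataset names."""
    name: str
    inputs: List[str]
    outputs: List[str]


@dataclass
class Step:
    """Hypothetical stand-in for one step of a pipeline job."""
    name: str
    depends_on: List[str] = field(default_factory=list)


def to_steps(nodes: List[Node]) -> List[Step]:
    """Map each node to one step (assumed rule: dependencies follow datasets)."""
    # Record which node produces each dataset.
    producer: Dict[str, str] = {
        out: n.name for n in nodes for out in n.outputs
    }
    steps = []
    for n in nodes:
        # A step depends on every step that produces one of its inputs.
        deps = sorted({producer[i] for i in n.inputs if i in producer})
        steps.append(Step(name=n.name, depends_on=deps))
    return steps


# Illustrative three-node pipeline; the node and dataset names are made up.
nodes = [
    Node("preprocess", ["raw"], ["clean"]),
    Node("train", ["clean"], ["model"]),
    Node("evaluate", ["model", "clean"], ["metrics"]),
]
steps = to_steps(nodes)
for s in steps:
    print(s.name, "<-", s.depends_on)
```

Each node yields exactly one step, so a pipeline with three nodes produces three steps whose ordering is implied by the datasets they share.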

Two deployment workflows are supported, both backed by Azure ML Environments:

  • Code upload: only dependencies live in the Docker image; source code is uploaded at runtime (fast iteration for data scientists)
  • Docker image: code is baked into the image (stable, repeatable workflows for MLOps)

Key features

  • Pipeline translation: automatic Kedro node → Azure ML step mapping via the compile, run, and schedule CLI commands
  • Named jobs: define multiple jobs in azureml.yml, each targeting a different pipeline, compute, or workspace
  • Scheduling: attach cron or recurrence schedules to jobs for recurring Azure ML pipeline runs
  • Data assets: AzureMLAssetDataset for reading/writing Azure ML uri_file and uri_folder data assets
  • Distributed training: @distributed_job decorator with PyTorch, TensorFlow, and MPI backends
  • MLflow integration: optional hook that wires Kedro-MLFlow to log under the correct Azure ML experiment
  • Multiple workspaces: named workspace definitions with a __default__ fallback
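The __default__ fallback for named workspaces can be pictured with a small lookup. This is only a sketch of the idea, not the plugin's code: resolve_workspace is a hypothetical helper, the "dev" workspace and all values are invented, and falling back to __default__ for an unknown name is an assumption about the behavior.

```python
from typing import Dict, Optional

DEFAULT_KEY = "__default__"


def resolve_workspace(
    workspaces: Dict[str, dict], name: Optional[str] = None
) -> dict:
    """Return the named workspace, or the __default__ entry as a fallback.

    Assumption: an unknown or omitted name falls back to __default__.
    """
    if name is not None and name in workspaces:
        return workspaces[name]
    return workspaces[DEFAULT_KEY]


# Illustrative definitions; every value here is a placeholder.
workspaces = {
    "__default__": {"name": "shared-ws", "resource_group": "shared-rg"},
    "dev": {"name": "dev-ws", "resource_group": "dev-rg"},
}

print(resolve_workspace(workspaces, "dev")["name"])
print(resolve_workspace(workspaces)["name"])
```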

Installation

pip install kedro-azureml-pipeline

or with uv:

uv add kedro-azureml-pipeline

Quick start

1. Initialize configuration

kedro azureml init

This creates conf/base/azureml.yml with placeholder values and an .amlignore file.

2. Review the generated configuration

Open conf/base/azureml.yml and fill in your Azure details:

workspace:
  __default__:
    subscription_id: "<subscription_id>"
    resource_group: "<resource_group>"
    name: "<workspace_name>"

compute:
  __default__:
    cluster_name: "<cluster_name>"

execution:
  environment: "<environment>"
  code_directory: "."
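Before submitting, it can help to confirm that no "<...>" placeholder is left in the file. The following is a minimal sketch using only the standard library; unfilled_placeholders is a hypothetical helper (not part of the plugin), and the sample text with its "my-rg" value is made up for illustration.

```python
import re

# Matches quoted placeholders such as "<subscription_id>".
PLACEHOLDER = re.compile(r'"<([a-z_]+)>"')


def unfilled_placeholders(config_text: str) -> list:
    """Return the names of placeholders that still appear in the text."""
    return PLACEHOLDER.findall(config_text)


# Sample config text; in practice you might read conf/base/azureml.yml.
sample = '''workspace:
  __default__:
    subscription_id: "<subscription_id>"
    resource_group: "my-rg"
    name: "<workspace_name>"
'''

missing = unfilled_placeholders(sample)
print(missing)  # ['subscription_id', 'workspace_name']
```

An empty list means every placeholder in the scanned text has been replaced.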

3. Define a job and submit

Add a job to azureml.yml:

jobs:
  training:
    pipeline:
      pipeline_name: "__default__"
    experiment_name: "my-experiment"

Then submit it:

kedro azureml submit -j training

Use --dry-run to preview without submitting, or --wait-for-completion to block until the run finishes.
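If you drive submissions from a script, the command line can be assembled programmatically. The sketch below uses only the options mentioned above (-j, --dry-run, --wait-for-completion); build_submit_command is a hypothetical helper, not something the plugin provides.

```python
import shlex
from typing import List


def build_submit_command(
    job: str, dry_run: bool = False, wait: bool = False
) -> List[str]:
    """Build the argument list for `kedro azureml submit`."""
    cmd = ["kedro", "azureml", "submit", "-j", job]
    if dry_run:
        cmd.append("--dry-run")  # preview without submitting
    if wait:
        cmd.append("--wait-for-completion")  # block until the run finishes
    return cmd


cmd = build_submit_command("training", dry_run=True)
print(shlex.join(cmd))  # kedro azureml submit -j training --dry-run
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the project root, assuming the plugin is installed in the active environment.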

4. Compile to YAML (optional)

Export the Azure ML pipeline definition for inspection or CI:

kedro azureml compile -j training -o pipeline.yaml

Documentation

Full documentation is available at https://kedro-azureml-pipeline.readthedocs.io/.

Contributing

We welcome contributions, feedback, and questions.

License

This project is licensed under the terms of the Apache-2.0 License.

Acknowledgements

This project is a fork of kedro-azureml, originally developed by GetInData. We are grateful for their work in creating the initial plugin that bridges Kedro and Azure ML Pipelines. We have continued development to add new features, improve documentation, and maintain the project under the kedro-azureml-pipeline package name.

We would also like to thank Evolta Technologies for their support of the project.

This project is maintained by stateful-y, an ML consultancy specializing in MLOps and data science & engineering. If you're interested in collaborating or learning more about our services, please visit our website.
