A plugin to run Kedro pipelines on Databricks.

kedro-databricks

Kedro plugin to develop Kedro pipelines for Databricks. This plugin strives to provide the ultimate developer experience when using Kedro on Databricks.

Key Features

  1. Initialization: Transform your local Kedro project into a Databricks Asset Bundle.
  2. Generation: Generate Asset Bundle resource definitions from your Kedro pipelines.
  3. Deployment: Deploy your Kedro pipelines to Databricks as Jobs.
  4. Execution: Run your Kedro pipelines on Databricks straight from the command line.
  5. Cleanup: Remove all Databricks resources created by the plugin.

Documentation & Contributing

To learn more about the plugin, please refer to the documentation.

Interested in contributing? Check out our contribution guidelines to get started!

Breaking Changes

Version 0.14.0

To accommodate using Databricks Free Edition, we had to change the structure of overrides defined in conf/<env>/databricks.yml.

Before:

default:
    environments:
        - environment_key: default
          spec:
              environment_version: '4'
              dependencies:
                  - ../dist/*.whl
    tasks:
        - task_key: default
          environment_key: default

After:

resources:
    jobs:
        default:
            environments:
                - environment_key: default
                  spec:
                      environment_version: '4'
                      dependencies:
                          - ../dist/*.whl
            tasks:
                - task_key: default
                  environment_key: default

This was done so that we could default to creating a volume in a newly initialized kedro-databricks project.

While this requires users to migrate their Databricks configuration, it also extends kedro-databricks beyond applying overrides to specific jobs. You can now define any type of resource in conf/<env>/databricks.yml, and it will be generated as well.
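The migration described above amounts to nesting the old top-level job overrides under resources.jobs. A minimal sketch of that transformation on plain dictionaries follows. The helper name migrate_overrides is hypothetical and not part of kedro-databricks, and it assumes every top-level key in the old file is a job name.

```python
from copy import deepcopy
from typing import Any


def migrate_overrides(old: dict[str, Any]) -> dict[str, Any]:
    """Wrap pre-0.14.0 job overrides under resources.jobs (hypothetical helper)."""
    # Assumption: a top-level "resources" key means the file is already migrated.
    if "resources" in old:
        return deepcopy(old)
    # Assumption: every top-level key in the old layout is a job name, e.g. "default".
    return {"resources": {"jobs": deepcopy(old)}}


# Old-style overrides, as they would look after loading the YAML into a dict.
old_config = {
    "default": {
        "tasks": [{"task_key": "default", "environment_key": "default"}],
    }
}

new_config = migrate_overrides(old_config)
print(new_config)
```

Running the helper on an already migrated dictionary returns an unchanged copy, so it is safe to apply more than once.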

NOTE: Merges are currently applied only to jobs, so any other resource type is generated exactly as defined in the configuration.
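To illustrate the distinction in the note, here is a simplified sketch in which job definitions are merged with the configuration while other resource types are copied through unchanged. It is not the plugin's actual merge logic: the function names, the recursive dictionary merge, the replacement of lists as a whole, and the sample resource names are all assumptions made for illustration.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: scalars and lists are replaced as a whole.
            merged[key] = deepcopy(value)
    return merged


def apply_config(generated: dict[str, Any], config: dict[str, Any]) -> dict[str, Any]:
    """Merge job overrides into generated jobs; pass other resources through."""
    result = deepcopy(generated)
    for resource_type, resources in config.get("resources", {}).items():
        target = result.setdefault(resource_type, {})
        for name, definition in resources.items():
            if resource_type == "jobs" and name in target:
                target[name] = deep_merge(target[name], definition)
            else:
                # Non-job resources are taken exactly as configured.
                target[name] = deepcopy(definition)
    return result


# Hypothetical generated job and configuration, for illustration only.
generated = {"jobs": {"my_pipeline": {"name": "my_pipeline", "max_concurrent_runs": 1}}}
config = {
    "resources": {
        "jobs": {"my_pipeline": {"max_concurrent_runs": 2}},
        "volumes": {"raw": {"name": "raw"}},
    }
}
result = apply_config(generated, config)
print(result)
```

In this sketch the job keeps its generated name and picks up the overridden setting, while the volume appears in the result exactly as written in the configuration.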

In addition to the changes to the structure of conf/<env>/databricks.yml, we now also tag the generated resources with their resource type and target environment, meaning that newly generated resources will be named like target.<env>.<resource-type>.<resource-name>.yml.
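As a small illustration of the naming pattern, the sketch below builds a file name from its three parts. The function resource_filename is hypothetical, and the example values, including the exact string used for the resource type, are assumptions rather than values confirmed by the plugin.

```python
def resource_filename(env: str, resource_type: str, resource_name: str) -> str:
    """Build a file name following target.<env>.<resource-type>.<resource-name>.yml."""
    return f"target.{env}.{resource_type}.{resource_name}.yml"


# Hypothetical values: the plugin may spell the resource type differently.
filename = resource_filename("dev", "jobs", "default")
print(filename)
```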
