A plugin to run Kedro pipelines on Databricks.
kedro-databricks
Kedro plugin to develop Kedro pipelines for Databricks. This plugin strives to provide the ultimate developer experience when using Kedro on Databricks.
Key Features
- Initialization: Transform your local Kedro project into a Databricks Asset Bundle.
- Generation: Generate Asset Bundle resource definitions based on your Kedro pipelines.
- Deployment: Deploy your Kedro pipelines to Databricks as Jobs.
- Execution: Run your Kedro pipelines on Databricks straight from the command line.
- Cleanup: Remove all Databricks resources created by the plugin.
Documentation & Contributing
To learn more about the plugin, please refer to the documentation.
Interested in contributing? Check out our contribution guidelines to get started!
Breaking Changes
Version 0.14.0
To accommodate using Databricks Free Edition, we had to change the structure of overrides defined in conf/<env>/databricks.yml.
Before:
    default:
      environments:
        - environment_key: default
          spec:
            environment_version: '4'
            dependencies:
              - ../dist/*.whl
      tasks:
        - task_key: default
          environment_key: default
After:
    resources:
      jobs:
        default:
          environments:
            - environment_key: default
              spec:
                environment_version: '4'
                dependencies:
                  - ../dist/*.whl
          tasks:
            - task_key: default
              environment_key: default
This was done so that we could default to creating a volume in a newly initialized kedro-databricks project.
While this requires users to migrate their Databricks configuration, it also extends kedro-databricks beyond applying overrides to specific jobs. You can now add any type of resource to your conf/<env>/databricks.yml, and it will be generated as well.
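The migration amounts to nesting the old top-level job overrides under `resources.jobs`. The following is a minimal sketch of that transformation on an already-parsed configuration (a plain `dict`); `migrate_overrides` is a hypothetical helper, not part of kedro-databricks, and it assumes that no job in the old layout is itself named `resources`.

```python
from copy import deepcopy


def migrate_overrides(old_config: dict) -> dict:
    """Wrap pre-0.14.0 job overrides under resources.jobs.

    Hypothetical helper for illustration only. A config that already has
    a top-level "resources" key is assumed to be in the new layout and is
    returned as an unchanged copy.
    """
    if "resources" in old_config:
        return deepcopy(old_config)
    return {"resources": {"jobs": deepcopy(old_config)}}


# Old layout: job overrides sit at the top level of databricks.yml.
old = {
    "default": {
        "environments": [
            {
                "environment_key": "default",
                "spec": {
                    "environment_version": "4",
                    "dependencies": ["../dist/*.whl"],
                },
            }
        ],
        "tasks": [{"task_key": "default", "environment_key": "default"}],
    }
}

new = migrate_overrides(old)
```

Loading and writing the YAML file itself is left out here; any YAML library that round-trips dictionaries should work for that step.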
NOTE: Merges are currently only applied for jobs, so any other resource type will be generated exactly as defined in the configuration.
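To illustrate the distinction in the note, here is a rough sketch of "merge for jobs, pass-through for everything else". It is not the plugin's actual merge logic: `deep_merge` and `apply_resource_config` are hypothetical names, and the sketch assumes that dictionaries merge recursively while lists and scalars from the configuration replace the generated value outright.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: lists and scalars are replaced, not combined.
            merged[key] = deepcopy(value)
    return merged


def apply_resource_config(generated: dict, configured: dict) -> dict:
    """Merge configured jobs into generated jobs; copy other types as-is."""
    result = {}
    for resource_type, resources in configured.items():
        if resource_type != "jobs":
            # Non-job resources are emitted exactly as configured.
            result[resource_type] = deepcopy(resources)
    jobs = deepcopy(generated.get("jobs", {}))
    for name, override in configured.get("jobs", {}).items():
        jobs[name] = deep_merge(jobs.get(name, {}), override)
    result["jobs"] = jobs
    return result
```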
In addition to the changes to the structure of conf/<env>/databricks.yml, we now also tag the generated resources with their resource type and target environment, meaning that newly generated resources will be named like target.<env>.<resource-type>.<resource-name>.yml.
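As a small illustration of the naming pattern, the sketch below builds a file name from its parts. `resource_filename` is a hypothetical helper, and it assumes that `target` is a literal prefix and that the resource type is written the way it appears under `resources` (for example `jobs`); check the files the plugin actually generates before relying on either assumption.

```python
def resource_filename(env: str, resource_type: str, resource_name: str) -> str:
    """Build a file name following target.<env>.<resource-type>.<resource-name>.yml.

    Hypothetical helper; the literal "target" prefix is an assumption.
    """
    return f"target.{env}.{resource_type}.{resource_name}.yml"


example = resource_filename("dev", "jobs", "default")
```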
File details
Details for the file kedro_databricks-0.14.1.tar.gz.
File metadata
- Download URL: kedro_databricks-0.14.1.tar.gz
- Upload date:
- Size: 1.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 34e07ca9e9e1bc0fafc55d24d270d3ef375b02d2756f268889f2090f3f70ac52 |
| MD5 | b1b14db1930f6abbe5c833fc7a10d7d3 |
| BLAKE2b-256 | 9bca56022aa16874ce74dad05521f68997474bd879dee16f88fd23b6e2f88107 |
Provenance
The following attestation bundles were made for kedro_databricks-0.14.1.tar.gz:
Publisher: publish.yml on JenspederM/kedro-databricks
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kedro_databricks-0.14.1.tar.gz
- Subject digest: 34e07ca9e9e1bc0fafc55d24d270d3ef375b02d2756f268889f2090f3f70ac52
- Sigstore transparency entry: 760324197
- Sigstore integration time:
- Permalink: JenspederM/kedro-databricks@a68c44f5415b212349a60f92ffa5ee6030204264
- Branch / Tag: refs/heads/main
- Owner: https://github.com/JenspederM
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a68c44f5415b212349a60f92ffa5ee6030204264
- Trigger Event: push
File details
Details for the file kedro_databricks-0.14.1-py3-none-any.whl.
File metadata
- Download URL: kedro_databricks-0.14.1-py3-none-any.whl
- Upload date:
- Size: 34.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 61d86e1aea7b7e6d5d0f10fda0b86d5c1a506e8f64eff6a6dab2297d7b0e802f |
| MD5 | c3c9afac106f06692f21ebd9c9b47cb9 |
| BLAKE2b-256 | ba3e50af3528e211680079156686685d7fabcda87ec8059686e020b0f1a21d60 |
Provenance
The following attestation bundles were made for kedro_databricks-0.14.1-py3-none-any.whl:
Publisher: publish.yml on JenspederM/kedro-databricks
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kedro_databricks-0.14.1-py3-none-any.whl
- Subject digest: 61d86e1aea7b7e6d5d0f10fda0b86d5c1a506e8f64eff6a6dab2297d7b0e802f
- Sigstore transparency entry: 760324214
- Sigstore integration time:
- Permalink: JenspederM/kedro-databricks@a68c44f5415b212349a60f92ffa5ee6030204264
- Branch / Tag: refs/heads/main
- Owner: https://github.com/JenspederM
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a68c44f5415b212349a60f92ffa5ee6030204264
- Trigger Event: push