databricks-bundle-decorators
Decorator-based framework for defining Databricks jobs and tasks as Python code. Define pipelines using `@task`, `@job`, and `job_cluster()` — they compile into Databricks Asset Bundle resources.
Why databricks-bundle-decorators?
Writing Databricks jobs in raw YAML is tedious and disconnects task logic from orchestration configuration. databricks-bundle-decorators lets you express both in Python:
- Airflow TaskFlow-inspired pattern — define `@task` functions inside a `@job` body; dependencies are captured automatically from call arguments.
- IoManager pattern — large data (DataFrames, datasets) flows between tasks through external storage automatically.
- Explicit task values — small scalars (`str`, `int`, `float`, `bool`) can be passed between tasks via `set_task_value`/`get_task_value`, like Airflow XComs.
- Pure Python — write your jobs and tasks as decorated functions, run `databricks bundle deploy`, and the framework generates all Databricks Job configurations for you.
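The task-values mechanism can be pictured as a shared key-value store for small scalars, analogous to Airflow XComs. The following is a toy sketch of that idea in plain Python — illustrative only, not the library's implementation or its exact API:

```python
# Toy sketch of the task-values idea (not the library's internals):
# small scalars travel between tasks through a shared key-value store,
# while large data such as DataFrames goes through an IoManager instead.

_store = {}

def set_task_value(key, value):
    # Mirror the str/int/float/bool restriction described above.
    if not isinstance(value, (str, int, float, bool)):
        raise TypeError("task values are limited to small scalars")
    _store[key] = value

def get_task_value(key):
    return _store[key]

# An upstream task publishes a scalar...
set_task_value("row_count", 1234)
# ...and a downstream task reads it.
print(get_task_value("row_count"))  # prints 1234
```

In the real framework the store is backed by Databricks job infrastructure rather than an in-process dict, which is why only small scalars are allowed.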
Installation
```shell
uv add databricks-bundle-decorators
```
With cloud-specific extras for the built-in `PolarsParquetIoManager`:

```shell
uv add databricks-bundle-decorators[azure]  # or [aws], [gcp], [polars]
```
Quickstart
```shell
uv init my-pipeline && cd my-pipeline
uv add databricks-bundle-decorators[azure]
uv run dbxdec init
```
This scaffolds a complete pipeline project. Define your jobs in `src/<package>/pipelines/`:
```python
import polars as pl

from databricks_bundle_decorators import job, job_cluster, params, task
from databricks_bundle_decorators.io_managers import PolarsParquetIoManager

io = PolarsParquetIoManager(
    base_path="abfss://lake@account.dfs.core.windows.net/staging",
)

cluster = job_cluster(
    name="small",
    spark_version="16.4.x-scala2.12",
    node_type_id="Standard_E8ds_v4",
    num_workers=1,
)

@job(
    params={"url": "https://api.github.com/events"},
    cluster=cluster,
)
def my_pipeline():
    @task(io_manager=io)
    def extract() -> pl.DataFrame:
        import requests

        return pl.DataFrame(requests.get(params["url"]).json())

    @task
    def transform(df: pl.DataFrame):
        print(df.head(10))

    data = extract()
    transform(data)
```
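The two plain calls at the end of the job body are where dependencies get recorded: `extract()` returns a handle, and passing it to `transform()` creates the edge. A toy sketch of that capture idea (illustrative only, not how the library is implemented):

```python
# Toy sketch (not the library's internals): a decorator can record task
# dependencies by returning graph nodes instead of running the function,
# collecting any nodes it receives as call arguments.

class TaskNode:
    """One task invocation in the job graph."""
    def __init__(self, name, upstream):
        self.name = name
        self.upstream = upstream  # nodes this task depends on

def task(fn):
    def wrapper(*args, **kwargs):
        # Any TaskNode passed in becomes an upstream dependency.
        deps = [a for a in (*args, *kwargs.values()) if isinstance(a, TaskNode)]
        return TaskNode(fn.__name__, deps)
    return wrapper

@task
def extract():
    ...

@task
def transform(df):
    ...

data = extract()          # node "extract" with no upstream tasks
result = transform(data)  # node "transform" depending on "extract"
print([d.name for d in result.upstream])  # prints ['extract']
```

Because the graph is captured this way, no extra dependency declarations are needed — the data flow in the job body is the DAG.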
Deploy:
```shell
databricks bundle deploy --target dev
```
Documentation
Full documentation is available at boccileonardo.github.io/databricks-bundle-decorators:
- Getting Started — scaffolding, first pipeline, deploy
- How It Works — task dependencies, IoManager, task values
- Docker Deployment — pre-built container images
- API Reference — `@task`, `@job`, `IoManager`, and more
Development
```shell
git clone https://github.com/<org>/databricks-bundle-decorators.git
cd databricks-bundle-decorators
uv sync
uv run pytest tests/ -v
```
Release
See RELEASING.md for the PyPI release process.