databricks-bundle-decorators

Decorator-based framework for defining Databricks jobs and tasks as Python code. Define pipelines using @task, @job, and job_cluster() — they compile into Databricks Asset Bundle resources.

Why databricks-bundle-decorators?

Writing Databricks jobs in raw YAML is tedious and disconnects task logic from orchestration configuration. databricks-bundle-decorators lets you express both in Python:

  • Airflow TaskFlow-inspired pattern — define @task functions inside a @job body; dependencies are captured automatically from call arguments.
  • IoManager pattern — large data (DataFrames, datasets) flows between tasks through external storage automatically.
  • Explicit task values — small scalars (str, int, float, bool) can be passed between tasks via set_task_value / get_task_value, like Airflow XComs.
  • Pure Python — write your jobs and tasks as decorated functions, run databricks bundle deploy, and the framework generates all Databricks Job configurations for you.

Installation

uv add databricks-bundle-decorators

To use the built-in PolarsParquetIoManager, install with the matching extra (quoted so that shells such as zsh do not expand the brackets):

uv add "databricks-bundle-decorators[azure]"  # or [aws], [gcp], [polars]

Quickstart

uv init my-pipeline && cd my-pipeline
uv add "databricks-bundle-decorators[azure]"
uv run dbxdec init

This scaffolds a complete pipeline project. Define your jobs in src/<package>/pipelines/:

import polars as pl

from databricks_bundle_decorators import job, job_cluster, params, task
from databricks_bundle_decorators.io_managers import PolarsParquetIoManager

io = PolarsParquetIoManager(
    base_path="abfss://lake@account.dfs.core.windows.net/staging",
)

cluster = job_cluster(
    name="small",
    spark_version="16.4.x-scala2.12",
    node_type_id="Standard_E8ds_v4",
    num_workers=1,
)

@job(
    params={"url": "https://api.github.com/events"},
    cluster=cluster,
)
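def my_pipeline():
    # The returned DataFrame is handed to the IoManager, which moves it
    # to the downstream task through external storage.
    @task(io_manager=io)
    def extract() -> pl.DataFrame:
        import requests
        return pl.DataFrame(requests.get(params["url"]).json())

    @task
    def transform(df: pl.DataFrame):
        print(df.head(10))

    # Passing extract()'s result to transform() captures the dependency.
    data = extract()
    transform(data)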
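
The hand-off between extract and transform above can be pictured with a small stand-in. This is a sketch of the general IoManager idea only: `JsonFileIoManager` and its `write` / `read` methods are hypothetical names, it uses JSON files in a temporary directory instead of Parquet on cloud storage, and it makes no claim about the interface of the real PolarsParquetIoManager.

```python
import json
import tempfile
from pathlib import Path


class JsonFileIoManager:
    """Toy stand-in for an IoManager: one JSON file per task output."""

    def __init__(self, base_path: str) -> None:
        self.base_path = Path(base_path)

    def write(self, task_key: str, value) -> None:
        # Persist the upstream task's result under its task key.
        self.base_path.mkdir(parents=True, exist_ok=True)
        (self.base_path / f"{task_key}.json").write_text(json.dumps(value))

    def read(self, task_key: str):
        # Load a previously persisted result for the downstream task.
        return json.loads((self.base_path / f"{task_key}.json").read_text())


def extract() -> list[dict]:
    return [{"id": 1, "type": "PushEvent"}, {"id": 2, "type": "ForkEvent"}]


def transform(rows: list[dict]) -> int:
    return len(rows)


with tempfile.TemporaryDirectory() as staging:
    io = JsonFileIoManager(staging)
    # Upstream task run: compute the result, then persist it.
    io.write("extract", extract())
    # Downstream task run (conceptually a separate process): load, then compute.
    row_count = transform(io.read("extract"))

print(row_count)  # 2
```

Only the storage location is shared between the two steps, which is what lets large results move between tasks that run separately.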

Deploy:

databricks bundle deploy --target dev
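
For orientation, the job resource behind the quickstart might look roughly like the dictionary below. This is an illustrative guess, not captured framework output: the field names (`job_clusters`, `job_cluster_key`, `new_cluster`, `tasks`, `task_key`, `depends_on`, `parameters`) follow the Databricks Jobs API, but which fields the framework actually emits is an assumption, and the per-task entry point settings are left out entirely.

```python
# Illustrative only: field names follow the Databricks Jobs API, but the
# exact resource the framework emits is an assumption here.
job_resource = {
    "name": "my_pipeline",
    "parameters": [{"name": "url", "default": "https://api.github.com/events"}],
    "job_clusters": [
        {
            "job_cluster_key": "small",
            "new_cluster": {
                "spark_version": "16.4.x-scala2.12",
                "node_type_id": "Standard_E8ds_v4",
                "num_workers": 1,
            },
        }
    ],
    "tasks": [
        {"task_key": "extract", "job_cluster_key": "small"},
        {
            "task_key": "transform",
            "job_cluster_key": "small",
            "depends_on": [{"task_key": "extract"}],
        },
    ],
}

# Sanity check: every depends_on entry must name a task defined in the job.
task_keys = {t["task_key"] for t in job_resource["tasks"]}
dangling = [
    dep["task_key"]
    for t in job_resource["tasks"]
    for dep in t.get("depends_on", [])
    if dep["task_key"] not in task_keys
]
print(dangling)  # []
```

The `depends_on` entry on transform is the piece that corresponds to passing `data` into `transform(data)` in the Python definition.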

Documentation

Full documentation is available at boccileonardo.github.io/databricks-bundle-decorators.

Development

git clone https://github.com/<org>/databricks-bundle-decorators.git
cd databricks-bundle-decorators
uv sync
uv run pytest tests/ -v

Releasing

Automated (recommended)

Run the release automation action and choose a patch, minor, or major bump. The workflow bumps the version in pyproject.toml, commits, tags, builds, creates a GitHub Release, and publishes to PyPI.

Manual

uv version --bump patch  # or minor, major
git commit -am "release: v$(uv version --short)" && git push  # --short prints only the version number
# Create a GitHub Release with the new tag → publish.yaml pushes to PyPI
