
Knowit's DataOps library, which simplifies building data pipelines in Databricks for both testing and production use cases. The package enables a workflow where users write their code in notebooks and then deploy it to a Databricks workspace without stepping on each other's toes.

Project description


Brickops

DataOps framework for Databricks


Getting Started

The package can be installed with pip:

pip install brickops

Purpose

Brickops is a framework to automatically name Databricks assets, like Unity Catalog (UC) schemas, tables and jobs, according to environment (e.g. dev, staging, prod) and domain/project/flow names (where domain, project, flow are derived from the folder path in the repository).

This enables users (data engineers, etc.) to easily develop and deploy data sets, models and pipelines, and to comply automatically with organizational naming principles.

Brickops contains naming functions for UC assets and an autojob() function for auto-deploying jobs. Auto-deployment of DLT pipelines will be added in the near future.

Naming functions

Brickops works in the context of a folder path representing a data pipeline or flow:

orgs/acme/domains/transport/projects/taxinyc/flows/revenue/

The structure here is:

  • org: acme
    • domain: transport
      • project: taxinyc
        • flow: revenue
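The mapping from folder path to mesh levels can be sketched as follows. Note that `mesh_levels` is a hypothetical helper written for illustration, not the library's actual API:

```python
# Hypothetical sketch of how Brickops-style mesh levels could be derived
# from a notebook's folder path -- not the library's actual implementation.

def mesh_levels(path: str) -> dict[str, str]:
    """Map plural path segments (orgs, domains, ...) to singular level names."""
    plural_to_level = {
        "orgs": "org",
        "domains": "domain",
        "projects": "project",
        "flows": "flow",
    }
    parts = path.strip("/").split("/")
    levels = {}
    # Each level name is the segment that follows its plural marker.
    for key, value in zip(parts, parts[1:]):
        if key in plural_to_level:
            levels[plural_to_level[key]] = value
    return levels

print(mesh_levels("orgs/acme/domains/transport/projects/taxinyc/flows/revenue/"))
# {'org': 'acme', 'domain': 'transport', 'project': 'taxinyc', 'flow': 'revenue'}
```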

Catalog name from path: catname_from_path()

Example output from naming functions from notebooks under that path:

# Name functions enable automatic env+user-specific database naming
from libs.catname import catname_from_path
from libs.dbname import dbname

cat = catname_from_path()
print(f"Catalog name derived from path: {cat}")

Default output (uses the domain):

Catalog name derived from path: transport

Output with optional full mesh prefixing (org_domain_project):

Catalog name derived from path: acme_transport_taxinyc

Environment specific database name: dbname()

Database (schema name) with environment prefix:

db = dbname(db="revenue", cat=cat)
print(f"DB name: {db}")

Output in dev environment:

DB name: transport.dev_paldevibe_main_0e7768a7_revenue

Output in prod environment:

DB name: transport.revenue

Table name: tablename()

from brickops.datamesh.naming import build_table_name as tablename

revenue_by_borough_tbl = tablename(cat=cat, db="revenue", tbl="revenue_by_borough")
print(f"revenue_by_borough_tbl: {revenue_by_borough_tbl}")

Output in dev environment:

transport.dev_paldevibe_branchname_0e7768a7_revenue.revenue_by_borough

Output in prod environment:

transport.revenue.revenue_by_borough

In dev (and all environments except prod), the database name is prefixed with username, branch and commit ref. The automatic prefixes prevent notebooks running in development mode from overwriting production data.
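The prefixing behaviour can be illustrated with a small sketch. The function and its username/branch/ref parameters are placeholders for values brickops derives from the runtime context, not the library's API:

```python
# Illustrative sketch of environment-dependent schema naming.
# Placeholder parameters stand in for context brickops reads at runtime.

def qualified_db_name(cat: str, db: str, env: str,
                      username: str, gitbranch: str, gitshortref: str) -> str:
    if env == "prod":
        # Production gets the clean catalog.schema name.
        return f"{cat}.{db}"
    # Outside prod, prefix with env, user, branch and short commit ref
    # so development runs never touch production schemas.
    return f"{cat}.{env}_{username}_{gitbranch}_{gitshortref}_{db}"

print(qualified_db_name("transport", "revenue", "prod", "paldevibe", "main", "0e7768a7"))
# transport.revenue
print(qualified_db_name("transport", "revenue", "dev", "paldevibe", "main", "0e7768a7"))
# transport.dev_paldevibe_main_0e7768a7_revenue
```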

Deployment functions

Auto-deploying a spark pipeline

from brickops.dataops.deploy.autojob import autojob

response = autojob()

This call will automatically name and generate a job based on a deployment.yml file in the folder, e.g. orgs/acme/domains/transport/projects/taxinyc/flows/revenue/deployment.yml.

In development, the job name created will be:

acme_transport_taxinyc_dev_abirkhan_branchname_4c6799ab_revenue

In production, the job name created will be:

acme_transport_taxinyc_revenue

The automatic prefixes in dev prevent development jobs from overwriting production jobs.

Development setup

This project uses uv. It might be easiest to use the devcontainer, defined in .devcontainer, which is supported by VSCode and other tools.

If you want a local install, follow the installation instructions for your platform on the project homepage.

Next, make sure you are in the project root and run the following command in the terminal:

uv sync

This will create a virtual environment and install the required packages in it. The project configuration can be found in pyproject.toml.

You can now run the tests with

uv run pytest

How to get into the devcontainer from the command line

make start-devcontainer
make devcontainer-shell

Configuration options for naming and mesh levels

Naming of resources (catalogs, db/schemas, jobs, pipelines) can be configured in a file called .brickopscfg/config.yaml in the root of your repo. Example configurations can be found in tests/.brickopscfg/config.yml.

Mesh levels here refers to the granularity/depth of your organization represented in the repo structure, e.g. organization, domain and project.

An example configuration could be:

naming:
  job:
    prod: "{domain}_{project}_{env}"
    other: "{domain}_{project}_{env}_{username}_{gitbranch}_{gitshortref}"
  pipeline:
    prod: "{domain}_{project}_{env}_dlt"
    other: "{domain}_{project}_{env}_{username}_{gitbranch}_{gitshortref}_dlt"
  catalog:
    prod: "{domain}"
    other: "{domain}"
  db:
    prod: "{db}"
    other: "{env}_{username}_{gitbranch}_{gitshortref}_{db}"
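The templates above are plain format strings, so rendering them can be sketched as follows. The `naming` dict mirrors a subset of the YAML config and the `render` helper is hypothetical, written only to show how the placeholders resolve:

```python
# Sketch: rendering resource names from naming templates like the ones above.
# The `naming` dict mirrors the YAML config; context values are placeholders.

naming = {
    "job": {
        "prod": "{domain}_{project}_{env}",
        "other": "{domain}_{project}_{env}_{username}_{gitbranch}_{gitshortref}",
    },
    "db": {
        "prod": "{db}",
        "other": "{env}_{username}_{gitbranch}_{gitshortref}_{db}",
    },
}

def render(resource: str, env: str, **context: str) -> str:
    # prod uses the clean template; every other env gets the prefixed one.
    template = naming[resource]["prod" if env == "prod" else "other"]
    return template.format(env=env, **context)

print(render("job", "prod", domain="marketing", project="projectfoo"))
# marketing_projectfoo_prod
print(render("db", "dev", db="customers", username="paldevibe",
             gitbranch="branchname", gitshortref="82e5d310"))
# dev_paldevibe_branchname_82e5d310_customers
```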

Let us now see what resource names would be produced from a notebook located at something/domains/marketing/projects/projectfoo/flows/prep/foo_notebook.

For catalogs, the configuration above means the domain section of the path is used; for jobs, a combination of domain, project and env.

The resource names would become:

  • job name:

    • prod: marketing_projectfoo_prod
    • dev: marketing_projectfoo_dev_paldevibe_branchname_82e5d310
  • pipeline name:

    • prod: marketing_projectfoo_prod_dlt
    • dev: marketing_projectfoo_dev_paldevibe_branchname_82e5d310_dlt
  • catalog name:

    • prod: marketing
    • dev: marketing
  • db name for a database/schema called customers:

    • prod: customers
    • dev: dev_paldevibe_branchname_82e5d310_customers
  • With org support, in the following notebook: /Repos/test@foobar.foo/dataplatform/something/org/acme/domains/sales/projects/projectfoo/flows/testflow/foo_notebook, a config of {org}_{domain}_{project}_{env} would result in acme_sales_projectfoo_prod for a production environment.

Development tools

Ruff

How to run ruff:

make ruff

Without make:

uv run ruff check --output-format=github .

Mypy

How to run mypy:

make mypy

Without make:

mypy .

Underlying philosophy

The framework is partly based on the thoughts presented in the article Data Platform Urbanism - Sustainable Plans for your Data Work.

It can be explored in the open source workshop Databricks DataOps course.
