ACC-02: Databricks pipeline project scaffolding and build CLI

Project description

ACC CLI: Databricks Project Generator

A command-line tool to scaffold production-ready Databricks pipeline projects from templates. Built with Typer and Cookiecutter.


Features

  • 🚀 Scaffold a Databricks ETL pipeline project in seconds
  • 📁 Generates a fully structured Python package with src/ layout
  • ☁️ Supports multiple storage backends: UC, DBFS, S3, ADLS
  • 🔧 Includes Makefile, pyproject.toml, configs, Databricks job JSON, and tests out of the box
  • 📦 Installable via pip; works as a global CLI tool

Project Structure

python-deployment-package/
├── .gitignore
├── README.md
├── pyproject.toml           # Package metadata & entry point
├── requirements.txt         # Runtime dependencies
├── dist/                    # Built distributions (wheel + sdist)
├── docs/
│   ├── cli-spec.md
│   ├── project-structure.md
│   └── tech-decisions.md
├── src/
│   └── acc_cli/             # Main CLI package
│       ├── __init__.py
│       ├── cli.py           # Typer app & entry point
│       ├── config.py        # Templates & storage config
│       ├── commands/
│       │   ├── __init__.py
│       │   └── init.py      # `acc init` command
│       ├── utils/
│       │   ├── __init__.py
│       │   └── utils.py     # Input collection, cookiecutter wrapper
│       └── templates/
│           └── etl/         # ETL pipeline cookiecutter template
│               ├── cookiecutter.json
│               └── {{cookiecutter.project_slug}}/
│                   ├── Makefile
│                   ├── pyproject.toml
│                   ├── README.md
│                   ├── requirements.txt
│                   ├── requirements-dev.txt
│                   ├── .gitignore
│                   ├── configs/
│                   │   ├── dev.yaml
│                   │   └── prod.yaml
│                   ├── jobs/
│                   │   └── databricks_job.json
│                   ├── src/{{cookiecutter.project_slug}}/
│                   │   ├── main.py
│                   │   ├── pipelines/
│                   │   ├── tasks/
│                   │   └── utils/
│                   └── tests/
└── tests/
    ├── __init__.py
    └── test_cli.py

Installation

pip install acc-cli

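Note (macOS): If acc is not found after install, add Python's bin directory to your PATH. The path below assumes the python.org installer and Python 3.12; adjust the version to match your installation: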

echo 'export PATH="/Library/Frameworks/Python.framework/Versions/3.12/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
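
If the path above does not match your installation, one way to locate the directory where pip places console scripts is to ask the interpreter itself. This is a minimal sketch that assumes you run it with the same interpreter that ran pip install; user-level installs (pip install --user) may use a different scheme, so treat the result as a starting point rather than a guarantee:

```python
import sysconfig

# Directory where this interpreter's default install scheme places console
# scripts such as `acc`. This is the directory that should be on PATH.
scripts_dir = sysconfig.get_path("scripts")
print(scripts_dir)
```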

Usage

Start the scaffolder (default command)

acc

Explicit init subcommand

acc init

Help

acc --help
acc init --help
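
Running acc with no subcommand behaves the same as acc init. The sketch below illustrates that fallback using argparse from the standard library as a stand-in, so it runs without extra dependencies. It is not the tool's actual Typer code, and build_parser and resolve_command are hypothetical names used only for illustration:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single optional `init` subcommand."""
    parser = argparse.ArgumentParser(prog="acc")
    subcommands = parser.add_subparsers(dest="command")
    subcommands.add_parser("init", help="Scaffold a new project")
    return parser


def resolve_command(argv: list[str]) -> str:
    """Return the command to run, falling back to `init` when none is given."""
    args = build_parser().parse_args(argv)
    # `args.command` is None when no subcommand was passed on the command line.
    return args.command or "init"


print(resolve_command([]))        # bare `acc`
print(resolve_command(["init"]))  # explicit `acc init`
```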

Interactive prompts

When you run acc init, you will be prompted for:

Prompt                     Description                   Default
Project name               Name of your new project      (none)
Version                    Initial version               0.1.0
Author                     Your name                     (none)
Template                   Project template to use       etl
Storage                    Cloud storage backend         uc
Databricks workspace URL   Your workspace URL            (none)
Output directory           Where to create the project   current directory
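
The defaults in the table can be read as a simple rule: an empty answer falls back to the default, and prompts without a default must be answered. The following is a minimal sketch of that rule, not the tool's implementation; the key names (project_name, workspace_url, and so on) and the resolve_answers helper are assumptions made for illustration:

```python
import os

# Defaults taken from the prompt table above.
DEFAULTS = {
    "version": "0.1.0",
    "template": "etl",
    "storage": "uc",
}

# Prompts that have no default and therefore need an answer (hypothetical keys).
REQUIRED = ("project_name", "author", "workspace_url")


def resolve_answers(raw: dict[str, str]) -> dict[str, str]:
    """Fill empty answers with defaults and reject missing required values."""
    # Drop blank answers so that they fall through to the defaults.
    answers = {key: value.strip() for key, value in raw.items() if value.strip()}
    missing = [key for key in REQUIRED if key not in answers]
    if missing:
        raise ValueError(f"missing required answers: {', '.join(missing)}")
    # The output directory defaults to the current working directory.
    return {**DEFAULTS, "output_dir": os.getcwd(), **answers}
```

For example, passing an empty string for version and "s3" for storage yields version 0.1.0 and storage s3, while template stays at etl.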

Example session

ACC Project Scaffolder

Available templates:
  etl        - Extract-Transform-Load pipeline
  ml         - Machine learning pipeline
  utility    - Utility / helper library

Project name: my-sales-pipeline
Version [0.1.0]:
Author: Jane Doe
Template [etl / ml / utility] [etl]:
Storage [uc / dbfs / s3 / adls] [uc]: s3
Databricks workspace URL: https://adb-xxxx.azuredatabricks.net
Output directory [...]: /Users/jane/projects

Project created successfully!

Location:   /Users/jane/projects/my_sales_pipeline
Storage:    s3://bucket/wheels
Wheel path: s3://bucket/wheels/my-sales-pipeline-0.1.0-py3-none-any.whl

Next steps:
  cd /Users/jane/projects/my_sales_pipeline
  make install && make test
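
In the session above, the project name my-sales-pipeline becomes the directory my_sales_pipeline, which matches the {{cookiecutter.project_slug}} placeholder in the template. The exact rule the tool applies is not documented here; the sketch below shows one plausible normalization under that assumption, with project_slug as a hypothetical helper name:

```python
import re


def project_slug(name: str) -> str:
    """Turn a project name into a lowercase, underscore-separated identifier."""
    # Replace every run of characters that is not a letter or digit with "_".
    slug = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip().lower())
    # Trim underscores left over from leading or trailing punctuation.
    return slug.strip("_")


print(project_slug("my-sales-pipeline"))
```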

Available Templates

Key       Description                       Status
etl       Extract-Transform-Load pipeline   ✅ Available
ml        Machine learning pipeline         🚧 Coming soon
utility   Utility / helper library          🚧 Coming soon

Storage Options

Key    Resolved path
uc     dbfs:/Volumes/catalog/schema/volume
dbfs   dbfs:/FileStore/wheels
s3     s3://bucket/wheels
adls   abfss://container@storageaccount.dfs.core.windows.net/wheels
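
The table maps each storage key to a base path, and the example session shows the wheel path as that base path plus a wheel file name. A minimal sketch of that lookup follows. The dictionary values are copied from the table; the STORAGE_PATHS name, the wheel_path helper, and the file name pattern (inferred from the example session output) are assumptions:

```python
# Base paths copied from the Storage Options table.
STORAGE_PATHS = {
    "uc": "dbfs:/Volumes/catalog/schema/volume",
    "dbfs": "dbfs:/FileStore/wheels",
    "s3": "s3://bucket/wheels",
    "adls": "abfss://container@storageaccount.dfs.core.windows.net/wheels",
}


def wheel_path(storage: str, project_name: str, version: str) -> str:
    """Join the resolved storage path with the wheel file name."""
    try:
        base = STORAGE_PATHS[storage]
    except KeyError:
        raise ValueError(f"unknown storage key: {storage!r}") from None
    # File name pattern inferred from the example session output.
    return f"{base}/{project_name}-{version}-py3-none-any.whl"


print(wheel_path("s3", "my-sales-pipeline", "0.1.0"))
```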

Development Setup

1. Clone the repository

git clone https://github.com/your-org/python-deployment-package.git
cd python-deployment-package

2. Create a virtual environment

python3 -m venv .venv
source .venv/bin/activate

3. Install in editable mode

pip install -e .

4. Verify the CLI works

acc --help

5. Run tests

pytest tests/ -v

Publishing a New Version

⚠️ PyPI does not allow re-uploading the same version. Always bump the version before building.

Step 1: Bump the version in pyproject.toml

[project]
version = "0.1.2"   # ← change this

Step 2: Clean previous builds

rm -rf dist/ build/ src/acc_cli.egg-info

Step 3: Build the distribution

python3 -m build

This creates two files inside dist/:

  • acc_cli-<version>-py3-none-any.whl: binary wheel (fast install)
  • acc_cli-<version>.tar.gz: source distribution
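
The file names use acc_cli rather than acc-cli because distribution file names replace runs of hyphens, underscores, and dots in the project name with a single underscore. The sketch below computes the two expected names so a release script could check that dist/ contains them. The expected_artifacts helper is hypothetical, and the py3-none-any tag is assumed to apply because this is a pure-Python package:

```python
import re


def expected_artifacts(project_name: str, version: str) -> list[str]:
    """Return the wheel and sdist file names expected in dist/."""
    # Normalize the project name the way distribution file names do.
    normalized = re.sub(r"[-_.]+", "_", project_name).lower()
    return [
        f"{normalized}-{version}-py3-none-any.whl",
        f"{normalized}-{version}.tar.gz",
    ]


print(expected_artifacts("acc-cli", "0.1.3"))
```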

Step 4: Upload to PyPI

python3 -m twine upload dist/*

Twine will use credentials from ~/.pypirc. If that file doesn't exist, it will prompt for your API token.

Setting up ~/.pypirc (one-time)

[distutils]
index-servers = pypi

[pypi]
username = __token__
password = pypi-YOUR_API_TOKEN_HERE

Get your API token at https://pypi.org/manage/account/token/. The username must always be the literal string __token__.
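
Because a wrong username is an easy mistake to make, a quick sanity check of the file contents can help before uploading. The following is a minimal sketch using configparser from the standard library. The check_pypirc helper is hypothetical, and it only checks the two fields shown above; it does not contact PyPI or validate the token itself:

```python
import configparser


def check_pypirc(text: str) -> list[str]:
    """Return a list of problems found in the given .pypirc contents."""
    # Disable interpolation so a "%" in a token is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("pypi"):
        return ["missing [pypi] section"]
    problems = []
    if parser.get("pypi", "username", fallback="") != "__token__":
        problems.append("username must be __token__")
    # PyPI API tokens start with the prefix "pypi-".
    if not parser.get("pypi", "password", fallback="").startswith("pypi-"):
        problems.append("password should be an API token starting with pypi-")
    return problems
```

An empty list means both fields look right. To check the real file, pass in the text of ~/.pypirc, for example via pathlib.Path.home().joinpath(".pypirc").read_text().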

Step 5: Verify on PyPI

https://pypi.org/project/acc-cli/

Versioning Convention

This project follows Semantic Versioning (MAJOR.MINOR.PATCH):

Change type                   Example
Bug fix / small restructure   0.1.1 → 0.1.2
New feature / new template    0.1.2 → 0.2.0
Breaking change               0.2.0 → 1.0.0
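
The three rows correspond to incrementing the PATCH, MINOR, and MAJOR components, with lower components reset to zero. A minimal sketch of that rule, assuming plain MAJOR.MINOR.PATCH strings without pre-release or build suffixes (bump is a hypothetical helper, not part of the CLI):

```python
def bump(version: str, part: str) -> str:
    """Increment one component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("0.1.1", "patch"))
print(bump("0.1.2", "minor"))
print(bump("0.2.0", "major"))
```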

Dependencies

Package             Purpose
typer>=0.12         CLI framework
rich>=13            Terminal formatting
cookiecutter>=2.6   Project scaffolding from templates

License

Proprietary. All rights reserved.

Download files

Source Distribution

acc_cli-0.1.3.tar.gz (30.6 kB)

Uploaded Source

Built Distribution

acc_cli-0.1.3-py3-none-any.whl (43.9 kB)

Uploaded Python 3

File details

Details for the file acc_cli-0.1.3.tar.gz.

File metadata

  • Download URL: acc_cli-0.1.3.tar.gz
  • Size: 30.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for acc_cli-0.1.3.tar.gz
Algorithm Hash digest
SHA256 003984bdbb464a148e6eb64050465a8a4e0cd125c8ccbce86a98cccdc9d08311
MD5 83b972687e7168d88678d0fe64206a6a
BLAKE2b-256 bf8531b87091414cdd87d35653090c6c2bd1a2bdda2e28a1b45e9650d0227cd2
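
A downloaded file can be checked against the SHA256 digest listed above. The sketch below computes the digest with hashlib from the standard library; sha256_of is a hypothetical helper, and the local file path is whatever location you downloaded the archive to:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the published SHA256 value; any difference means the file is not the one that was uploaded.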

File details

Details for the file acc_cli-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: acc_cli-0.1.3-py3-none-any.whl
  • Size: 43.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for acc_cli-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 2b82ad5212646e16d27f6163f28b71c56857a4151fc03cec2374273a33af4de8
MD5 d5130f834bf35c08c66d1e7db285d585
BLAKE2b-256 da7ed62822e84e6e431e656ae97f558119d13420255530f6ba6a9848da37e09c
