ACC-02: Databricks pipeline project scaffolding and build CLI

ACC CLI: Databricks Project Generator

A command-line tool to scaffold production-ready Databricks pipeline projects from templates. Built with Typer and Cookiecutter.

Features

  • 🚀 Scaffold a Databricks ETL pipeline project in seconds
  • 📁 Generates a fully structured Python package with src/ layout
  • ☁️ Supports multiple storage backends: UC, DBFS, S3, ADLS
  • 🔧 Includes Makefile, pyproject.toml, configs, Databricks job JSON, and tests out of the box
  • 📦 Installable via pip; works as a global CLI tool

Project Structure

python-deployment-package/
├── .gitignore
├── README.md
├── pyproject.toml           # Package metadata & entry point
├── requirements.txt         # Runtime dependencies
├── dist/                    # Built distributions (wheel + sdist)
├── docs/
│   ├── cli-spec.md
│   ├── project-structure.md
│   └── tech-decisions.md
├── src/
│   └── acc_cli/             # Main CLI package
│       ├── __init__.py
│       ├── cli.py           # Typer app & entry point
│       ├── config.py        # Templates & storage config
│       ├── commands/
│       │   ├── __init__.py
│       │   └── init.py      # `acc init` command
│       ├── utils/
│       │   ├── __init__.py
│       │   └── utils.py     # Input collection, cookiecutter wrapper
│       └── templates/
│           └── etl/         # ETL pipeline cookiecutter template
│               ├── cookiecutter.json
│               └── {{cookiecutter.project_slug}}/
│                   ├── Makefile
│                   ├── pyproject.toml
│                   ├── README.md
│                   ├── requirements.txt
│                   ├── requirements-dev.txt
│                   ├── .gitignore
│                   ├── configs/
│                   │   ├── dev.yaml
│                   │   └── prod.yaml
│                   ├── jobs/
│                   │   └── databricks_job.json
│                   ├── src/{{cookiecutter.project_slug}}/
│                   │   ├── main.py
│                   │   ├── pipelines/
│                   │   ├── tasks/
│                   │   └── utils/
│                   └── tests/
└── tests/
    ├── __init__.py
    └── test_cli.py

Installation

pip install acc-cli

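Note (macOS): If acc is not found after install, add Python's bin directory to your PATH. The path below matches a python.org installation of Python 3.12; adjust the version number to match your interpreter: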

echo 'export PATH="/Library/Frameworks/Python.framework/Versions/3.12/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
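
If you are unsure which directory to add, the interpreter can report it. The following is a minimal sketch using only the standard library; the helper name describe_cli_location is hypothetical, and for a user-level install (pip install --user) the scripts directory may differ from the one reported here.

```python
import shutil
import sysconfig


def describe_cli_location(command: str = "acc") -> dict:
    """Report where this interpreter installs console scripts and
    whether `command` is already resolvable on PATH."""
    return {
        # Directory for console scripts under the interpreter's default scheme.
        "scripts_dir": sysconfig.get_path("scripts"),
        # Full path to the executable if PATH resolves it, otherwise None.
        "resolved": shutil.which(command),
    }


info = describe_cli_location()
print(f"Scripts directory: {info['scripts_dir']}")
print(f"acc resolves to:   {info['resolved'] or 'not found on PATH'}")
```

If the second line reports "not found on PATH", the first line is the directory to export in your shell profile.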

Usage

Start the scaffolder (default command)

acc

Explicit init subcommand

acc init
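
The two invocations above behave the same: a bare acc falls back to init. The real CLI is built with Typer, whose source is not shown here, so the sketch below illustrates the same "default subcommand" pattern with the standard library's argparse instead. The function names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single optional `init` subcommand."""
    parser = argparse.ArgumentParser(prog="acc", description="ACC Project Scaffolder")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("init", help="Scaffold a new project")
    return parser


def resolve_command(argv: list[str]) -> str:
    """Return the subcommand to run, falling back to `init` when none is given."""
    args = build_parser().parse_args(argv)
    # `command` is None when the user typed just `acc`.
    return args.command or "init"


print(resolve_command([]))        # bare `acc`
print(resolve_command(["init"]))  # explicit `acc init`
```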

Help

acc --help
acc init --help

Interactive prompts

When you run acc init, you will be prompted for:

| Prompt | Description | Default |
|---|---|---|
| Project name | Name of your new project | none |
| Version | Initial version | 0.1.0 |
| Author | Your name | none |
| Template | Project template to use | etl |
| Storage | Cloud storage backend | uc |
| Databricks workspace URL | Your workspace URL | none |
| Output directory | Where to create the project | current directory |
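
A prompt with an optional default, as listed in the table above, can be sketched as follows. This is an illustration, not the tool's actual input-collection code: the ask helper is hypothetical, and re-asking when a prompt without a default is left empty is an assumption. The reader parameter exists only so the function can be exercised without a terminal.

```python
from typing import Callable, Optional


def ask(label: str, default: Optional[str] = None,
        reader: Callable[[str], str] = input) -> str:
    """Prompt until an answer is available; empty input accepts the default."""
    suffix = f" [{default}]" if default is not None else ""
    while True:
        answer = reader(f"{label}{suffix}: ").strip()
        if answer:
            return answer
        if default is not None:
            return default
        # No default and empty input: loop and ask again (assumed behaviour).


# Scripted answers stand in for keyboard input.
scripted = iter(["", "my-sales-pipeline"])
name = ask("Project name", reader=lambda _prompt: next(scripted))
version = ask("Version", default="0.1.0", reader=lambda _prompt: "")
print(name, version)
```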

Example session

ACC Project Scaffolder

Available templates:
  etl        - Extract-Transform-Load pipeline
  ml         - Machine learning pipeline
  utility    - Utility / helper library

Project name: my-sales-pipeline
Version [0.1.0]:
Author: Jane Doe
Template [etl / ml / utility] [etl]:
Storage [uc / dbfs / s3 / adls] [uc]: s3
Databricks workspace URL: https://adb-xxxx.azuredatabricks.net
Output directory [...]: /Users/jane/projects

Project created successfully!

Location:   /Users/jane/projects/my_sales_pipeline
Storage:    s3://bucket/wheels
Wheel path: s3://bucket/wheels/my-sales-pipeline-0.1.0-py3-none-any.whl

Next steps:
  cd /Users/jane/projects/my_sales_pipeline
  make install && make test
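
The session above shows two derived values: the project directory uses an underscore slug (my_sales_pipeline), while the reported wheel path keeps the project name as typed. The sketch below reproduces those strings under the assumption that the slug is formed by replacing non-alphanumeric runs with underscores; the function names are hypothetical and the tool's real rules may differ. Build tools usually normalize wheel file names to underscores, so the hyphenated name here only mirrors the example output.

```python
import re
from pathlib import PurePosixPath


def project_slug(name: str) -> str:
    """Turn a project name into an importable, underscore-separated slug."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()


def wheel_path(storage_root: str, name: str, version: str) -> str:
    """Join a storage root and a pure-Python wheel file name."""
    return f"{storage_root.rstrip('/')}/{name}-{version}-py3-none-any.whl"


slug = project_slug("my-sales-pipeline")
location = PurePosixPath("/Users/jane/projects") / slug
print(location)
print(wheel_path("s3://bucket/wheels", "my-sales-pipeline", "0.1.0"))
```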

Available Templates

| Key | Description | Status |
|---|---|---|
| etl | Extract-Transform-Load pipeline | ✅ Available |
| ml | Machine learning pipeline | 🚧 Coming soon |
| utility | Utility / helper library | 🚧 Coming soon |
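
Because only etl is available today, a selection step has to reject both unknown keys and templates that are listed but not yet shipped. One possible shape for that check is sketched below; the Template dataclass, the TEMPLATES registry, and select_template are hypothetical names that mirror the table, not the contents of config.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    key: str
    description: str
    available: bool


# Mirrors the table above; the registry structure itself is an assumption.
TEMPLATES = {
    "etl": Template("etl", "Extract-Transform-Load pipeline", True),
    "ml": Template("ml", "Machine learning pipeline", False),
    "utility": Template("utility", "Utility / helper library", False),
}


def select_template(key: str) -> Template:
    """Return the template for `key`, or raise ValueError if it cannot be used."""
    template = TEMPLATES.get(key)
    if template is None:
        raise ValueError(f"Unknown template {key!r}; choose from {sorted(TEMPLATES)}")
    if not template.available:
        raise ValueError(f"Template {key!r} is not available yet")
    return template


print(select_template("etl").description)
```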

Storage Options

| Key | Resolved path |
|---|---|
| uc | dbfs:/Volumes/catalog/schema/volume |
| dbfs | dbfs:/FileStore/wheels |
| s3 | s3://bucket/wheels |
| adls | abfss://container@storageaccount.dfs.core.windows.net/wheels |
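
The mapping above is a plain key-to-path lookup with uc as the default. A minimal sketch follows; STORAGE_ROOTS and resolve_storage are hypothetical names, and segments such as bucket or catalog/schema/volume are copied verbatim from the table because how the tool substitutes real values is not documented here.

```python
# Values copied from the table above; treat the path segments as placeholders.
STORAGE_ROOTS = {
    "uc": "dbfs:/Volumes/catalog/schema/volume",
    "dbfs": "dbfs:/FileStore/wheels",
    "s3": "s3://bucket/wheels",
    "adls": "abfss://container@storageaccount.dfs.core.windows.net/wheels",
}


def resolve_storage(key: str = "uc") -> str:
    """Map a storage key to its root path, tolerating case and whitespace."""
    normalized = key.strip().lower()
    if normalized not in STORAGE_ROOTS:
        raise ValueError(f"Unknown storage {key!r}; choose from {sorted(STORAGE_ROOTS)}")
    return STORAGE_ROOTS[normalized]


print(resolve_storage())       # default backend
print(resolve_storage("S3"))   # case-insensitive lookup
```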

Development Setup

1. Clone the repository

git clone https://github.com/your-org/python-deployment-package.git
cd python-deployment-package

2. Create a virtual environment

python3 -m venv .venv
source .venv/bin/activate

3. Install in editable mode

pip install -e .

4. Verify the CLI works

acc --help

5. Run tests

pytest tests/ -v

Publishing a New Version

⚠️ PyPI does not allow re-uploading the same version. Always bump the version before building.

Step 1: Bump the version in pyproject.toml

[project]
version = "0.1.2"   # ← change this
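
If you prefer to script the version bump rather than edit the file by hand, a text substitution is enough for a simple pyproject.toml. The sketch below is one way to do it, with a hypothetical set_version helper; it rewrites the first line that starts with version = and does not check which TOML table that line belongs to, so review the result before committing.

```python
import re

# Matches a line of the form: version = "X.Y.Z"
VERSION_LINE = re.compile(r'^(version\s*=\s*")([^"]+)(")', re.MULTILINE)


def set_version(pyproject_text: str, new_version: str) -> str:
    """Return `pyproject_text` with its first version line set to `new_version`."""
    updated, count = VERSION_LINE.subn(
        lambda m: f"{m.group(1)}{new_version}{m.group(3)}",
        pyproject_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no version line found")
    return updated


sample = '[project]\nname = "acc-cli"\nversion = "0.1.1"\n'
print(set_version(sample, "0.1.2"))
```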

Step 2: Clean previous builds

rm -rf dist/ build/ src/acc_cli.egg-info

Step 3: Build the distribution

python3 -m build

This creates two files inside dist/:

  • acc_cli-<version>-py3-none-any.whl: binary wheel (fast install)
  • acc_cli-<version>.tar.gz: source distribution
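
Before uploading, it can help to confirm that both files exist under the names you expect. The sketch below derives the two file names from the package name and version, assuming a pure-Python wheel (the py3-none-any tag) and the underscore normalization visible in the names above. The helper names are hypothetical.

```python
import re
from pathlib import Path


def normalize_dist_name(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into a single underscore."""
    return re.sub(r"[-_.]+", "_", name).lower()


def expected_artifacts(name: str, version: str) -> list[str]:
    """File names a pure-Python build is expected to place in dist/."""
    base = f"{normalize_dist_name(name)}-{version}"
    return [f"{base}-py3-none-any.whl", f"{base}.tar.gz"]


def missing_artifacts(dist_dir: str, name: str, version: str) -> list[str]:
    """Return the expected file names that are not present in `dist_dir`."""
    root = Path(dist_dir)
    return [f for f in expected_artifacts(name, version) if not (root / f).is_file()]


print(expected_artifacts("acc-cli", "0.1.2"))
```

An empty list from missing_artifacts("dist", "acc-cli", "0.1.2") indicates that both files were built.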

Step 4: Upload to PyPI

python3 -m twine upload dist/*

Twine will use credentials from ~/.pypirc. If that file doesn't exist, it will prompt for your API token.

Setting up ~/.pypirc (one-time)

[distutils]
index-servers = pypi

[pypi]
username = __token__
password = pypi-YOUR_API_TOKEN_HERE

Get your API token at https://pypi.org/manage/account/token/. The username must always be the literal string __token__.
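
A quick sanity check of the file's contents can catch the most common mistakes (a real username instead of __token__, or a password that is not an API token) before twine does. The following is a sketch using the standard library's configparser; check_pypirc is a hypothetical helper, and it takes the file's text as input so it never prints or stores the token itself.

```python
import configparser


def check_pypirc(text: str) -> list[str]:
    """Return a list of problems found in .pypirc content (empty if none)."""
    # RawConfigParser avoids '%' interpolation, which could mangle tokens.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    if not parser.has_section("pypi"):
        return ["missing [pypi] section"]
    problems = []
    if parser.get("pypi", "username", fallback="") != "__token__":
        problems.append("username must be the literal string __token__")
    if not parser.get("pypi", "password", fallback="").startswith("pypi-"):
        problems.append("password should be an API token starting with 'pypi-'")
    return problems


sample = """\
[distutils]
index-servers = pypi

[pypi]
username = __token__
password = pypi-YOUR_API_TOKEN_HERE
"""
print(check_pypirc(sample) or "looks fine")
```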

Step 5: Verify on PyPI

https://pypi.org/project/acc-cli/

Versioning Convention

This project follows Semantic Versioning (MAJOR.MINOR.PATCH):

| Change type | Example |
|---|---|
| Bug fix / small restructure | 0.1.1 → 0.1.2 |
| New feature / new template | 0.1.2 → 0.2.0 |
| Breaking change | 0.2.0 → 1.0.0 |
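
The three rows above correspond to the patch, minor, and major components of the version string. As a worked illustration, the sketch below reproduces each example with a hypothetical bump function; it handles plain MAJOR.MINOR.PATCH strings only, not pre-release or build suffixes.

```python
def bump(version: str, change: str) -> str:
    """Increment one component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "patch":      # bug fix / small restructure
        patch += 1
    elif change == "minor":    # new feature / new template
        minor, patch = minor + 1, 0
    elif change == "major":    # breaking change
        major, minor, patch = major + 1, 0, 0
    else:
        raise ValueError(f"unknown change type: {change!r}")
    return f"{major}.{minor}.{patch}"


for current, change in [("0.1.1", "patch"), ("0.1.2", "minor"), ("0.2.0", "major")]:
    print(f"{current} -> {bump(current, change)}")
```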

Dependencies

| Package | Purpose |
|---|---|
| typer>=0.12 | CLI framework |
| rich>=13 | Terminal formatting |
| cookiecutter>=2.6 | Project scaffolding from templates |

License

Proprietary. All rights reserved.
