CLI for TurboVault4dbt

TurboVault4dbt_cli


What is TurboVault4dbt_cli?

TurboVault4dbt_cli is an open-source CLI tool that automatically generates dbt models according to datavault4dbt templates. It reads metadata describing your Data Vault 2.0 model from one of the supported metadata formats and creates ready-to-process dbt models.


Prerequisites


Supported Metadata Formats

  • Excel files: .xls, .xlsx
  • CSV files: folder of CSVs (one per sheet/table)

What does my metadata need to look like?

Your metadata needs to be stored in the following tables/worksheets/files:


Installation

You can install TurboVault4dbt_cli directly from PyPI:

pip install turbovault4dbt

Or install from source for development:

git clone https://github.com/ScalefreeCOM/turbovault4dbt.git
cd turbovault4dbt
pip install -e .

Publishing to PyPI

  1. Build the package (inside your project directory):

    python -m build
    
  2. Upload to PyPI using Twine:

    pip install twine  # if not already installed
    twine upload dist/*
    
  3. (Optional) Test upload to TestPyPI first:

    twine upload --repository testpypi dist/*
    
  4. After upload, install your package from PyPI:

    pip install turbovault4dbt
    

For more details, see the official Python packaging docs.


Quickstart: Using the CLI

1. Prepare your metadata

  • Prepare your metadata as Excel (.xls or .xlsx) or as a folder of CSV files (one CSV per sheet/table, filenames matching required sheets).
  • (See metadata_ddl/ for example templates.)
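
Before running the CLI on a CSV folder, it can help to confirm that every expected sheet has a matching file. The helper below is a minimal sketch of such a check, not part of TurboVault4dbt_cli. The function name `missing_csv_sheets` and the sheet names in the usage example are hypothetical, and case-insensitive filename matching is an assumption.

```python
from pathlib import Path


def missing_csv_sheets(folder, required_sheets):
    """Return the required sheet names that have no matching CSV file in folder."""
    # Collect file stems ("sheet_a.csv" -> "sheet_a").
    # Assumption: names are compared case-insensitively.
    present = {p.stem.lower() for p in Path(folder).glob("*.csv")}
    return [name for name in required_sheets if name.lower() not in present]


if __name__ == "__main__":
    # Placeholder sheet names; use the names your metadata templates require.
    print(missing_csv_sheets("path/to/your_csv_folder", ["sheet_a", "sheet_b"]))
```

An empty list means every required sheet has a CSV file in the folder.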

2. Run TurboVault4dbt_cli

Basic usage:

# List all nodes in your metadata (Excel or CSV)
turbovault list -f xlsx path/to/your.xlsx
turbovault list -f csv path/to/your_csv_folder

# Generate dbt models for all nodes
turbovault run -f xlsx path/to/your.xlsx
turbovault run -f csv path/to/your_csv_folder

# Generate dbt models for selected nodes
turbovault run -f xlsx path/to/your.xlsx -s hub1 link1 sat1
turbovault run -f csv path/to/your_csv_folder -s '+hub1' '@sat1'

# Specify output directory (all generated files will go here unless overridden by metadata)
turbovault run -f xlsx path/to/your.xlsx --output-dir my_output_dir

Command reference:

  • turbovault run -f {xls|xlsx|csv} <input> [-s <selectors>] [--output-dir <dir>]
    Generate dbt models for all or selected nodes.
  • turbovault list -f {xls|xlsx|csv} <input> [-s <selectors>]
    List resolved nodes for a selector (dry run).
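
The two command shapes above can be mirrored with Python's standard `argparse` module. This is only an illustrative sketch of the documented interface, useful for seeing how the arguments fit together. It is not the tool's actual parser, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented command line (illustrative only)."""
    parser = argparse.ArgumentParser(prog="turbovault")
    subcommands = parser.add_subparsers(dest="command", required=True)
    for name in ("run", "list"):
        sub = subcommands.add_parser(name)
        sub.add_argument("-f", "--format", required=True, choices=["xls", "xlsx", "csv"])
        sub.add_argument("input", help="Excel file or folder of CSV files")
        # One or more space-separated selectors, e.g. hub1 +sat1 @masat3
        sub.add_argument("-s", "--select", nargs="+", default=[], help="node selectors")
        if name == "run":
            # Only `run` writes files, so only `run` takes an output directory.
            sub.add_argument("--output-dir", help="directory for generated files")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["list", "-f", "csv", "path/to/your_csv_folder", "-s", "+hub1"])
    print(args.command, args.format, args.input, args.select)
```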

Arguments:

  • -f, --format: Input format. Must be one of: xls, xlsx, csv
  • input: Path to Excel file (.xls/.xlsx) or folder containing CSV files (for csv)
  • -s, --select: (Optional) Node selectors (space-separated).
    Examples: hub1, +sat1, hub2+, @masat3
  • --output-dir: (Optional) Output directory for all generated files. By default, or when output_dir is not present in your metadata, every model is written directly into this directory. If your metadata (Excel/CSV) includes an output_dir column for an asset, that asset's file is placed in the matching subdirectory under --output-dir. For example, --output-dir models with an output_dir value of 01_RawVault/Sales produces models/01_RawVault/Sales/Hub1.sql. Both / and \ are accepted as path separators, and absolute paths are sanitized so that nothing is written outside the output directory.
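
The interaction between --output-dir and a per-asset output_dir value can be sketched as follows. This shows one plausible way to combine the two; the tool's real sanitizing rules are not documented here. The function name `resolve_output_path` is hypothetical, and dropping `..`, leading separators, and drive letters is an assumption about what "sanitized" means.

```python
from pathlib import PurePosixPath


def resolve_output_path(base_dir, asset_dir, filename):
    """Join base_dir, an optional per-asset subdirectory, and a file name.

    The subdirectory is cleaned so the result cannot leave base_dir.
    """
    base = PurePosixPath(base_dir)
    if not asset_dir:
        # No output_dir in the metadata: the file goes directly into base_dir.
        return base / filename
    # Treat "\" like "/" and split into components.
    parts = asset_dir.replace("\\", "/").split("/")
    # Assumption: drop empty parts (leading "/"), ".", "..", and drive letters such as "C:".
    safe = [p for p in parts if p not in ("", ".", "..") and not p.endswith(":")]
    return base.joinpath(*safe, filename)


if __name__ == "__main__":
    print(resolve_output_path("models", "01_RawVault/Sales", "Hub1.sql"))
```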

Selector syntax:

  • A+ — node A and all descendants
  • +A — node A and all ancestors
  • @A — node A, all ancestors, and all descendants
  • Multiple selectors can be space-separated
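
The selector rules above amount to a graph traversal. The sketch below resolves selectors against a small dependency map, following the semantics exactly as listed. It is not the tool's implementation. The function name `resolve_selectors`, the `parents` mapping, and the example dependencies (satellites and links depending on a hub) are assumptions for illustration.

```python
def resolve_selectors(selectors, parents):
    """Resolve selectors such as "A", "+A", "A+" and "@A" to a set of node names.

    parents maps each node to the list of nodes it depends on.
    """
    # Invert the parent map so descendants can be walked as well.
    children = {}
    for node, deps in parents.items():
        for dep in deps:
            children.setdefault(dep, []).append(node)

    def reachable(start, edges):
        """Return every node reachable from start by following edges."""
        seen, stack = set(), [start]
        while stack:
            for nxt in edges.get(stack.pop(), []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    selected = set()
    for sel in selectors:
        name = sel.strip("+@")
        selected.add(name)
        if sel.startswith(("+", "@")):
            selected |= reachable(name, parents)   # ancestors
        if sel.endswith("+") or sel.startswith("@"):
            selected |= reachable(name, children)  # descendants
    return selected


if __name__ == "__main__":
    # Hypothetical graph: link1 and sat1 both depend on hub1.
    graph = {"hub1": [], "link1": ["hub1"], "sat1": ["hub1"], "hub2": []}
    print(sorted(resolve_selectors(["hub1+"], graph)))
```

Because the results are combined into one set, passing several selectors returns the union of the nodes each one matches.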

Regression Testing

To run the regression test suite:

pip install -r requirements-test.txt
pytest tests/

  • Add new test cases by creating folders in tests/ with input.xlsx and expected_output/.
  • Negative test cases (expected failures) are also supported.
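
A test case of this shape boils down to comparing a generated directory against expected_output/. The helper below is a sketch of that comparison using only the standard library. How the project's own test suite performs the check is not shown in this document, and `diff_output_dirs` is a hypothetical name.

```python
import filecmp
from pathlib import Path


def diff_output_dirs(actual_dir, expected_dir):
    """Return relative paths of files that are missing, unexpected, or different."""
    actual = {p.relative_to(actual_dir) for p in Path(actual_dir).rglob("*") if p.is_file()}
    expected = {p.relative_to(expected_dir) for p in Path(expected_dir).rglob("*") if p.is_file()}
    # Files present on only one side.
    problems = sorted(str(p) for p in actual ^ expected)
    for rel in sorted(actual & expected):
        # shallow=False compares file contents, not just size and modification time.
        if not filecmp.cmp(Path(actual_dir) / rel, Path(expected_dir) / rel, shallow=False):
            problems.append(str(rel))
    return problems
```

In a pytest test, one could generate models into a temporary directory and assert that `diff_output_dirs(tmp_path, case_dir / "expected_output")` returns an empty list, where `case_dir` is the folder of the test case.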

Project Structure

  • src/turbovault4dbt/ — all source code
  • tests/ — regression test cases
  • pyproject.toml, requirements.txt, etc. — project config in root

Releases

See PyPI Releases


License

See LICENSE


Need Help?


