CLI for Turbovault 4 DBT
TurboVault4dbt_cli
What is TurboVault4dbt_cli?
TurboVault4dbt_cli is an open-source CLI tool that automatically generates dbt models according to datavault4dbt templates. It reads metadata describing your Data Vault 2.0 model from one of the supported formats and creates ready-to-process dbt models.
Prerequisites
- Python 3.8+
- Metadata prepared in one of the supported formats (see below)
- dbt project
- datavault4dbt dbt package
Supported Metadata Formats
- Excel files: .xls, .xlsx
- CSV files: folder of CSVs (one per sheet/table)
What does my metadata need to look like?
Your metadata needs to be stored in the following tables/worksheets/files:
- Source Data
- Standard Hubs
- Standard Links
- Non-Historized Links
- Standard Satellites
- Non-Historized Satellites
- Multi-Active Satellites
- Point-In-Time Tables
- Reference Tables
Installation
You can install TurboVault4dbt_cli directly from PyPI:
pip install turbovault4dbt
Or install from source for development:
git clone https://github.com/ScalefreeCOM/turbovault4dbt.git
cd turbovault4dbt
pip install -e .
Publishing to PyPI
- Build the package (inside your project directory):
python -m build
- Upload to PyPI using Twine:
pip install twine # if not already installed
twine upload dist/*
- (Optional) Test upload to TestPyPI first:
twine upload --repository testpypi dist/*
- After upload, install your package from PyPI:
pip install turbovault4dbt
For more details, see the official Python packaging docs.
Quickstart: Using the CLI
1. Prepare your metadata
- Prepare your metadata as Excel (.xls or .xlsx) or as a folder of CSV files (one CSV per sheet/table, filenames matching required sheets).
- (See metadata_ddl/ for example templates.)
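Before running the tool on a CSV folder, it can help to check that no expected file is missing. The following is a minimal sketch, not part of TurboVault4dbt_cli: the helper `missing_csv_files` is hypothetical, the case-insensitive match is an assumption, and the sheet names in the usage comment are placeholders (take the real names from the templates in metadata_ddl/).

```python
from pathlib import Path


def missing_csv_files(folder, required_names):
    """Return the required sheet names that have no matching <name>.csv in folder.

    Matching is case-insensitive on the file stem; this is an assumption of the
    sketch, not documented behaviour of the CLI.
    """
    present = {path.stem.lower() for path in Path(folder).glob("*.csv")}
    return [name for name in required_names if name.lower() not in present]


# Usage (placeholder names; replace with the names used by your templates):
# missing = missing_csv_files("path/to/your_csv_folder", ["source_data", "standard_hub"])
# if missing:
#     print("Missing CSV files:", ", ".join(missing))
```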
2. Run TurboVault4dbt_cli
Basic usage:
# List all nodes in your metadata (Excel or CSV)
turbovault list -f xlsx path/to/your.xlsx
turbovault list -f csv path/to/your_csv_folder
# Generate dbt models for all nodes
turbovault run -f xlsx path/to/your.xlsx
turbovault run -f csv path/to/your_csv_folder
# Generate dbt models for selected nodes
turbovault run -f xlsx path/to/your.xlsx -s hub1 link1 sat1
turbovault run -f csv path/to/your_csv_folder -s '+hub1' '@sat1'
# Specify output directory (all generated files will go here unless overridden by metadata)
turbovault run -f xlsx path/to/your.xlsx --output-dir my_output_dir
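If you want to drive these invocations from a Python script (for example in CI), the argument list can be assembled programmatically. This is a hedged sketch: `build_command` and `run_turbovault` are hypothetical helpers, and they use only the subcommands and flags shown in this README.

```python
import subprocess

VALID_ACTIONS = ("run", "list")
VALID_FORMATS = ("xls", "xlsx", "csv")


def build_command(action, fmt, input_path, selectors=None, output_dir=None):
    """Build the argv list for a turbovault invocation as documented in this README."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if fmt not in VALID_FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    cmd = ["turbovault", action, "-f", fmt, str(input_path)]
    if selectors:
        # Selectors are passed space-separated after -s.
        cmd += ["-s", *selectors]
    if output_dir is not None:
        # The README documents --output-dir only for `run`.
        if action != "run":
            raise ValueError("--output-dir is only documented for 'run'")
        cmd += ["--output-dir", str(output_dir)]
    return cmd


def run_turbovault(*args, **kwargs):
    """Execute the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(*args, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting issues with selectors such as '+hub1' or '@sat1'.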
Command reference:
turbovault run -f {xls|xlsx|csv} <input> [-s <selectors>] [--output-dir <dir>]
Generate dbt models for all or selected nodes.
turbovault list -f {xls|xlsx|csv} <input> [-s <selectors>]
List resolved nodes for a selector (dry run).
Arguments:
- -f, --format: Input format. Must be one of: xls, xlsx, csv
- input: Path to Excel file (.xls/.xlsx) or folder containing CSV files (for csv)
- -s, --select: (Optional) Node selectors (space-separated). Examples: hub1, +sat1, hub2+, @masat3
- --output-dir: (Optional) Output directory for all generated files. By default, all models will be placed in this directory. If your metadata (Excel/CSV) includes an output_dir column for an asset, that asset's file will be placed in a subdirectory under --output-dir (e.g., --output-dir models and output_dir column value 01_RawVault/Sales results in models/01_RawVault/Sales/Hub1.sql). If output_dir is not present in your metadata, all files go directly into the main output directory. Both / and \ are supported as path separators, and absolute paths are automatically sanitized to prevent writing outside the output directory.
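The --output-dir rules can be illustrated with a small sketch. It is not the tool's implementation: `resolve_model_path` is a hypothetical helper, and the sanitization shown here (dropping empty, ".", ".." and drive-letter segments) is only one assumed way to keep files inside the output directory.

```python
from pathlib import PurePosixPath


def resolve_model_path(output_dir, model_name, asset_dir=None):
    """Combine --output-dir, an optional per-asset output_dir value, and the model name."""
    base = PurePosixPath(output_dir.replace("\\", "/"))
    parts = []
    if asset_dir:
        # Accept both / and \ as separators.
        for part in asset_dir.replace("\\", "/").split("/"):
            # Assumed sanitization: skip empty segments (leading slash), ".",
            # "..", and drive letters such as "C:" so the result stays under base.
            if part in ("", ".", "..") or part.endswith(":"):
                continue
            parts.append(part)
    return str(base.joinpath(*parts, f"{model_name}.sql"))


print(resolve_model_path("models", "Hub1", "01_RawVault/Sales"))
print(resolve_model_path("models", "Hub1", "01_RawVault\\Sales"))
print(resolve_model_path("models", "Hub1"))
```

The first two calls give models/01_RawVault/Sales/Hub1.sql, matching the example in the argument description; the third gives models/Hub1.sql.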
Selector syntax:
- A+: node A and all descendants
- +A: node A and all ancestors
- @A: node A, all ancestors, and all descendants
- Multiple selectors can be space-separated
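The selector rules above can be sketched as a graph traversal. This is an illustration under stated assumptions, not the tool's resolver: the functions and the small `children` map (which node feeds which) are hypothetical, and multiple selectors are assumed to combine as a union.

```python
from collections import deque


def _reachable(start, edges):
    """Return all nodes reachable from start by following edges (start excluded)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def resolve_selector(selector, children):
    """Resolve one selector of the form A, A+, +A or @A to a set of node names."""
    # Invert the child map so ancestors can be walked the same way as descendants.
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, set()).add(parent)

    if selector.startswith("@"):
        name, want_up, want_down = selector[1:], True, True
    else:
        want_up = selector.startswith("+")
        want_down = selector.endswith("+")
        name = selector.strip("+")

    result = {name}
    if want_up:
        result |= _reachable(name, parents)
    if want_down:
        result |= _reachable(name, children)
    return result


def resolve_selectors(selectors, children):
    """Union of all individual selector results (assumed combination rule)."""
    resolved = set()
    for selector in selectors:
        resolved |= resolve_selector(selector, children)
    return resolved


# Hypothetical dependency map: hub1 feeds sat1 and link1, hub2 feeds link1.
children = {"hub1": ["sat1", "link1"], "hub2": ["link1"]}
print(sorted(resolve_selector("hub1+", children)))   # ['hub1', 'link1', 'sat1']
print(sorted(resolve_selector("+link1", children)))  # ['hub1', 'hub2', 'link1']
```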
Regression Testing
To run the regression test suite:
pip install -r requirements-test.txt
pytest tests/
- Add new test cases by creating folders in tests/ with input.xlsx and expected_output/.
- Negative test cases (expected failures) are also supported.
Project Structure
- src/turbovault4dbt/: all source code
- tests/: regression test cases
- pyproject.toml, requirements.txt, etc.: project config in root
Releases
See PyPI Releases
License
See LICENSE
Need Help?
File details
Details for the file turbovault4dbt-0.2.7.tar.gz.
File metadata
- Download URL: turbovault4dbt-0.2.7.tar.gz
- Upload date:
- Size: 45.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c22d52ee7ca5784efd0a409b639cd5b916f25a8578fa7b3e8f123a608d1f19b1 |
| MD5 | cef37615765c0636c16ba18f3b75c921 |
| BLAKE2b-256 | 6e05ca81f78cdc366d6a97a50fba386719cba8b4cb84026e2165d0daf81b0cc5 |
File details
Details for the file turbovault4dbt-0.2.7-py3-none-any.whl.
File metadata
- Download URL: turbovault4dbt-0.2.7-py3-none-any.whl
- Upload date:
- Size: 74.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e210fb4e71176674de91e03f4ac65c4358009d29a0828bf77cdf3b9b23951f8c |
| MD5 | 09431c2e6522ed65bf2b93445217562d |
| BLAKE2b-256 | f9c7d512a0c3f336ffca3a4019dea9bd6b21efec2811ce6f4a0a99b4705940a2 |
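A downloaded file can be checked against the SHA256 digest listed for it. The sketch below uses only Python's standard hashlib module; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the usage comment assumes the wheel was saved under its published file name in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage with the digest published for the 0.2.7 wheel:
# ok = matches_published_hash(
#     "turbovault4dbt-0.2.7-py3-none-any.whl",
#     "e210fb4e71176674de91e03f4ac65c4358009d29a0828bf77cdf3b9b23951f8c",
# )
```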