
Python CLI for automating the application of consistent field definitions to large multi-layered dbt projects.


We are the boutique analytics consultancy that turns disorganised data into real business value. Get in touch to learn more about how Tasman can help solve your organisation's data challenges.

dbt-datadict

dbt-datadict is a CLI tool that provides helpful functions to improve the speed and efficiency of managing column-level documentation across large dbt projects.

Key Features:

  1. Rapid creation of model yaml files, leveraging dbt-labs/codegen 💥 (no more copy/pasting from the terminal 🙌)
  2. In-place updates to model yaml on schema changes 🧙
  3. Consolidation of column descriptions into a data dictionary 📓
  4. Keeps column descriptions in sync with a single command 🔃

Installation ⏬

Install dbt-datadict using

```bash
$ python -m pip install dbt-datadict
```

Getting Started 🚀

Full user guide 🧑‍🏫

Command: generate

This command generates YAML files using the dbt-labs/codegen package. Where it finds existing model YAML files, it merges the full column lists into them. For models that are not referenced in an existing YAML file, it creates a separate model YAML file using the name provided.

Warning ⚠️
This command will only run in a valid dbt project with the dbt-labs/codegen dbt package installed.

Usage:

$ datadict generate [-D <DIRECTORY>] [-f <NAME>] [--sort] [--unique-model-yaml]

Options:

  • -D, --directory <DIRECTORY>: Directory to search for models. Default: 'models/'.
  • -f, --file <NAME>: The yaml file to store new model configurations that aren't referenced in an existing yaml file.
  • --sort: Sorts the generated YAML files alphabetically (on by default).
  • --unique-model-yaml: Creates one YAML for each model with the same name as the model.
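
To make the merge behaviour described above more concrete, here is a minimal sketch in plain Python. It is not the tool's implementation: the function name `merge_columns`, the list-of-dicts column shape, and the rule that existing entries win over generated ones are all assumptions made for illustration.

```python
from typing import Dict, List

Column = Dict[str, str]


def merge_columns(existing: List[Column], generated: List[Column], sort: bool = True) -> List[Column]:
    """Combine two column lists, keeping existing entries where names collide.

    Hypothetical helper for illustration; not part of dbt-datadict.
    """
    merged: Dict[str, Column] = {}
    # Start from the freshly generated columns (name plus empty description).
    for column in generated:
        merged[column["name"]] = dict(column)
    # Existing entries overwrite generated ones, so hand-written text is kept.
    for column in existing:
        merged[column["name"]] = dict(column)
    columns = list(merged.values())
    if sort:
        # Mirrors the idea of alphabetical sorting being on by default.
        columns.sort(key=lambda column: column["name"])
    return columns


existing = [{"name": "customer_id", "description": "Primary key for customers."}]
generated = [
    {"name": "order_id", "description": ""},
    {"name": "customer_id", "description": ""},
]
print(merge_columns(existing, generated))
```

In this sketch the new `order_id` column is added with an empty description, while `customer_id` keeps the description it already had.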

Command: apply

This command applies data dictionary updates to all model YAML files in the specified directory and its subdirectories.

Usage:

$ datadict apply [-D <DIRECTORY>] [-d <DICTIONARY>] 

Options:

  • -D, --directory <DIRECTORY>: Directory to search for fields and apply the dictionary to. Default: 'models/'.
  • -d, --dictionary <DICTIONARY>: Location of the dictionary file. Default: 'datadictionary.yml'.
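
Conceptually, applying a dictionary means pushing one agreed description per column name into every model that uses that column. The sketch below illustrates the idea with in-memory data only. The `apply_dictionary` function and the shapes of `dictionary` and `models` are hypothetical; the real layout of `datadictionary.yml` and the tool's matching rules may differ.

```python
from typing import Dict, List

# Hypothetical in-memory shapes; the real datadictionary.yml layout may differ.
dictionary: Dict[str, str] = {
    "customer_id": "Unique identifier for a customer.",
}

models: List[dict] = [
    {"name": "stg_customers", "columns": [{"name": "customer_id", "description": ""}]},
    {
        "name": "fct_orders",
        "columns": [
            {"name": "order_id", "description": ""},
            {"name": "customer_id", "description": "Old text."},
        ],
    },
]


def apply_dictionary(models: List[dict], dictionary: Dict[str, str]) -> int:
    """Overwrite descriptions of columns named in the dictionary.

    Mutates the models in place and returns how many columns changed.
    """
    changed = 0
    for model in models:
        for column in model.get("columns", []):
            description = dictionary.get(column["name"])
            # Columns absent from the dictionary are left untouched.
            if description is not None and column.get("description") != description:
                column["description"] = description
                changed += 1
    return changed


updated = apply_dictionary(models, dictionary)
print(updated)
```

The in-place mutation is deliberate here: it mirrors the fact that the command edits model YAML files directly rather than writing copies.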

⚠️ Important Note ⚠️

It is highly recommended to use this library only in a version-controlled environment, such as git. Additionally, please ensure that you have backed up your model YAML files and data dictionary before applying any updates. The application modifies files in place and does not create backups automatically.

Use this application responsibly and verify the updates before proceeding.
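
Because no backups are created automatically, one option is to copy the YAML files aside before running a command. The following is a minimal sketch using only the Python standard library. The helper `backup_yaml_files` and the directory names are assumptions for illustration and are not part of dbt-datadict.

```python
import shutil
from pathlib import Path


def backup_yaml_files(source_dir: str, backup_dir: str) -> list:
    """Copy every .yml/.yaml file under source_dir into backup_dir.

    Relative paths are preserved. Hypothetical helper, not part of dbt-datadict.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    copied = []
    # sorted() collects the file list before any copying starts.
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.suffix in {".yml", ".yaml"}:
            target = backup / path.relative_to(source)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves file timestamps
            copied.append(target)
    return copied
```

For example, `backup_yaml_files("models", "backup/models")` would mirror the YAML files under `models/` into a `backup/models/` directory kept outside the models tree. The data dictionary file would need to be copied separately if it lives elsewhere.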

Contributing

We encourage you to contribute to dbt Data Dictionary! Please check out our Contributing to dbt Data Dictionary guide for guidelines about how to proceed.

License

dbt Data Dictionary is released under the GNU General Public License v3.0. See LICENSE for details.
