CLI to auto-generate dbt docs using LLMs (Ollama-first).
dbt-llm-docs
A CLI tool that generates LLM-powered documentation for dbt models and columns and writes it directly into your schema.yml, so the results appear in dbt docs serve.
This CLI uses Jinja2 prompt templates plus a pluggable LLM backend (Ollama or OpenAI).
Optionally, it can connect to your actual warehouse (Postgres & Redshift today) and profile real data to give the LLM deeper context for column descriptions.
🚀 Features
✅ 1. LLM-generated model + column documentation
Produces rich, clear Markdown text suitable for dbt docs.
Descriptions are written directly into schema.yml.
✅ 2. Customizable Jinja2 prompt templates
Located in <project>/prompts/.
You can fully customize the writing style, voice, or structure.
✅ 3. dbt-aware selection
Supports:
- --select
- --exclude
- --tags
- Glob-like patterns (stg_*, marts.*)
- Parent/child expansion (+model_name)
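The selection rules above can be sketched roughly as follows. This is an illustration only, not the tool's actual selector: the `select_models` function, the `MODELS` and `PARENTS` dictionaries, and the reading of a leading `+` as "include all upstream parents" (dbt's own graph-operator convention) are assumptions made for the example.

```python
from fnmatch import fnmatchcase


def select_models(models, parent_map, select=(), exclude=(), tags=()):
    """Return sorted model names that satisfy the selection arguments.

    models:     hypothetical dict of model name -> set of tags
    parent_map: hypothetical dict of model name -> list of direct parents
    """

    def expand(pattern):
        # Assumption: a leading "+" means "also include all upstream parents".
        with_parents = pattern.startswith("+")
        if with_parents:
            pattern = pattern[1:]
        matched = {name for name in models if fnmatchcase(name, pattern)}
        if with_parents:
            # Walk upstream until no new ancestors appear.
            stack = list(matched)
            while stack:
                for parent in parent_map.get(stack.pop(), []):
                    if parent in models and parent not in matched:
                        matched.add(parent)
                        stack.append(parent)
        return matched

    def expand_all(patterns):
        result = set()
        for pattern in patterns:
            result |= expand(pattern)
        return result

    # No --select means "start from every model".
    chosen = expand_all(select) if select else set(models)
    chosen -= expand_all(exclude)
    if tags:
        # Keep models that carry at least one of the requested tags.
        chosen = {name for name in chosen if models[name] & set(tags)}
    return sorted(chosen)


# Hypothetical project used only for illustration.
MODELS = {
    "stg_orders": {"staging"},
    "stg_customers": {"staging"},
    "dim_customers": {"marts", "core"},
    "fct_orders": {"marts"},
}
PARENTS = {
    "dim_customers": ["stg_customers"],
    "fct_orders": ["stg_orders", "dim_customers"],
}

print(select_models(MODELS, PARENTS, select=["+fct_orders"], exclude=["stg_*"]))
# prints ['dim_customers', 'fct_orders']
```

Exclusions are applied after the parent expansion, so a pattern such as `stg_*` can prune upstream staging models that `+fct_orders` pulled in.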
✅ 4. Data-aware documentation (--use-data Y)
When enabled, the tool:
- Reads database connection info from profiles.yml
- Connects to the warehouse (Postgres & Redshift supported today)
- Executes the model's compiled SQL
- Samples rows and computes:
  - Missing %
  - Unique %
  - Min / Max
  - Mean / Std
  - Example values
- Passes these stats to the LLM for smarter, context-rich documentation
- Appends a Markdown statistics table under each column description in dbt Docs
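To make the profiling step concrete, here is a minimal sketch of how the listed statistics could be computed for one column of sampled values and rendered as a Markdown table. It is not the tool's implementation: `profile_column`, `stats_table`, the rounding, and the choice to compute Unique % over non-missing values are all assumptions.

```python
import statistics


def profile_column(values, max_examples=3):
    """Summarize one column from sampled values (None marks a missing value)."""
    total = len(values)
    present = [v for v in values if v is not None]
    stats = {
        "missing_pct": round(100 * (total - len(present)) / total, 1) if total else 0.0,
        # Assumption: uniqueness is measured over non-missing values only.
        "unique_pct": round(100 * len(set(present)) / len(present), 1) if present else 0.0,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": None,
        "std": None,
        # First distinct values, in the order they were sampled.
        "examples": list(dict.fromkeys(present))[:max_examples],
    }
    numeric = [v for v in present if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric and len(numeric) == len(present):
        stats["mean"] = round(statistics.fmean(numeric), 2)
        # Sample standard deviation needs at least two values.
        stats["std"] = round(statistics.stdev(numeric), 2) if len(numeric) > 1 else 0.0
    return stats


def stats_table(stats):
    """Render the profile as a two-column Markdown table."""
    rows = [
        ("Missing %", stats["missing_pct"]),
        ("Unique %", stats["unique_pct"]),
        ("Min / Max", f"{stats['min']} / {stats['max']}"),
        ("Mean / Std", f"{stats['mean']} / {stats['std']}"),
        ("Example values", ", ".join(map(str, stats["examples"]))),
    ]
    lines = ["| Statistic | Value |", "|---|---|"]
    lines += [f"| {label} | {value} |" for label, value in rows]
    return "\n".join(lines)


sample = [10, 20, 20, None]  # hypothetical sampled column
profile = profile_column(sample)
print(stats_table(profile))
```

Note that the example values are taken verbatim from the sampled rows, so a profile like this carries a few real cell values along with the aggregates.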
🛠️ Support for more databases (Snowflake, BigQuery, Databricks) is coming soon.
🔒 Data Privacy Note
If --use-data Y is enabled, the profile summary (aggregate statistics plus a few example values, NOT the raw rows) is sent to the selected LLM backend.
If your organization forbids sending data outside the network, you should use:
dbt-llm-docs llm-docs-generate --backend ollama
Ollama runs locally, so prompts and profile data never leave your machine (or your network, if OLLAMA_HOST points to another host on it).
🧱 Architecture Overview
flowchart LR
subgraph DBT["dbt project"]
DbtModels["dbt models (*.sql)"]
SchemaYml["schema.yml (descriptions)"]
DbtProjectYml["dbt_project.yml"]
end
subgraph Target["target/ directory"]
Manifest["manifest.json"]
Catalog["catalog.json (optional)"]
end
subgraph Profiles["profiles.yml"]
ProfileDev["dev target (Postgres / Redshift)"]
end
subgraph CLI["dbt-llm-docs CLI"]
Typer["Typer CLI (init, list, generate)"]
Prompts["Jinja templates (model.md.j2, column.md.j2)"]
Selector["Model selector (--select / --exclude / --tags)"]
Profiler["Optional data profiler (--use-data Y)"]
Writer["Writes descriptions to schema.yml"]
end
subgraph LLMBackends["LLM Backends"]
Ollama["Ollama (local)"]
OpenAI["OpenAI / compatible (cloud)"]
end
subgraph Warehouse["Data Warehouse"]
DB["Postgres / Redshift"]
end
DbtModels --> Target
DbtProjectYml --> Profiles
Target --> CLI
Catalog --> CLI
Profiles --> Profiler
DB --> Profiler
Prompts --> Typer
Typer --> Selector
Selector --> LLMBackends
Profiler --> LLMBackends
LLMBackends --> Writer
Writer --> SchemaYml
SchemaYml --> DocsUI["dbt docs UI"]
⚠️ Important: Requires a Compiled dbt Project
dbt-llm-docs depends on dbt's generated artifacts.
Before running this tool, your dbt project must be compiled and the following files must exist in your target/ directory:
- manifest.json (required)
- catalog.json (optional but recommended for accurate column types)
Generate them using:
dbt docs generate
If these artifacts are missing, the tool cannot discover models, columns, SQL, or metadata needed for documentation.
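As a rough illustration of what reading these artifacts involves, the sketch below checks for manifest.json, notes whether catalog.json is present, and lists each model with its documented columns. The `load_models` helper is hypothetical and not part of the tool; the `nodes`, `resource_type`, `name`, and `columns` keys are the ones dbt writes into manifest.json.

```python
import json
from pathlib import Path


def load_models(target_dir):
    """Return ({model name: [column names]}, catalog_present) for a dbt target dir."""
    target = Path(target_dir)
    manifest_path = target / "manifest.json"
    if not manifest_path.exists():
        # Without the manifest there is nothing to discover.
        raise FileNotFoundError(
            f"{manifest_path} not found; run `dbt docs generate` first"
        )
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    # catalog.json is optional; it only adds warehouse column types.
    has_catalog = (target / "catalog.json").exists()
    models = {
        node["name"]: sorted(node.get("columns", {}))
        for node in manifest.get("nodes", {}).values()
        if node.get("resource_type") == "model"
    }
    return models, has_catalog
```

Calling `load_models("target")` from the project root would either raise with a hint to run dbt docs generate or return the models found in the manifest.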
🤖 Installing Ollama (Recommended for Privacy)
macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh
Run Ollama:
ollama serve
Download a model:
ollama pull llama3.1
Windows (WSL recommended)
Refer to: https://ollama.com/download
📦 Installation from PyPI
pip install dbt-tools
📦 Installation from source
git clone <your-repo>
cd dbt-tools
python -m venv .venv
source .venv/bin/activate
pip install -e .
Requires:
- manifest.json (run dbt docs generate)
- Optionally, catalog.json for column types
⚙️ Environment Variables
To avoid passing arguments repeatedly, you can set environment variables:
# Ollama (local)
export OLLAMA_HOST="http://ubuntu-pc.local:11434"
export OLLAMA_MODEL="llama3.1:8b-instruct-q8_0"
export TEMPERATURE=0.2
# (Future) OpenAI or compatible APIs
export OPENAI_BASE_URL="https://api.openai.com/v1"
export OPENAI_MODEL="gpt-4o-mini"
export OPENAI_API_KEY="sk-..."
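For orientation, here is a hedged sketch of how the Ollama variables above could be turned into a request against Ollama's `/api/generate` endpoint. How the tool itself reads these variables is not documented here, so the helper names and the fallback defaults (`http://localhost:11434`, `llama3.1`, `0.2`) are assumptions.

```python
import json
import os
import urllib.request


def ollama_settings(env=None):
    """Read Ollama settings from the environment, with assumed fallbacks."""
    env = os.environ if env is None else env
    return {
        "host": env.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/"),
        "model": env.get("OLLAMA_MODEL", "llama3.1"),
        "temperature": float(env.get("TEMPERATURE", "0.2")),
    }


def build_generate_request(prompt, settings):
    """Build (but do not send) a non-streaming POST to /api/generate."""
    payload = {
        "model": settings["model"],
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": settings["temperature"]},
    }
    return urllib.request.Request(
        f"{settings['host']}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, settings):
    """Send the request; needs a reachable Ollama server."""
    with urllib.request.urlopen(build_generate_request(prompt, settings)) as resp:
        return json.load(resp)["response"]
```

With a server running, `generate("Describe the column customer_id.", ollama_settings())` would return the model's text; building the request separately keeps the configuration logic checkable without network access.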
🔧 Usage
Initialize templates (creates the prompt templates, which you can then customize)
dbt-tools init --project-dir .
List models
dbt-tools list --project-dir . --target-dir target
Generate documentation (local LLM)
Ollama is the default backend.
dbt-tools llm-docs-generate --project-dir . --target-dir target --select dim_customers
Generate documentation with real data profiling
dbt-tools llm-docs-generate --project-dir . --target-dir target --select dim_customers --use-data Y
Generate documentation (OpenAI)
dbt-tools llm-docs-generate --project-dir . --target-dir target --select dim_customers --backend openai
🛣️ Roadmap
- More warehouse support (Snowflake, BigQuery, Databricks)
- LLM caching
- Partial regeneration
- Inline docs (docs/*.md) generation
- Lineage-aware descriptions
📄 License
MIT (or your preferred license)