Automated documentation generator for dbt projects using Google Gemini AI

DBT Autodoc Documentation

dbt-autodoc automates documentation and change logging for your dbt projects. It combines Google Gemini AI with a database-backed log so that your documentation stays up to date, accurate, and auditable.

🌟 Why dbt-autodoc?

  • 🤖 Automatic AI Documentation: Generate comprehensive descriptions for your tables and columns automatically.
  • 💾 Database Logging & History: Every description is stored in a database (DuckDB or Postgres). The database acts as the source of truth and keeps a full history of changes.
  • 🔄 Full Synchronization: Seamlessly integrates with dbt-osmosis to keep your YAML files in sync with your SQL models.
  • 🔒 Protect Manual Work: Respects human-written documentation. If you write it, we lock it.
  • 👥 Team Ready: Use Postgres to share the documentation cache across your entire team.

🛠️ Setup

  1. Install:

    pip install dbt-autodoc
    
  2. Configuration: Run dbt-autodoc to generate dbt-autodoc.yml. Important: edit company_context in this file so the AI has context about your business logic.

  3. Environment Variables:

    GEMINI_API_KEY=your_api_key_here
    # Optional: use Postgres to share the cache across a team
    POSTGRES_URL=postgresql://user:pass@host:port/db
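A pre-flight check can catch a missing variable before any AI call is made. The following is a minimal sketch, not part of dbt-autodoc: the function name pick_backend is hypothetical, and the fallback to DuckDB when POSTGRES_URL is unset is an assumption based on POSTGRES_URL being marked optional above.

```python
import os


def pick_backend(env=None, use_ai=True):
    """Return "postgres" or "duckdb" based on the environment.

    Hypothetical helper: the DuckDB fallback is an assumption,
    inferred from POSTGRES_URL being optional.
    """
    env = os.environ if env is None else env
    # AI modes need the Gemini key; cache-only modes may not.
    if use_ai and not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is not set")
    return "postgres" if env.get("POSTGRES_URL") else "duckdb"
```

Calling pick_backend() with no arguments reads the real process environment; passing a dict makes the check easy to exercise in isolation.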
    

🚀 Quick Start

The easiest way to document your entire project in one go:

1. Automatic Mode (Recommended)

Generates table descriptions, syncs columns, and generates column descriptions using AI.

dbt-autodoc --generate-docs-ai

2. Manual Mode (No AI)

Syncs your project structure and restores descriptions from the database cache without calling AI.

dbt-autodoc --generate-docs

📋 Detailed Workflow

If you prefer granular control, you can run steps individually:

  1. Generate Table Descriptions:

    dbt-autodoc --generate-docs-config-ai
    
  2. Generate Column Descriptions:

    dbt-autodoc --generate-docs-yml-ai
    
  3. Sync Structure Only (No AI):

    dbt-autodoc --generate-docs-yml
    

📖 Arguments Reference

Argument                     Description
--generate-docs-ai           🔥 Full Auto. Runs the complete workflow: SQL generation, Osmosis sync, and YAML generation using AI.
--generate-docs              🔄 Full Sync. Runs the complete workflow using only the database cache (no AI).
--model-path                 Restrict processing to a specific directory (e.g. models/staging).
--generate-docs-config-ai    Generate table descriptions in .sql files using AI.
--generate-docs-yml-ai       Generate column descriptions in .yml files using AI.
--generate-docs-config       Sync .sql files from cache (no AI).
--generate-docs-yml          Sync .yml files from cache (no AI).
--cleanup-db                 Reset Database. Wipes the description cache and history.
--concurrency                Max threads for AI/DB requests (default: 10).
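For scripted runs (for example in CI), the documented flags can be assembled programmatically. This is a rough sketch only: build_commands and run_all are hypothetical helpers, and it assumes that --model-path and --concurrency can be combined with the per-step flags, which the reference above does not state explicitly.

```python
import shlex
import subprocess

# Granular AI steps from the detailed workflow, in order.
AI_STEPS = ("--generate-docs-config-ai", "--generate-docs-yml-ai")


def build_commands(model_path=None, concurrency=None, steps=AI_STEPS):
    """Build one argv list per step (assumes the flags can be combined)."""
    commands = []
    for step in steps:
        cmd = ["dbt-autodoc", step]
        if model_path is not None:
            cmd += ["--model-path", model_path]
        if concurrency is not None:
            cmd += ["--concurrency", str(concurrency)]
        commands.append(cmd)
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)
```

Running run_all(build_commands(model_path="models/staging", concurrency=4)) only prints the commands; pass dry_run=False to execute them.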

🧠 Smart Caching System

dbt-autodoc isn't just a wrapper around Gemini. It keeps its own state in two tables:

  1. doc_cache Table: Stores the current active description.
  2. doc_cache_log Table: Logs every change to a description, including who made it and when.

This means that if you accidentally delete a description from your code, dbt-autodoc --generate-docs will restore it from the database.
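The two-table idea can be illustrated with a small, self-contained sketch. Only the table names doc_cache and doc_cache_log come from the description above; the column names, the helper functions, and the use of in-memory SQLite (standing in for DuckDB or Postgres) are assumptions made for illustration and do not reflect the actual dbt-autodoc schema.

```python
import sqlite3


def open_store():
    """Create an in-memory store with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE doc_cache (
            object_key  TEXT PRIMARY KEY,
            description TEXT NOT NULL,
            author      TEXT NOT NULL
        );
        CREATE TABLE doc_cache_log (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            object_key  TEXT NOT NULL,
            description TEXT NOT NULL,
            author      TEXT NOT NULL,
            changed_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        """
    )
    return conn


def save_description(conn, object_key, description, author):
    """Overwrite the active description and append a log entry."""
    with conn:  # both writes commit together or not at all
        conn.execute(
            "INSERT OR REPLACE INTO doc_cache VALUES (?, ?, ?)",
            (object_key, description, author),
        )
        conn.execute(
            "INSERT INTO doc_cache_log (object_key, description, author) "
            "VALUES (?, ?, ?)",
            (object_key, description, author),
        )


def current_description(conn, object_key):
    """Return the active description, or None if nothing is cached."""
    row = conn.execute(
        "SELECT description FROM doc_cache WHERE object_key = ?",
        (object_key,),
    ).fetchone()
    return row[0] if row else None


def history(conn, object_key):
    """Return (description, author) pairs in the order they were saved."""
    return conn.execute(
        "SELECT description, author FROM doc_cache_log "
        "WHERE object_key = ? ORDER BY id",
        (object_key,),
    ).fetchall()
```

In this sketch, restoring a description deleted from a YAML file amounts to calling current_description for the affected key and writing the result back, while history shows who changed it along the way.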

📄 License

MIT License - see LICENSE for details.

🙏 Attribution

Brought to you by JustDataPlease.
