
Automated documentation generator for dbt projects using Google Gemini AI


DBT Autodoc Documentation

dbt-autodoc automates documentation and logging for your dbt projects. It combines Google Gemini AI with a database-backed log so that your documentation stays up to date, accurate, and auditable.

🌟 Why dbt-autodoc?

  • 🤖 Automatic AI Documentation: Generate comprehensive descriptions for your tables and columns automatically.
  • 💾 Database Logging & History: Every description is stored in a database (DuckDB or Postgres), which acts as the source of truth and keeps a full history of changes.
  • 🔄 Full Synchronization: Seamlessly integrates with dbt-osmosis to keep your YAML files in sync with your SQL models.
  • 🔒 Protect Manual Work: Respects human-written documentation. If you write it, we lock it.
  • 👥 Team Ready: Use Postgres to share the documentation cache across your entire team.

🛠️ Setup

  1. Install:

    pip install dbt-autodoc
    
  2. Configuration: Run dbt-autodoc to generate dbt-autodoc.yml. Important: Edit company_context in this file to give the AI knowledge about your business logic.

  3. Environment Variables:

    GEMINI_API_KEY=your_api_key_here
    # Optional: Postgres connection string for a shared cache
    POSTGRES_URL=postgresql://user:pass@host:port/db
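
A small preflight check can catch a missing key before any AI call is made. The sketch below is not part of dbt-autodoc: check_autodoc_env is a hypothetical helper name, and the fallback to DuckDB when POSTGRES_URL is unset is an assumption based on POSTGRES_URL being marked optional above.

```shell
# Preflight check for the environment variables listed above.
# check_autodoc_env is a hypothetical helper, not a dbt-autodoc command.
check_autodoc_env() {
    # The Gemini key is the only variable the README lists without "optional".
    if [ -z "${GEMINI_API_KEY:-}" ]; then
        echo "GEMINI_API_KEY is not set" >&2
        return 1
    fi
    # Assumption: without POSTGRES_URL the tool uses a local DuckDB cache.
    if [ -n "${POSTGRES_URL:-}" ]; then
        echo "cache backend: postgres"
    else
        echo "cache backend: duckdb (assumed default)"
    fi
}
```

Call check_autodoc_env before the first dbt-autodoc run in a session; it returns a non-zero status when the key is missing, so it can gate later commands with &&.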
    

📋 Recommended Workflow

Follow this flow to document your project layer by layer:

  1. Update Database: First, ensure your dbt models are compiled and the database is up to date.

    dbt run
    
  2. Configure Context: Update dbt-autodoc.yml with a rich company_context. This description feeds the AI and is critical for quality output.

  3. Document Staging Layer: Target your staging models first. This generates base descriptions that can be inherited downstream.

    dbt-autodoc --generate-docs-ai --model-path models/staging
    
  4. Document Intermediate/Marts Layers: Move to the next layer. The tool will generate descriptions for new columns and models.

    dbt-autodoc --generate-docs-ai --model-path models/intermediate
    

    Repeat for models/marts, etc.

This layered approach ensures that fundamental definitions are established early and propagated correctly.
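
The layer-by-layer flow above can be wrapped in a short loop. This is a minimal sketch that uses only the --generate-docs-ai and --model-path flags shown in this README. The function name document_layers and the AUTODOC_CMD variable are hypothetical: AUTODOC_CMD is not a dbt-autodoc feature, it only lets you dry-run the loop with echo.

```shell
# Document each layer in the order given, stopping at the first failure so a
# downstream layer is never documented before its upstream layer succeeded.
# AUTODOC_CMD is a hypothetical override for dry runs (e.g. AUTODOC_CMD=echo).
document_layers() {
    for layer in "$@"; do
        "${AUTODOC_CMD:-dbt-autodoc}" --generate-docs-ai --model-path "models/$layer" || return 1
    done
}
```

For the layout used in this README, the call would be document_layers staging intermediate marts, run after dbt run has finished. Adjust the layer names to match your own models directory.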

🧠 How the AI Works

When generating a description for a column or table, the AI considers multiple inputs to produce the most accurate result:

  1. Company Context: The high-level business logic defined in your config.
  2. Model SQL: The actual code of the model being documented.
  3. Existing Descriptions: Any existing documentation or comments in the file.
  4. Upstream Logic: Context from upstream models, picked up implicitly through Osmosis inheritance.

It synthesizes all these inputs to write a concise, technical description.

📖 Arguments Reference

Argument                     Description
--generate-docs-ai           🔥 Full Auto. Runs the complete workflow: SQL generation, Osmosis sync, and YAML generation using AI.
--generate-docs              🔄 Full Sync. Runs the complete workflow using only the database cache (no AI).
--model-path                 Restrict processing to a specific directory (e.g. models/staging).
--generate-docs-config-ai    Generate table descriptions in .sql files using AI.
--generate-docs-yml-ai       Generate column descriptions in .yml files using AI.
--generate-docs-config       Sync .sql files from cache (no AI).
--generate-docs-yml          Sync .yml files from cache (no AI).
--cleanup-db                 Reset Database. Wipes the description cache and history.
--concurrency                Max threads for AI/DB requests (default: 10).
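
As an illustration of how the single-step flags might be combined, the sketch below runs the table step and then the column step for one directory. Several things here are assumptions rather than documented behavior: that --model-path can be combined with the single-step flags, that --concurrency takes its value as a separate argument, and that the single-step flags do not run the Osmosis sync themselves. The names document_in_steps and AUTODOC_CMD are hypothetical.

```shell
# Run the two AI steps separately for one directory: table descriptions in
# .sql files first, then column descriptions in .yml files.
# Assumptions: --model-path works with these flags and --concurrency takes
# its value as the next argument. AUTODOC_CMD is a hypothetical dry-run hook.
document_in_steps() {
    model_path=$1
    threads=${2:-10}   # 10 is the default listed in the table above
    "${AUTODOC_CMD:-dbt-autodoc}" --generate-docs-config-ai --model-path "$model_path" --concurrency "$threads" &&
    "${AUTODOC_CMD:-dbt-autodoc}" --generate-docs-yml-ai --model-path "$model_path" --concurrency "$threads"
}
```

For example, document_in_steps models/staging 4 would limit both steps to four threads. If you want the full workflow including the Osmosis sync, --generate-docs-ai is the documented single command for that.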

📄 License

MIT License - see LICENSE for details.

🙏 Attribution

Brought to you by JustDataPlease.

