AI-powered dbt helper with modern GPT models and enhanced prompts

Project description

DBT AI

An application that allows AI powered DBT development and recommendations for your DBT models.

🚀 What's New in v0.4.0

Agent-First Design: JSON output is now the default format for seamless integration with AI coding agents
Structured Output: Machine-parseable JSON responses that work with any agent or automation tool
Fast Metadata Checks: New --metadata-only flag for quick metadata coverage analysis without AI processing
Modern OpenAI API: Upgraded to the latest OpenAI API (v1.x) with support for GPT-4o and GPT-4o-mini
Enhanced Prompts: Completely rewritten prompts with better structure and clearer guidelines
Smart Model Selection: Automatically uses GPT-4o for advanced suggestions and GPT-4o-mini for basic ones
Better Error Handling: Improved error handling and fallback mechanisms
Configuration Options: Environment variables for customizing AI models and settings
Maintained Compatibility: All existing CLI commands work exactly the same

Features

Agent-Ready Output: Structured JSON output by default for seamless automation and agent integration
AI-Powered Analysis: Scans all dbt models and generates recommendations for each model
- Basic recommendations for quick improvements (default: GPT-4o-mini)
- Advanced recommendations for complex optimizations (GPT-4o)
Fast Metadata Checking: Instantly verify which models are missing documentation
Model Creation: Generate new dbt models from natural language prompts
Lineage Analysis: Understand model dependencies and data flow
Multi-Database Support: Works with Snowflake, PostgreSQL, Redshift, and BigQuery
Human-Readable Reports: Optional HTML reports for visual analysis

Installation

First time installation:

pip install dbt-ai

To upgrade to the latest version:

pip install dbt-ai --upgrade

To install a specific version:

pip install dbt-ai==<version>

Replace <version> with your desired version e.g. 0.2.0. You can view available versions in the Releases section of this repo or on our Pypi page. WARNING: This is an early phase application that may still contain bugs

Prerequisites

In order to benefit from AI features, you need your own OpenAI API Key with the initial version of this application
- Once you sign up to OpenAI you can create an API key.
- Trial version gives you a certain amount of credits allowing you to make many API calls
- Usage beyond the trial credits require billing details. API usage pricing provides more info
Ideally you already have dbt project to test this out on
Python 3.10 or greater is required

Usage

Setting up your API key is needed for all the AI features

Set up your OpenAI API key as an environment variable:

export OPENAI_API_KEY="your_openai_api_key"

Configuration

dbt-ai now supports environment variables for customizing AI model selection and behavior:

AI Model Configuration

# Basic suggestions model (default: gpt-4o-mini)
export DBT_AI_BASIC_MODEL="gpt-4o-mini"

# Advanced suggestions model (default: gpt-4o)  
export DBT_AI_ADVANCED_MODEL="gpt-4o"

# Fallback model for compatibility (default: gpt-3.5-turbo)
export DBT_AI_FALLBACK_MODEL="gpt-3.5-turbo"

API Settings

# Maximum tokens per API call (default: 4000)
export DBT_AI_MAX_TOKENS="4000"

# Temperature for AI responses (default: 0.1)
export DBT_AI_TEMPERATURE="0.1"

Recommended Model Selection

For cost-conscious users: Use gpt-3.5-turbo for both basic and advanced
For best quality: Use gpt-4o-mini for basic and gpt-4o for advanced (default)
For organizations: Consider gpt-4 models for production use

Quick Start

Basic Analysis (JSON Output)

Get structured analysis perfect for agents and automation:

dbt-ai -f path/to/dbt/project

Example with current directory:

dbt-ai -f .

This outputs machine-parseable JSON with model analysis, suggestions, and metadata coverage.

Fast Metadata Check

Quick metadata coverage check without AI processing:

dbt-ai -f . --metadata-only

Human-Readable Reports

Generate visual HTML reports for manual review:

dbt-ai -f . --output text

Advanced Usage

Output Formats

# JSON output (default) - perfect for agents
dbt-ai -f . --output json

# Text output - generates HTML report for humans
dbt-ai -f . --output text

Database Configuration

Specify your database type for optimized suggestions:

dbt-ai -f . -d snowflake    # Default
dbt-ai -f . -d postgres     # PostgreSQL
dbt-ai -f . -d redshift     # Amazon Redshift
dbt-ai -f . -d bigquery     # Google BigQuery

Advanced AI Recommendations

Request more sophisticated optimization suggestions:

dbt-ai -f . -a              # Short form
dbt-ai -f . --advanced-rec  # Long form

Combined Examples

# Advanced PostgreSQL analysis with JSON output
dbt-ai -f . -d postgres -a

# Quick metadata check only
dbt-ai -f . --metadata-only

# Full analysis with HTML report
dbt-ai -f . --output text -a

Create DBT Models from prompt (AI)

This feature lets you specify a prompt, which creates AI generated DBT model files in the models/ directory of the specified dbt project. The AI model has access to your sources.yml file, if you wish to refer to any sources in your prompt. Being specific will provide better results.

Run the application with the --create-models flag to specify the prompt you wish to use to create your DBT models

dbt-ai -f path/to/dbt/project --create-models 'your prompt goes here'

Here is an example:

dbt-ai -f . --create-models 'Write me a model that uses all the sources available in sources.yml and joins them together using the id column'

Output Formats

JSON Output (Default)

Perfect for AI agents, automation tools, and programmatic analysis:

{
  "operation": "full_analysis",
  "project_path": "./sample-dbt-project",
  "total_models": 3,
  "models": [
    {
      "name": "customer_summary",
      "has_metadata": true,
      "suggestions": "Consider adding data freshness tests...",
      "dependencies": ["raw_customers", "raw_orders"]
    }
  ],
  "metadata_coverage_percent": 66.7,
  "lineage_description": "customer_summary depends on raw_customers, raw_orders..."
}

HTML Report

Visual reports for human analysis when using --output text:

🌐 View Live Demo Report

The HTML report includes:

🤖 AI-powered improvement suggestions for each dbt model
📋 Metadata coverage analysis showing which models need documentation
🎨 Professional styling with responsive design
🔗 Model lineage information and dependencies

The demo above shows the actual output generated from the sample dbt project included in this repository.

Agent Integration

dbt-ai is designed to work seamlessly with AI coding agents like Claude Code, Cursor, and GitHub Copilot. The JSON output format allows agents to:

Analyze dbt projects programmatically without human intervention
Identify optimization opportunities across multiple models
Check metadata coverage for documentation completeness
Understand model lineage for impact analysis
Generate improvement suggestions based on current best practices

Example Agent Workflow

# Agent runs analysis
dbt-ai -f ./dbt-project --output json > analysis.json

# Agent parses results to identify issues
cat analysis.json | jq '.models[] | select(.has_metadata == false) | .name'

# Agent can then take corrective actions based on structured data

Changelog

v0.4.0 (Latest)

Agent-first design: JSON output is now the default format
Fast metadata checks: New --metadata-only flag for quick coverage analysis
Structured output: Machine-parseable JSON for seamless automation
Agent integration: Designed for seamless use with AI coding agents
Bug fixes: Resolved directory handling issues in YAML file discovery
Enhanced error handling: Better safety checks for file processing

v0.3.0

Major upgrade: Modernized OpenAI API integration (v1.x)
Enhanced AI prompts: Completely rewritten prompts with better structure and context
Smart model selection: GPT-4o for advanced suggestions, GPT-4o-mini for basic ones
Structured responses: JSON-based output parsing with Pydantic validation
Configuration options: Environment variables for model and API customization
Improved error handling: Better fallbacks and error messages
Backward compatibility: All existing commands work unchanged
Fixed tests: All test suite now passes correctly

v0.2.x (Previous)

Basic OpenAI integration with GPT-3.5-turbo
Simple prompt-based suggestions
Basic and advanced recommendation modes
HTML report generation
Metadata checking functionality

Contributing

We welcome contributions to the project! Please feel free to open issues or submit pull requests with your improvements and suggestions.

See CONTRIBUTING.md to get started and develop in this repo.

Project details

Release history Release notifications | RSS feed

This version

0.4.0

Sep 28, 2025

0.3.1

Sep 28, 2025

0.3.0

Sep 28, 2025

0.2.20

May 12, 2023

0.2.18

May 12, 2023

0.2.15

May 12, 2023

0.2.14

May 12, 2023

0.2.13

May 12, 2023

0.2.12

May 12, 2023

0.2.11

May 12, 2023

0.2.10

May 12, 2023

0.2.7

May 12, 2023

0.2.6

May 12, 2023

0.2.5

May 12, 2023

0.2.4

May 11, 2023

0.2.2

May 11, 2023

0.2.1

May 11, 2023

0.2.0

May 10, 2023

0.1.3a0 pre-release

May 7, 2023

0.1.2a0 pre-release

May 7, 2023

0.1.1a0 pre-release

May 7, 2023

0.1.0a0 pre-release

May 7, 2023

0.0.10a0 pre-release

May 7, 2023

0.0.9a0 pre-release

May 6, 2023

0.0.8a0 pre-release

May 6, 2023

0.0.7a0 pre-release yanked

May 6, 2023