dbt-schemify

Generate and update dbt schema.yml files automatically from a template, your dbt manifest, and live database columns.

How it works

Three sources are merged in priority order (highest first):

  1. Existing schema.yml — values already there are never overwritten
  2. manifest.json — fills fields marked with the schemify sentinel
  3. .schemify.yml template — defines which fields to include and their static defaults

The sentinel value schemify in the template means "auto-populate this field from the manifest or database".
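
The priority order above can be sketched in a few lines of Python. This is an illustration only, not dbt-schemify's actual code: the function name `merge_model` and the plain-dict inputs are hypothetical, and the sketch assumes that keys already present in schema.yml are carried over even when the template does not mention them.

```python
SENTINEL = "schemify"  # sentinel value used in the .schemify.yml template


def merge_model(existing: dict, manifest: dict, template: dict) -> dict:
    """Merge one model entry with priority existing > manifest > template.

    Hypothetical helper: all three arguments are plain dicts for one model.
    """
    result = dict(existing)  # values already in schema.yml are never overwritten
    for key, template_value in template.items():
        if isinstance(template_value, dict):
            # Recurse into nested sections such as `meta` or `config`.
            result[key] = merge_model(
                existing.get(key) or {}, manifest.get(key) or {}, template_value
            )
        elif key not in existing:
            if template_value == SENTINEL:
                # Sentinel: auto-populate from the manifest (empty if absent).
                result[key] = manifest.get(key, "")
            else:
                # Anything else in the template is a static default.
                result[key] = template_value
    return result


template = {"name": "schemify", "description": "schemify", "meta": {"owner": "analytics"}}
manifest = {"name": "orders", "description": "One row per order"}
existing = {"name": "orders", "meta": {"owner": "finance"}}

merged = merge_model(existing, manifest, template)
print(merged)
```

Here `meta.owner` stays `finance` because it already exists, while `description` is taken from the manifest because the template marks it with the sentinel.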

Installation

pip install dbt-schemify

With your database adapter:

pip install "dbt-schemify[snowflake]"
pip install "dbt-schemify[postgres]"
pip install "dbt-schemify[bigquery]"
pip install "dbt-schemify[duckdb]"

Quick start

1. Compile your dbt project to get a manifest:

dbt compile

2. Initialise schemify (first time only — creates config and template):

schemify --init

This creates two files in your project root:

  • .schemify-config.yml — default options for every run (each, no-db, paths, profile…)
  • .schemify.yml — template that controls which fields appear in generated schema files

Edit both to match your project, then run:

schemify

schemify reads .schemify.yml, .schemify-config.yml, target/manifest.json, and ~/.dbt/profiles.yml automatically, then writes a schema.yml next to each model's SQL file (one file per folder).
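
To make the per-folder grouping concrete, here is a minimal sketch of how output paths could be derived from a manifest. It relies on the `resource_type`, `name`, and `original_file_path` keys that dbt writes for each node in manifest.json; the function name `group_models_by_folder` is hypothetical and this is not necessarily how schemify implements it.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_models_by_folder(manifest: dict) -> dict:
    """Map each target schema.yml path to the models it would describe."""
    groups = defaultdict(list)
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, ...
        folder = PurePosixPath(node["original_file_path"]).parent
        groups[str(folder / "schema.yml")].append(node["name"])
    return {path: sorted(names) for path, names in groups.items()}


# Tiny hand-written stand-in for target/manifest.json
example_manifest = {
    "nodes": {
        "model.shop.orders": {
            "resource_type": "model",
            "name": "orders",
            "original_file_path": "models/marts/orders.sql",
        },
        "model.shop.customers": {
            "resource_type": "model",
            "name": "customers",
            "original_file_path": "models/marts/customers.sql",
        },
        "seed.shop.countries": {
            "resource_type": "seed",
            "name": "countries",
            "original_file_path": "seeds/countries.csv",
        },
    }
}

print(group_models_by_folder(example_manifest))
```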

Usage

schemify [options]

Setup:
  --init                 Create .schemify-config.yml and .schemify.yml, then exit.
                         Run this once before using schemify for the first time.

Schema generation:
  --schema PATH          Write all models into a single schema.yml at PATH.
                         If omitted, a schema.yml is created next to each model's SQL file.
  --manifest PATH        Path to manifest.json
                         Default: <project-dir>/target/manifest.json
  --template PATH        Path to .schemify.yml template
                         Default: <project-dir>/.schemify.yml
  --project-dir DIR      dbt project root. Default: current directory
  --profile NAME         dbt profile name. Default: read from dbt_project.yml
  --target NAME          dbt target (e.g. dev, prod). Default: profile default
  --profiles-dir DIR     Directory containing profiles.yml. Default: ~/.dbt/
  -s / --select          Filter models by name or tag. Space-separated.
                         Examples: -s orders   -s tag:marketing   -s tag:finance orders
  --each                 Write one <model_name>.yml per model instead of one schema.yml per folder
  --no-db                Skip database connection; no column fetching
  -y / --yes             Skip confirmation prompts (useful for CI)

Diagnostics:
  --info                 Show resolved paths and configuration, then exit
  --debug-db             Show DB connection config (password masked) and test the connection, then exit

The dbt-schemify command also works as an alias for schemify, for backward compatibility.
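
As a rough illustration of the `-s / --select` option, the sketch below filters a list of models by bare name or `tag:<tag>`. It assumes that several selectors combine as a union (a model matches if any selector matches), which the examples suggest but the documentation does not state outright. The function name `select_models` is hypothetical.

```python
def select_models(models: list, selectors: list) -> list:
    """Return the models matching any selector (bare name or tag:<tag>)."""
    if not selectors:
        return list(models)  # no filter: keep everything
    names = {s for s in selectors if not s.startswith("tag:")}
    tags = {s[len("tag:"):] for s in selectors if s.startswith("tag:")}
    return [
        m for m in models
        if m["name"] in names or tags & set(m.get("tags", []))
    ]


models = [
    {"name": "orders", "tags": ["sales"]},
    {"name": "invoices", "tags": ["finance"]},
    {"name": "customers", "tags": []},
]

# Equivalent in spirit to: schemify -s tag:finance orders
selected = select_models(models, ["tag:finance", "orders"])
print([m["name"] for m in selected])
```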

Configuration file

.schemify-config.yml (created by schemify --init) lets you set default options so you don't have to repeat them on every run. CLI arguments always override config values.

# Output mode
each: false          # write one <model>.yml per model instead of schema.yml per folder
no_db: false         # skip database connection; no column fetching

# Paths ('default' = auto-resolved)
manifest: default    # manifest.json path;      auto: <project-dir>/target/manifest.json
template: default    # .schemify.yml path;       auto: <project-dir>/.schemify.yml
profiles_dir: default  # profiles.yml directory; auto: ~/.dbt/

# dbt connection ('default' = auto-resolved)
profile: default     # dbt profile name;  auto: read from dbt_project.yml
target: default      # dbt target name;   auto: profile default
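
The precedence described above (CLI argument, then config value, then the auto-resolved default) can be sketched as follows. The helper names `resolve_option` and `resolve_paths` are hypothetical and only the three path options are covered; treat it as a sketch of the rule, not as schemify's implementation.

```python
from pathlib import Path


def resolve_option(cli_value, config_value, auto_value):
    """Pick a value: CLI argument > config file > auto-resolved default."""
    if cli_value is not None:
        return cli_value  # CLI arguments always override config values
    if config_value is not None and config_value != "default":
        return config_value  # explicit value from .schemify-config.yml
    return auto_value  # 'default' (or a missing key) means auto-resolve


def resolve_paths(cli: dict, config: dict, project_dir: str = ".") -> dict:
    """Resolve the path options using the documented auto defaults."""
    root = Path(project_dir)
    auto = {
        "manifest": str(root / "target" / "manifest.json"),
        "template": str(root / ".schemify.yml"),
        "profiles_dir": str(Path.home() / ".dbt"),
    }
    return {key: resolve_option(cli.get(key), config.get(key), auto[key]) for key in auto}


config = {"manifest": "default", "template": "templates/schemify.yml"}
cli = {"manifest": "build/manifest.json"}
print(resolve_paths(cli, config, project_dir="my_project"))
```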

Examples

# First-time setup: create config and template files
schemify --init

# Auto-discover: write schema.yml next to every model's SQL file
# (asks for confirmation before writing)
schemify

# Check which paths schemify is using
schemify --info

# Check DB connection (shows masked config + runs a test query)
schemify --debug-db

# Only models with a specific tag (one schema.yml per directory)
schemify -s tag:marketing

# Only specific models by name
schemify -s orders customers

# Single model — automatically gets its own <model_name>.yml
schemify -s orders

# Mix names and tags
schemify -s tag:finance orders

# All matching models into one explicit file
schemify --schema models/marketing/schema.yml -s tag:marketing

# One file per model named after the model (e.g. orders.yml, customers.yml)
schemify --each

# Without DB connection (manifest data only)
schemify --no-db

# Skip confirmation (e.g. in CI)
schemify --yes

# Custom paths
schemify \
  --manifest target/manifest.json \
  --template .schemify.yml

Confirmation and conflict detection

When you run plain schemify (no --select or --schema), schemify lists all schema files it is about to create or update and asks for confirmation before proceeding. Pass -y / --yes to skip this in CI.

If schemify detects that your current output mode conflicts with existing files (e.g. you switch from per-folder schema.yml to per-model files, or vice versa), it warns you and asks whether to continue.
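
One way to picture the conflict check is to compare the YAML files already present in a model folder with the files the current output mode would write. The sketch below is a guess at the idea rather than schemify's actual logic, and the function name `find_mode_conflicts` is hypothetical.

```python
def find_mode_conflicts(existing_files: list, model_names: list, each: bool) -> list:
    """Return existing YAML files that belong to the other output mode.

    existing_files: YAML file names already present in one model folder.
    each: True for per-model files, False for one schema.yml per folder.
    """
    existing = set(existing_files)
    if each:
        # Per-model mode, but a per-folder schema.yml is already there.
        return sorted(existing & {"schema.yml"})
    # Per-folder mode, but per-model files are already there.
    per_model = {f"{name}.yml" for name in model_names}
    return sorted(existing & per_model)


conflicts = find_mode_conflicts(
    ["orders.yml", "customers.yml", "notes.yml"],
    ["orders", "customers"],
    each=False,
)
print(conflicts)
```

A non-empty result would be the cue to warn and ask whether to continue.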

Default template

.schemify.yml (created by schemify --init) controls which fields appear in generated schemas:

version: '1.0'
models:
  - name: schemify
    description: schemify      # filled from manifest
    meta:
      owner: analytics         # static default applied to all models
    config:
      enabled: true            # static default
    columns:
      - name: schemify
        data_type: schemify    # filled from DB
        description: schemify  # left empty for humans to fill in
        meta:
          gdpr_tags: schemify

Merge rules

Situation                                         Result
Field exists in schema.yml                        Kept as-is
Field is schemify sentinel + value in manifest    Filled from manifest
Field is schemify sentinel + column data_type     Filled from DB
Field has a static value in template              Used as default
Field not in template                             Not included in output
Column in DB but not in existing schema           Added using template column structure
Column in existing schema but not in DB           Preserved
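
The column-level rules in the table can be sketched as below. This is a simplified illustration under stated assumptions: the function name `merge_columns` is hypothetical, the column template is flat (no nested `meta`), existing columns are left completely untouched, and sentinel fields with no source value are written as empty strings.

```python
SENTINEL = "schemify"


def merge_columns(existing_cols: list, db_cols: dict, column_template: dict) -> list:
    """Merge columns from schema.yml with columns found in the database.

    existing_cols: column dicts from the existing schema.yml.
    db_cols: mapping of column name -> data type reported by the database.
    """
    merged = [dict(col) for col in existing_cols]  # preserved, even if gone from the DB
    seen = {col["name"] for col in existing_cols}
    for name, data_type in db_cols.items():
        if name in seen:
            continue  # already documented: keep as-is
        new_col = {}
        for key, value in column_template.items():
            if key == "name":
                new_col[key] = name
            elif key == "data_type" and value == SENTINEL:
                new_col[key] = data_type  # filled from DB
            elif value == SENTINEL:
                new_col[key] = ""  # nothing to fill from; left for humans
            else:
                new_col[key] = value  # static default from the template
        merged.append(new_col)
    return merged


existing_cols = [{"name": "id", "description": "Primary key"}, {"name": "legacy_flag"}]
db_cols = {"id": "integer", "amount": "numeric"}
column_template = {"name": "schemify", "data_type": "schemify", "description": "schemify"}

columns = merge_columns(existing_cols, db_cols, column_template)
print(columns)
```

In this example `legacy_flag` survives although the database no longer has it, and `amount` is added using the template's column structure.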
