Catch semantic breaking changes in dbt metrics before they land in production.

Maintainers

yeaight7

dbt-semguard

dbt-semguard is a CLI-first semantic change detector for dbt Semantic Layer definitions. It compares two versions of the semantic contract, classifies changes as breaking, risky, or safe, and renders local or GitHub-friendly output without requiring warehouse access or dbt runtime internals.

What Is This For?

dbt-semguard is a semantic PR guard for dbt metrics and semantic models.

It answers one question:

What changed in the meaning of this metric?

That matters because many dbt changes are valid from a parser or build point of view, but still dangerous for downstream consumers.

For example, a PR may:

change gross_revenue from sum(order_total) to avg(order_total)
remove a dimension people use to slice a KPI
change a ratio metric denominator
widen or narrow a metric filter
change entity or time-grain semantics

In all of those cases, dbt may still parse successfully and CI may still be green. But the business meaning of the metric has changed, and dashboards, notebooks, reverse ETL jobs, or APIs may silently start returning different answers.

dbt-semguard exists to catch that class of change before it reaches production.

What It Does Exactly

dbt-semguard does not lint YAML style and it does not validate warehouse execution.

Instead, it:

reads the dbt Semantic Layer definition from two inputs
extracts only the semantic parts that affect meaning
builds a canonical contract for each side
diffs those contracts
classifies each change as breaking, risky, or safe
renders the result for local CLI use or GitHub Actions

In practical terms, it helps teams review semantic changes the same way they already review code changes.
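
The steps above can be pictured with a minimal Python sketch. It is not dbt-semguard's implementation: the contract shape (a dict of metric name to semantic fields), the `diff_contracts` helper, and the severity rules are all assumptions chosen for illustration.

```python
from typing import Any

# Hypothetical contract shape: metric name -> semantic fields.
Contract = dict[str, dict[str, Any]]


def diff_contracts(base: Contract, head: Contract) -> list[tuple[str, str]]:
    """Return (severity, message) pairs. The severity rules are illustrative only."""
    changes: list[tuple[str, str]] = []
    for name in sorted(base.keys() | head.keys()):
        if name not in head:
            changes.append(("breaking", f"Metric `{name}` was removed."))
        elif name not in base:
            # Assumption: additions are treated as safe in this sketch.
            changes.append(("safe", f"Metric `{name}` was added."))
        else:
            for field in sorted(base[name].keys() | head[name].keys()):
                old, new = base[name].get(field), head[name].get(field)
                if old != new:
                    changes.append(
                        ("breaking", f"Metric `{name}` changed {field} from `{old}` to `{new}`.")
                    )
    return changes


base = {"gross_revenue": {"aggregation": "sum", "expression": "order_total"}}
head = {"gross_revenue": {"aggregation": "avg", "expression": "order_total"}}
changes = diff_contracts(base, head)
for severity, message in changes:
    print(f"[{severity}] {message}")
```

The real tool distinguishes many more change types and severities; the sketch only shows why comparing extracted contracts, rather than raw YAML, makes a change in meaning easy to report.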

How It Works

The tool reduces dbt semantic definitions into a normalized contract that is easier to compare than raw YAML.

It keeps fields that affect meaning, such as:

semantic model identity
backing model name
entities and entity types
dimensions and time granularity
metric type
aggregation and expression
filters
ratio numerator and denominator

It intentionally ignores noise such as:

descriptions
docs blocks
YAML ordering
whitespace and comments

That means the output is focused on semantic drift, not formatting drift.
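
As a rough illustration of that normalization idea, the sketch below drops non-semantic keys and sorts mapping keys before comparing. The `NOISE_KEYS` set, the `canonicalize` helper, and the field names `agg` and `expr` are hypothetical; the fields dbt-semguard actually keeps or ignores may differ.

```python
import json
from typing import Any

# Hypothetical set of non-semantic keys; the real tool's exclusions may differ.
NOISE_KEYS = {"description", "docs"}


def canonicalize(node: Any) -> Any:
    """Drop noise keys and sort mapping keys so equal meaning yields equal output."""
    if isinstance(node, dict):
        return {k: canonicalize(v) for k, v in sorted(node.items()) if k not in NOISE_KEYS}
    if isinstance(node, list):
        # List order is preserved here because it may carry meaning.
        return [canonicalize(item) for item in node]
    return node


before = {"name": "gross_revenue", "description": "Total revenue", "agg": "sum", "expr": "order_total"}
after = {"expr": "order_total", "agg": "sum", "name": "gross_revenue", "description": "Reworded docs"}

# Same meaning, different key order and description: the serialized forms match.
same = json.dumps(canonicalize(before)) == json.dumps(canonicalize(after))
print(same)  # True
```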

Install From GitHub

python -m pip install "git+https://github.com/yeaight7/dbt-semguard.git@v0.5.1"

dbt-semguard requires Python 3.11 or newer.

Install From Source

git clone https://github.com/yeaight7/dbt-semguard.git
cd dbt-semguard
python -m pip install .

How To Use It

Run locally before opening a PR

Use this when you want to sanity-check semantic changes while you are still developing:

semguard diff --base-ref main --head-ref HEAD --project-dir .
semguard check --base-ref main --head-ref HEAD --project-dir . --fail-on breaking

Typical use:

diff when you want to inspect what changed
check when you want a blocking exit code for automation or local scripts
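
For local scripts, the check command can be wrapped in a few lines of Python. This is a sketch under two assumptions: that `semguard` is installed and on PATH, and that a non-zero exit code means the configured threshold was reached. The helper names `build_check_command` and `run_check` are hypothetical.

```python
import subprocess


def build_check_command(base_ref: str, head_ref: str, project_dir: str = ".") -> list[str]:
    """Assemble the documented `semguard check` invocation as an argv list."""
    return [
        "semguard", "check",
        "--base-ref", base_ref,
        "--head-ref", head_ref,
        "--project-dir", project_dir,
        "--fail-on", "breaking",
    ]


def run_check(base_ref: str = "main", head_ref: str = "HEAD") -> bool:
    """Return True when the check passes. Assumes a non-zero exit code means blocked."""
    result = subprocess.run(build_check_command(base_ref, head_ref), check=False)
    return result.returncode == 0


print(" ".join(build_check_command("main", "HEAD")))
```

Calling `run_check()` raises `FileNotFoundError` if the `semguard` executable cannot be found, so a script may want to handle that case separately from a failed check.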

For monorepos, always point --project-dir at the dbt project root you want to analyze:

semguard diff --base-ref main --head-ref HEAD --project-dir analytics/dbt

Git ref mode and local YAML mode now both scope discovery to this directory.

Compare exported contracts directly

Use this when you want to compare two precomputed semantic contracts:

semguard diff --base-contract base-contract.json --head-contract head-contract.json --format markdown

Compare manifests explicitly

Use this when your workflow already has dbt semantic_manifest.json artifacts available:

semguard diff --base-manifest base-semantic-manifest.json --head-manifest head-semantic-manifest.json --format json

Extract a contract

Use this when you want a stable machine-readable snapshot of semantic meaning:

semguard extract --source yaml --project-dir examples/ecommerce_dbt_project --output base-contract.json
semguard extract --source manifest --manifest semantic_manifest.json --output manifest-contract.json

Configure YAML discovery with `.semguard.yml`

Create .semguard.yml in your dbt project root to control which YAML files are scanned:

include:
  - models/**/*.yml
  - models/**/*.yaml
  - metrics/**/*.yml
  - metrics/**/*.yaml
  - semantic_models/**/*.yml
  - semantic_models/**/*.yaml
exclude:
  - target/**
  - dbt_packages/**
  - .venv/**
  - .github/**

If .semguard.yml is not present, the include and exclude patterns shown above are applied as defaults.
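
To reason about which files such patterns would select, here is a small Python sketch of include/exclude matching. It assumes gitignore-like semantics where `**/` matches zero or more directories and excludes take precedence; dbt-semguard's actual matching rules are not documented here and may differ. The helpers `glob_to_regex` and `is_scanned` are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob with `**` into a regex (assumed gitignore-like semantics)."""
    parts: list[str] = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_scanned(path: str, include: list[str], exclude: list[str]) -> bool:
    """A path is scanned when it matches an include pattern and no exclude pattern."""
    if any(glob_to_regex(p).fullmatch(path) for p in exclude):
        return False
    return any(glob_to_regex(p).fullmatch(path) for p in include)


include = ["models/**/*.yml", "models/**/*.yaml"]
exclude = ["target/**", "dbt_packages/**"]

for candidate in ["models/marts/orders.yml", "models/orders.yaml", "target/models/orders.yml"]:
    print(candidate, is_scanned(candidate, include, exclude))
```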

Example Review Flow

1. A developer changes a metric or semantic model in dbt.
2. semguard diff compares the base branch and the current branch.
3. The tool reports semantic changes only.
4. The team decides whether the change is acceptable, needs migration planning, or should be blocked.
5. In CI, semguard check --fail-on breaking can fail the PR automatically.

How To Read The Result

breaking: the semantic meaning changed in a way that should usually block by default
risky: the change may be legitimate, but downstream consumers should review it
safe: cosmetic-only changes that do not appear in the semantic diff

Output

diff and check emit one of:

text
markdown
json

JSON reports contain:

summary
highest_severity
blocking
changes
metadata

Example Markdown report

## dbt-semguard report

### Breaking changes
#### Metric `gross_revenue`
- Metric `gross_revenue` changed aggregation from `sum` to `avg`.

Status: blocking

Example JSON report

{
  "summary": {
    "breaking": 3,
    "risky": 1,
    "safe": 0
  },
  "highest_severity": "breaking",
  "blocking": true
}
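
A downstream script could consume a report like the one above as follows. This is a sketch: the severity ordering in `SEVERITY_ORDER` and the `reaches_threshold` helper are assumptions, and only the keys shown in the example report are used.

```python
import json

# Assumed ordering, lowest to highest.
SEVERITY_ORDER = ["safe", "risky", "breaking"]

report_text = """
{
  "summary": {"breaking": 3, "risky": 1, "safe": 0},
  "highest_severity": "breaking",
  "blocking": true
}
"""


def reaches_threshold(report: dict, fail_on: str) -> bool:
    """True when the report's highest severity is at or above the chosen threshold."""
    return SEVERITY_ORDER.index(report["highest_severity"]) >= SEVERITY_ORDER.index(fail_on)


report = json.loads(report_text)
total = sum(report["summary"].values())
print(total, report["blocking"], reaches_threshold(report, "risky"))
```

The sketch does not handle a report with no changes, where `highest_severity` may hold a value outside the three levels listed here.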

Coverage

dbt-semguard currently covers the highest-value semantic changes in the latest dbt Semantic Layer spec.

Covered extractors and inputs:

Latest-spec YAML projects
Legacy top-level semantic_models / metrics YAML projects
Explicit dbt semantic_manifest.json input
Canonical contract JSON emitted by semguard extract

Covered semantic comparisons:

Semantic model add/remove and backing model changes
Semantic model default aggregation time dimension changes
Entity add/remove, type changes, and expression changes
Dimension add/remove, type changes, expression changes, and time granularity changes
Simple metric aggregation, expression, label, filter, ownership, aggregation-time, and non-additive changes
Ratio metric numerator and denominator changes
Derived metric expression and input metric changes
Cumulative metric input, window, grain-to-date, and period-aggregation changes
Conversion metric entity, calculation, base metric, conversion metric, and constant-property changes
Additive changes such as new entities, new dimensions, and new metrics

Current automated coverage:

YAML extraction for the latest spec
Manifest normalization
Semantic diff severity mapping for breaking and risky changes
Declarative field-coverage policy so contract fields are explicitly diffed, nested, or intentionally excluded
Source diagnostics in extracted YAML contracts and change reports
CLI extract, diff, and check
Sticky PR comment delivery through the GitHub Action
Checkout-free git ref mode
Pre-release local action smoke coverage in CI, plus post-release published action smoke coverage in both git-ref and manifest modes, including spaced manifest paths

Current Limitations

Known v0.5.1 limitations are intentionally narrow:

There is no fail-on: none advisory-only mode yet.
There is no allowlist for intentional semantic changes yet.
Manifest parsing expects dbt semantic_manifest.json, not the general-purpose dbt manifest.json artifact.
Legacy YAML support covers top-level semantic_models, measures, and type_params, but cross-project ref semantics are still normalized conservatively into the single model_name contract field.
Rename handling is intentionally conservative: a rename is treated as a removal plus an addition.
Source diagnostics are best-effort and currently strongest for YAML extraction; manifest-derived contracts may still lack file/line detail.
GitHub integration supports sticky PR comments for pull_request workflows, but does not yet manage review-thread lifecycles or inline annotations.
PyPI publishing is not available yet; install from GitHub or source instead.
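
The rename limitation above follows naturally from comparing objects by name. A minimal sketch, using a hypothetical `diff_names` helper rather than the tool's own code:

```python
def diff_names(base: set[str], head: set[str]) -> dict[str, list[str]]:
    """Name-keyed comparison: without rename detection, a rename is a removal plus an addition."""
    return {"removed": sorted(base - head), "added": sorted(head - base)}


# gross_revenue was renamed to total_revenue between base and head.
result = diff_names({"gross_revenue", "order_count"}, {"total_revenue", "order_count"})
print(result)  # {'removed': ['gross_revenue'], 'added': ['total_revenue']}
```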

Use As A GitHub Action

Use the included composite action from this repository:

jobs:
  semguard:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write
      pull-requests: read
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: yeaight7/dbt-semguard@v0.5.1
        id: semguard
        with:
          base-ref: ${{ github.event.pull_request.base.sha }}
          head-ref: ${{ github.sha }}
          fail-on: breaking
          pr-comment: true
          pr-comment-mode: sticky
          github-token: ${{ github.token }}

      - name: Inspect semguard outputs
        run: |
          echo "Highest severity: ${{ steps.semguard.outputs.highest-severity }}"
          echo "Blocking: ${{ steps.semguard.outputs.blocking }}"

The action now exposes structured outputs so downstream CI can branch on semantic severity without reparsing JSON:

steps.semguard.outputs.highest-severity
steps.semguard.outputs.blocking
steps.semguard.outputs.breaking-count
steps.semguard.outputs.risky-count
steps.semguard.outputs.safe-count

pr-comment-mode accepts:

sticky: update the previous dbt-semguard PR comment when one already exists
create: always publish a new PR comment instead of updating the previous one

The action writes:

a Markdown summary to the workflow summary
a JSON artifact named semguard-report
structured step outputs for severity and counts
an optional sticky PR comment when pr-comment: true
a failing status when the configured threshold is reached

When there are zero semantic changes, the Markdown artifact and workflow summary explicitly include "No semantic changes detected." followed by "Status: passing".

This is the recommended setup when you want the semantic review to happen automatically on every PR.

If you enable pr-comment: true, the workflow needs:

contents: read
issues: write
pull-requests: read

For forked pull requests, the standard pull_request event usually does not get a write-capable GITHUB_TOKEN, so sticky PR comments may be unavailable unless you adopt a separate trusted workflow pattern.

Troubleshooting

Common CI and configuration issues are covered in docs/troubleshooting.md.

Migration notes (`v0.5.1`)

Git ref extraction now scopes strictly to --project-dir for monorepos.
YAML discovery now uses safe default include/exclude patterns.
Optional .semguard.yml include/exclude rules are applied in both local and git-ref YAML extraction.
Invalid semantic YAML now raises user-facing errors with source context instead of raw KeyError tracebacks.
Composite action shell steps now read user-controlled values from environment variables instead of embedding GitHub expressions directly in Bash.
Composite action now generates JSON, Markdown, summary text, and step outputs in a single pass before enforcing the blocking threshold.
Composite action report files now live in an isolated runner temp directory derived from artifact-name, which avoids workspace filename collisions in matrix-style CI jobs.
The repository now documents security reporting, contribution setup, and common action troubleshooting paths.

Example project

An example latest-spec dbt project lives in examples/ecommerce_dbt_project.

Documentation

License

This project is open source under the MIT License. See LICENSE.

Release history

0.5.4: Apr 27, 2026
0.5.3: Apr 26, 2026
0.5.2: Apr 26, 2026
0.5.1 (this version): Apr 26, 2026

Download files

Source Distribution

dbt_semguard-0.5.1.tar.gz (47.7 kB)

Uploaded Apr 26, 2026 Source

Built Distribution

dbt_semguard-0.5.1-py3-none-any.whl (26.8 kB)

Uploaded Apr 26, 2026 Python 3

File details

Details for the file dbt_semguard-0.5.1.tar.gz.

File metadata

Download URL: dbt_semguard-0.5.1.tar.gz
Upload date: Apr 26, 2026
Size: 47.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dbt_semguard-0.5.1.tar.gz
Algorithm	Hash digest
SHA256	`67fd551aba0d2915ac0fc553d33b8271c0434688fb11f2316d5e4144af91bd23`
MD5	`d0bfad1278eb7ab5039e792e1417ad8a`
BLAKE2b-256	`c3143c052f90d70c69b94d8693147af70ebf417c9addd4a045ba52a849030da9`
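
A downloaded file can be checked against the SHA256 digest above with the Python standard library. This is a generic sketch; the helper names `sha256_of` and `verify` are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for dbt_semguard-0.5.1.tar.gz.
EXPECTED = "67fd551aba0d2915ac0fc553d33b8271c0434688fb11f2316d5e4144af91bd23"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED) -> bool:
    """True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `verify(Path("dbt_semguard-0.5.1.tar.gz"))` returns True only when the local file matches the published digest.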

Provenance

The following attestation bundles were made for dbt_semguard-0.5.1.tar.gz:

Publisher: publish.yml on yeaight7/dbt-semguard

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dbt_semguard-0.5.1.tar.gz
- Subject digest: 67fd551aba0d2915ac0fc553d33b8271c0434688fb11f2316d5e4144af91bd23
- Sigstore transparency entry: 1390125416
- Sigstore integration time: Apr 26, 2026
Source repository:
- Permalink: yeaight7/dbt-semguard@8d3e2196b4929904c7b8c66b09db42973a1399a8
- Branch / Tag: refs/tags/v0.5.2
- Owner: https://github.com/yeaight7
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8d3e2196b4929904c7b8c66b09db42973a1399a8
- Trigger Event: release

File details

Details for the file dbt_semguard-0.5.1-py3-none-any.whl.

File metadata

Download URL: dbt_semguard-0.5.1-py3-none-any.whl
Upload date: Apr 26, 2026
Size: 26.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dbt_semguard-0.5.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`39970f180ebc98a626f864f71aae8d6ffc09fc23d2fba60d5393b9cf8fdbd70c`
MD5	`055aede1d315040e6dab75ecb2acde94`
BLAKE2b-256	`03887caa5d20889933bf5c58a671d5045a0063b3fe16159b23fc7025e2ccc8fc`

Provenance

The following attestation bundles were made for dbt_semguard-0.5.1-py3-none-any.whl:

Publisher: publish.yml on yeaight7/dbt-semguard

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dbt_semguard-0.5.1-py3-none-any.whl
- Subject digest: 39970f180ebc98a626f864f71aae8d6ffc09fc23d2fba60d5393b9cf8fdbd70c
- Sigstore transparency entry: 1390125587
- Sigstore integration time: Apr 26, 2026
Source repository:
- Permalink: yeaight7/dbt-semguard@8d3e2196b4929904c7b8c66b09db42973a1399a8
- Branch / Tag: refs/tags/v0.5.2
- Owner: https://github.com/yeaight7
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8d3e2196b4929904c7b8c66b09db42973a1399a8
- Trigger Event: release
