modaryn analyzes dbt projects to score model complexity and structural importance, helping teams identify high-risk and high-impact data models.

modaryn

PyPI version License: MIT Python dbt sqlglot

Overview

modaryn is a Python-based CLI tool that analyzes dbt projects and scores each model based on three pillars:

  • Complexity — SQL metrics (JOINs, CTEs, conditionals, WHERE clauses, character count)
  • Importance — Structural metrics (downstream model/column counts)
  • Quality — Test coverage metrics (test count, column coverage %)

Final score: raw_score = complexity_score + importance_score - quality_score, floored at 0 (higher = riskier)

The SQL dialect is auto-detected from manifest.json. Column-level lineage is traced via sqlglot to compute downstream column impact.
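As a rough illustration of dialect auto-detection, the sketch below reads the adapter name that dbt records in `target/manifest.json` under `metadata.adapter_type`. Whether modaryn reads exactly this field is an assumption, and the function name `detect_dialect` is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def detect_dialect(project_path: str) -> Optional[str]:
    """Return the adapter type stored in target/manifest.json, or None.

    Hypothetical helper: modaryn's own detection logic may differ.
    """
    manifest_path = Path(project_path) / "target" / "manifest.json"
    if not manifest_path.is_file():
        return None
    with manifest_path.open(encoding="utf-8") as f:
        manifest = json.load(f)
    # dbt writes the adapter name (for example "bigquery") to metadata.adapter_type.
    return manifest.get("metadata", {}).get("adapter_type")
```

If the function returns None, passing --dialect explicitly is the fallback.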

Installation

pip install modaryn

Usage

score command

Analyzes and scores all dbt models, displaying a combined scan and score report.

modaryn score --project-path . --apply-zscore --format html --output report.html

| Option | Short | Description | Default |
|---|---|---|---|
| --project-path | -p | Path to the dbt project directory | . |
| --dialect | -d | SQL dialect (bigquery, snowflake, duckdb, etc.). Auto-detected from manifest.json if omitted. | auto |
| --config | -c | Path to a custom weights YAML file | None |
| --apply-zscore | -z | Apply Z-score normalization to scores | False |
| --format | -f | Output format: terminal, markdown, html | terminal |
| --output | -o | Path to write the output file | None |
| --select | -s | Filter models by selector (repeatable, OR logic) | None |
| --verbose | -v | Show detailed warnings (missing SQL, skipped columns) | False |

--select selector syntax:

# Model name glob
modaryn score --project-path . --select "fct_*"

# Path prefix
modaryn score --project-path . --select path:marts/finance

# dbt tag
modaryn score --project-path . --select tag:daily

# Multiple selectors (OR logic)
modaryn score --project-path . --select path:marts/customer --select path:marts/finance
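
The three selector forms and their OR combination can be sketched as follows. This is a minimal illustration, not modaryn's implementation: the model dictionaries, their `name`, `path`, and `tags` fields, and the function names are all hypothetical, and paths are assumed to be relative to the models directory.

```python
from fnmatch import fnmatchcase


def matches(model: dict, selector: str) -> bool:
    """Check one selector against one model (hypothetical model shape)."""
    if selector.startswith("path:"):
        prefix = selector[len("path:"):].strip("/")
        path = model["path"]
        # Match the directory itself or anything below it.
        return path == prefix or path.startswith(prefix + "/")
    if selector.startswith("tag:"):
        return selector[len("tag:"):] in model["tags"]
    # Anything else is treated as a glob on the model name.
    return fnmatchcase(model["name"], selector)


def select(models: list, selectors: list) -> list:
    """Keep models matching ANY selector (OR logic); no selectors keeps all."""
    if not selectors:
        return list(models)
    return [m for m in models if any(matches(m, s) for s in selectors)]


models = [
    {"name": "fct_orders", "path": "marts/finance/fct_orders.sql", "tags": ["daily"]},
    {"name": "dim_customer", "path": "marts/customer/dim_customer.sql", "tags": []},
    {"name": "stg_orders", "path": "staging/stg_orders.sql", "tags": ["daily"]},
]
finance_or_customer = select(models, ["path:marts/customer", "path:marts/finance"])
```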

ci-check command

Checks model scores against a threshold for use in CI/CD pipelines. Exits with code 1 if any model exceeds the threshold, 0 otherwise.

modaryn ci-check --project-path . --threshold 20.0 --apply-zscore

| Option | Short | Description | Default |
|---|---|---|---|
| --project-path | -p | Path to the dbt project directory | . |
| --threshold | -t | Maximum allowed score | (required) |
| --dialect | -d | SQL dialect. Auto-detected if omitted. | auto |
| --config | -c | Path to a custom weights YAML file | None |
| --apply-zscore | -z | Check against Z-scores instead of raw scores | False |
| --format | -f | Output format: terminal, markdown, html | terminal |
| --output | -o | Path to write the output file | None |
| --select | -s | Filter models by selector (repeatable, OR logic) | None |
| --verbose | -v | Show detailed warnings | False |
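
The exit-code rule described for ci-check can be sketched as below. The function name and the `scores` mapping are hypothetical, and "exceeds" is assumed to mean strictly greater than the threshold.

```python
def ci_exit_code(scores: dict, threshold: float) -> int:
    """Return 1 if any model score is above the threshold, else 0.

    Illustrative only: a score equal to the threshold is assumed to pass.
    """
    offenders = {name: s for name, s in scores.items() if s > threshold}
    # Report the worst offenders first.
    for name, score in sorted(offenders.items(), key=lambda kv: -kv[1]):
        print(f"FAIL {name}: {score:.2f} > {threshold:.2f}")
    return 1 if offenders else 0
```

In a pipeline, a non-zero exit code from the command is what fails the job, so no extra wrapper is needed around `modaryn ci-check` itself.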

impact command

Traces all downstream columns affected by a change to a specific column (BFS column-level impact analysis).

modaryn impact --project-path . --model fct_orders --column order_id

| Option | Short | Description | Default |
|---|---|---|---|
| --project-path | -p | Path to the dbt project directory | . |
| --model | -m | Model name to trace impact from | (required) |
| --column | -c | Column name to trace impact from | (required) |
| --dialect | -d | SQL dialect. Auto-detected if omitted. | auto |
| --select | -s | Filter models by selector (restricts lineage scope) | None |
| --verbose | -v | Show detailed warnings | False |
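
To make the BFS idea concrete, here is a minimal sketch of a breadth-first walk over a column lineage graph. The graph shape (a mapping from a `(model, column)` pair to the columns that read from it) and the model names other than fct_orders are made up for illustration; modaryn builds its lineage with sqlglot, which is not shown here.

```python
from collections import deque


def downstream_columns(lineage: dict, start: tuple) -> list:
    """Return every column reachable from `start`, in BFS order.

    `lineage` maps (model, column) to a list of directly dependent
    (model, column) pairs. Hypothetical structure for illustration.
    """
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:  # visit each column once, even in diamonds
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


lineage = {
    ("fct_orders", "order_id"): [("mart_sales", "order_id"), ("mart_customers", "last_order_id")],
    ("mart_sales", "order_id"): [("rpt_daily", "order_count")],
    ("mart_customers", "last_order_id"): [("rpt_daily", "order_count")],
}
affected = downstream_columns(lineage, ("fct_orders", "order_id"))
```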

Missing compiled SQL (N/A columns)

Complexity metrics require compiled SQL from target/compiled/. If dbt compile has not been run or a model failed to compile, those columns will show N/A in the report. A warning summary is printed at the end of the output. Use --verbose to see the full list of affected models.

⚠ 3 model(s) show N/A for complexity columns because compiled SQL was not found.
Run `dbt compile` to enable full analysis: model_a, model_b, model_c

Report Columns and Calculation Logic

1. SQL Complexity Metrics

| Metric | Calculation | Example |
|---|---|---|
| JOINs | Count of all JOIN clauses | JOIN, LEFT JOIN, CROSS JOIN each count as 1 |
| CTEs | Count of all CTEs defined | WITH a AS (...), b AS (...) = 2 |
| Conditionals | Count of IF expressions (each WHEN branch in a CASE) | A CASE WHEN ... WHEN ... END with 2 branches = 2 |
| WHEREs | Count of WHERE clauses including subqueries | Main WHERE + subquery WHERE = 2 |
| SQL Chars | Total character count of the compiled SQL | |

2. Structural Importance Metrics

| Metric | Calculation | Example |
|---|---|---|
| Downstream | Number of dbt models that directly reference this model | Models B and C use A → A has 2 |
| Col. Down | Total count of downstream column references | B's col1 and col2 both reference A's id → 2 |
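
The two importance metrics can be illustrated with the examples from the table. This sketch assumes both counts cover direct references only, and the input mappings are hypothetical stand-ins for what would be derived from the dbt manifest and column lineage.

```python
def importance_metrics(model: str, model_children: dict, column_children: dict) -> tuple:
    """Return (Downstream, Col. Down) for one model.

    model_children: model name -> list of models that reference it directly.
    column_children: (model, column) -> list of (model, column) that read it.
    Both structures are assumptions made for this example.
    """
    downstream = len(model_children.get(model, []))
    col_down = sum(
        len(children)
        for (owner, _column), children in column_children.items()
        if owner == model
    )
    return downstream, col_down


model_children = {"A": ["B", "C"]}
column_children = {("A", "id"): [("B", "col1"), ("B", "col2")]}
```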

3. Quality Metrics

| Metric | Calculation | Example |
|---|---|---|
| Tests | Total dbt tests attached to the model | 4 column tests → 4 |
| Coverage (%) | % of columns with at least one test | 8 of 10 columns tested → 80% |
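
A small sketch of the two quality metrics, assuming the input is a mapping from column name to the number of tests on that column. The function name, the input shape, and the choice to report 0% coverage for a model with no documented columns are assumptions.

```python
def quality_metrics(column_tests: dict, model_level_tests: int = 0) -> tuple:
    """Return (test count, coverage in percent) for one model."""
    tests = model_level_tests + sum(column_tests.values())
    if not column_tests:
        return tests, 0.0  # assumed convention when no columns are known
    covered = sum(1 for n in column_tests.values() if n > 0)
    return tests, 100.0 * covered / len(column_tests)


# 10 columns, 8 of them with one test each
columns = {f"col{i}": (1 if i < 8 else 0) for i in range(10)}
tests, coverage = quality_metrics(columns)
```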

Scoring Formula

  1. Complexity Score = (JOINs × w1) + (CTEs × w2) + (Conditionals × w3) + (WHEREs × w4) + (Chars × w5)
  2. Importance Score = (Downstream Models × w6) + (Col. Down × w7)
  3. Quality Score = (Tests × w8) + (Coverage % × w9)

Raw Score = Complexity Score + Importance Score − Quality Score (minimum 0)
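
The formula above can be worked through in a few lines. The weights below are copied from the example YAML in this document and are not necessarily the tool's defaults; the `downstream_column_count` key and its weight are hypothetical, since the example config does not show a key for Col. Down. Coverage is assumed to be expressed on a 0 to 100 scale.

```python
WEIGHTS = {
    "join_count": 2.0,
    "cte_count": 1.5,
    "conditional_count": 1.0,
    "where_count": 0.5,
    "sql_char_count": 0.01,
    "downstream_model_count": 1.0,
    "downstream_column_count": 1.0,  # hypothetical key and value
    "test_count": 0.5,
    "column_coverage": 1.0,
}
COMPLEXITY = ("join_count", "cte_count", "conditional_count", "where_count", "sql_char_count")
IMPORTANCE = ("downstream_model_count", "downstream_column_count")
QUALITY = ("test_count", "column_coverage")


def raw_score(metrics: dict, weights: dict = WEIGHTS) -> float:
    """Complexity + importance - quality, floored at 0."""
    def weighted(keys):
        # Missing metrics count as 0.
        return sum(metrics.get(k, 0) * weights[k] for k in keys)

    return max(0.0, weighted(COMPLEXITY) + weighted(IMPORTANCE) - weighted(QUALITY))


untested = {
    "join_count": 3, "cte_count": 2, "conditional_count": 4, "where_count": 2,
    "sql_char_count": 1000, "downstream_model_count": 2, "downstream_column_count": 5,
}
well_tested = dict(untested, test_count=4, column_coverage=80)
```

With these assumed weights, the untested model scores 24 for complexity plus 7 for importance, while the well-tested variant has a quality score large enough to hit the floor of 0.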

Z-Score Normalization

When --apply-zscore is used: Z-Score = (Raw Score − Mean) / Standard Deviation
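
A minimal sketch of the normalization, using only the standard library. It assumes the population standard deviation and returns 0 for every model when all scores are equal; the document does not say which convention modaryn uses for either case.

```python
from statistics import mean, pstdev


def zscores(raw_scores: dict) -> dict:
    """Map model name -> (raw - mean) / standard deviation."""
    if not raw_scores:
        return {}
    values = list(raw_scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population std dev: an assumption
    if sigma == 0:
        # All scores identical: no model stands out.
        return {name: 0.0 for name in raw_scores}
    return {name: (score - mu) / sigma for name, score in raw_scores.items()}


z = zscores({"a": 2.0, "b": 4.0, "c": 6.0})
```

Because Z-scores are relative to the project's own distribution, a --threshold passed together with --apply-zscore is measured in standard deviations above the mean rather than in raw score units.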


Custom Weights Configuration

Override default weights by passing a YAML file via --config:

sql_complexity:
  join_count: 2.0
  cte_count: 1.5
  conditional_count: 1.0
  where_count: 0.5
  sql_char_count: 0.01

importance:
  downstream_model_count: 1.0

quality:
  test_count: 0.5
  column_coverage: 1.0

Unknown sections or keys are reported as warnings at runtime.
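
The kind of check that produces such warnings might look like the sketch below, applied to a config already loaded into a dictionary. The set of known keys is limited to the ones shown in the example above, so the real tool may accept more; the function name and message wording are hypothetical.

```python
KNOWN_KEYS = {
    "sql_complexity": {"join_count", "cte_count", "conditional_count", "where_count", "sql_char_count"},
    "importance": {"downstream_model_count"},
    "quality": {"test_count", "column_coverage"},
}


def find_unknown(config: dict) -> list:
    """Return warning strings for unrecognized sections and keys."""
    warnings = []
    for section, values in config.items():
        if section not in KNOWN_KEYS:
            warnings.append(f"unknown section: {section}")
            continue
        for key in values or {}:  # tolerate an empty section
            if key not in KNOWN_KEYS[section]:
                warnings.append(f"unknown key: {section}.{key}")
    return warnings


problems = find_unknown({
    "sql_complexity": {"join_count": 2.0, "joins": 1.0},
    "extras": {},
})
```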

