SQL inspection CLI: static checks, lineage, PR review, cost diff

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

kraftaa

These details have not been verified by PyPI

Environment
- Console
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
- Python :: 3
- Rust
Topic
- Software Development
- Utilities

Project description

querylens

Catch risky analytics SQL before it runs and explain what complex queries actually do.

querylens scans SQL repositories, ranks the worst files first, reviews pull requests for regressions, and highlights reliability and cost risks such as full table scans, wide joins, missing filters, and expensive query patterns before they merge and hit production.

Quick Demo

Input SQL:

SELECT
  c.customer_id,
  SUM(o.amount) AS revenue
FROM customers c
JOIN orders o
  ON c.id = o.customer_id
GROUP BY c.customer_id

Command:

cargo run -- lineage examples/revenue.sql

Output:

examples/revenue.sql
Projections:
revenue
 └─ SUM(orders.amount)
o.customer_id
 └─ orders.customer_id

Install

Prebuilt binary:

curl -L -o querylens.tar.gz https://github.com/kraftaa/querylens/releases/download/v0.1.11/querylens-macos-aarch64.tar.gz
tar -xzf querylens.tar.gz
./querylens --help

Homebrew:

brew install kraftaa/tap/querylens

Build from source:

cargo build
# enable AWS Bedrock provider if needed
# cargo build --features bedrock

Scan estimates (optional)

Athena (post-run calibration): querylens explain --athena-query-execution-id <id> --athena-region us-east-1 --file your.sql uses aws athena get-query-execution to pull DataScannedInBytes from a past run. Use this to compare a real scan to our estimate or as the “before” value in PR cost diff. Not for brand-new queries.
Manual override: --scan-tb 1.5 or --scan-bytes 1500000000000.
Table stats file (offline, no cloud calls): --stats-file stats.json

Example stats.json:

{
  "tables": {
    "athena.sales": {
      "total_bytes": 1200000000000,
      "row_count": 3500000000,
      "partition_columns": ["ds"],
      "partitions_per_year": 365
    },
    "athena.customers": {
      "total_bytes": 8000000000
    }
  }
}

Heuristics:

No WHERE → assume full scan.
Filters on non-partition columns → ~0.7 of table size.
Equality/range on partition column → small slice (2–5% by default, or days/partitions if provided).
Sum per-table bytes to show “Estimated scan” and PR cost deltas.

Features

Feature	Description
Static SQL checks	Detect risky patterns (`SELECT *`, possible Cartesian joins, wildcard `LIKE`, etc.)
Column lineage	Trace projection, filter, and join lineage
Query explanation	Summarize tables, joins, aggregations, and likely query meaning
Table extraction	List tables used by a query
Folder scanning	Analyze a directory of SQL files
Rule controls	Disable rules or override severity by `rule_id`
Athena mode	Extra heuristics for partition/cost patterns
CI thresholds	Fail on `low

Why This Exists

SQL pipelines grow quickly and are hard to review.

When a metric looks wrong, teams need to:

trace where output columns come from
catch risky query patterns before they hit production
understand query intent quickly

querylens helps with deterministic checks and optional LLM explanations.

Detect Risky SQL Patterns

Example query:

SELECT *
FROM orders o
JOIN customers c

Command:

cargo run -- --file examples/bad_join.sql --static-only

Expected findings include:

SELECT *
possible Cartesian join (JOIN without ON/USING)

Example subquery pattern:

cargo run -- --file examples/subquery.sql --static-only

Expected suggestion includes:

consider replacing IN (SELECT ...) with JOIN or EXISTS

How Is This Different From dbt?

dbt builds and runs transformation pipelines.

querylens analyzes SQL itself:

detect risky query patterns
trace lineage in query text
explain query logic

They complement each other: dbt for orchestration/modeling, querylens for query inspection.

Distribution

GitHub Releases (prebuilt binaries)

Tagging v* triggers .github/workflows/release.yml and publishes:

querylens-macos-aarch64.tar.gz
querylens-linux-x86_64.tar.gz
SHA256SUMS

Create a release tag:

git tag v0.1.2
git push origin v0.1.2

Homebrew tap

Use the formula template at:

packaging/homebrew/querylens.rb

For each release:

Set version (without v).
Fill __SHA256_MACOS_AARCH64__ and __SHA256_LINUX_X86_64__ from SHA256SUMS.
Commit the formula in your tap repo (for example kraftaa/homebrew-tap) as Formula/querylens.rb.
Users install with:

brew install kraftaa/tap/querylens

Usage

Note: all analysis commands are static-only; they never execute your SQL against a database.

Subcommands

cargo run -- lineage <file.sql>
cargo run -- lineage <file.sql> --column revenue
cargo run -- risk <file.sql>
cargo run -- risk <file.sql> --summary-only
cargo run -- guard <file.sql> --max-risk high --deny-rule CROSS_JOIN
cargo run -- simulate <file.sql> --limit 100
cargo run -- tables <file.sql>
cargo run -- explain <file.sql>
cargo run -- analyze <dir> --glob "*.sql"
cargo run -- analyze <dir> --glob "*.sql" --changed-only
cargo run -- analyze <dir> --glob "*.sql" --changed-only --changed-base main
cargo run -- analyze <dir> --glob "*.sql" --top 10
cargo run -- analyze <dir> --glob "*.sql" --top 10 --verbose
cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql"
cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql" --ci
cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql" --markdown
cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql" --cost-diff --stats-file stats.json

PR review mode

cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql"

Example output:

SQL Inspect PR Review
Base: main
Head: HEAD

PR status: PASS

No new SQL risk regressions detected.

1 changed SQL file
0 new HIGH-risk queries
0 query lost partition filter
0 ORDER BY without LIMIT regressions
0 possible join amplification regressions
0 files increased estimated scan cost

File: models/example.sql
Previous risk: HIGH
Current risk: HIGH
Risk trend: unchanged
Still risky because:
- SELECT_STAR
Estimated scan: unknown -> unknown
Estimated scan delta: unknown

Compact CI mode:

cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql" --ci

PR status: PASS
Changed SQL files: 1
New HIGH-risk queries: 0
Partition filter regressions: 0
ORDER BY without LIMIT regressions: 0
Join amplification regressions: 0
Files with higher estimated scan: 0

Markdown mode for PR comments:

cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql" --markdown

Cost regression mode:

cargo run -- pr-review --base main --head HEAD --dir models --glob "*.sql" --cost-diff --stats-file stats.json

Example output:

SQL Cost Regression

1 changed SQL file

File: models/revenue.sql

Estimated scan change:
Before: 22 GB
After: 1.40 TB
Increase: 63.6x

Cost regression: HIGH

Reason:
Filter removed: orders.order_date >= DATE '2026-01-01'

Recommendation:
Restore a selective WHERE or partition predicate.

Repo scan summary

analyze now highlights severity shape and hotspots, not only rule counts.

Useful flags:

--top <N> number of hotspot files to display (default 5)
--changed-only limit to changed files in working tree/staging
--changed-base <ref> (with --changed-only) limit to committed diff vs ref (for example main)

Example:

SQL Inspect Report
Scope: current selection

Analyzed 225 SQL files

Top risks:
1. 37 HIGH-risk files
2. 117 files likely scan full tables
3. 18 files have complex multi-join patterns
4. 12 files contain CROSS JOIN or likely Cartesian behavior

Severity shape:
HIGH: 37 files
MEDIUM: 88 files
LOW: 100 files

Most severe files:
- models/a.sql  HIGH  SELECT_STAR, FULL_TABLE_SCAN_LIKELY
- models/b.sql  HIGH  WIDE_JOIN_GRAPH, MISSING_WHERE

Verbose mode:

cargo run -- analyze . --glob "*.sql" --top 10 --verbose

Query Risk Scanner

cargo run -- risk examples/bad_join.sql
cargo run -- risk examples/bad_join.sql --summary-only
cargo run -- risk examples/bad_join.sql --scan-tb 2.3
cargo run -- risk examples/bad_join.sql --scan-bytes 2300000000000

Example output:

examples/bad_join.sql
Risk score: HIGH

Reasons:
- select star: SELECT *
- missing where: No WHERE clause

Estimated scan: 2.30 TB

Compact risk summary:

SQL Inspect Risk
File: examples/bad_join.sql
Risk: HIGH
Estimated scan: unknown
Top reasons:
- select star: SELECT *
- missing where: No WHERE clause
- full table scan likely: Likely full table scan

Use one of:

--scan-tb <value> for TB units directly
--scan-bytes <value> to auto-convert bytes to TB
--athena-query-execution-id <id> to fetch DataScannedInBytes from Athena via aws CLI
--athena-region <region> optional region override for Athena lookup
--stats-file <path.json> to estimate scan bytes from table-level stats

Example stats.json:

{
  "tables": {
    "orders": { "bytes": 1000000000000 },
    "customers": 200000000000
  }
}

Query explanation

cargo run -- explain examples/query.sql

Example output:

Query explanation
Meaning: total amount per id, created at, email
Tables: orders, customers, order_items
Join: customers.id = orders.customer_id; order_items.order_id = orders.id
Aggregation: SUM(order_items.quantity * order_items.unit_price) AS total_amount

Block dangerous queries

cargo run -- guard examples/bad_join.sql --max-risk high --deny-rule CROSS_JOIN --deny-rule FULL_TABLE_SCAN_LIKELY

Example output:

SQL Inspect Guard
Policy: default

Status: FAIL
Risk: HIGH

Blocking violations
- FULL_TABLE_SCAN_LIKELY
- SELECT_STAR

Why blocked
This query shape is likely to scan most rows and can materially increase cost

Exit code: 2

Exit code is 2 when blocked, so this works directly in CI.

Simulate safer preview query

cargo run -- simulate examples/query.sql --limit 100

Column-specific lineage

cargo run -- lineage examples/revenue.sql --column revenue

Example output:

examples/revenue.sql
Projections:
revenue
 └─ SUM(orders.amount)

Main Analyze Command (LLM + static)

Provide one of:

--sql "<query>"
--file <path>
--dir <path>

Optional:

--provider openai|bedrock|local
--dialect generic|athena
--static-only
--fail-on low|medium|high
--glob "*.sql"
--config querylens.toml
--json

OpenAI example:

export OPENAI_API_KEY="..."
export OPENAI_MODEL="gpt-4.1-mini"
cargo run -- --provider openai --file examples/query.sql

Bedrock example:

export AWS_REGION="us-east-1"
export BEDROCK_MODEL_ID="anthropic.claude-3-5-sonnet-20241022-v2:0"
cargo run -- --provider bedrock --file examples/query.sql

Local OpenAI-compatible server example:

export LOCAL_LLM_BASE_URL="http://127.0.0.1:8080"
export LOCAL_LLM_MODEL="llama_instruct.gguf"
cargo run -- --provider local --file examples/query.sql --json

Config

Create querylens.toml:

dialect = "athena"
fail_on = "high"
glob = "*.sql"
suggest_limit_for_exploratory = true
static_only = false

[rules.SELECT_STAR]
enabled = true
severity = "high"

[rules.MISSING_WHERE]
enabled = true
severity = "medium"

Rule controls:

enabled = false disables a finding by rule_id
severity = "low|medium|high" overrides severity

Example:

[rules.SELECT_STAR]
enabled = false

[rules.MISSING_WHERE]
severity = "low"

CI Usage

Fail a build when risky SQL is found:

cargo run -- --dir models --dialect athena --fail-on high

Or subcommand mode:

cargo run -- analyze models --glob "*.sql"
cargo run -- analyze models --glob "*.sql" --changed-only

Inline rule suppression

Add suppression comments directly in SQL when needed:

-- querylens: disable=SELECT_STAR,MISSING_WHERE
SELECT * FROM some_small_reference_table

Suppression applies to matching rule_id values for that file.

Examples Folder

Ready-to-run examples:

examples/query.sql
examples/revenue.sql
examples/bad_join.sql
examples/subquery.sql
examples/silver_proposal_attachments.sql

Project Layout

querylens/
  Cargo.toml
  src/
    main.rs
    analyzer.rs
    insights.rs
    config.rs
    prompt.rs
    providers/
      openai.rs
      bedrock.rs
      local.rs
  examples/

Troubleshooting

missing env vars: set required provider vars
unexpected model JSON shape: run with --json and inspect response
no secrets in repo: .env and .env.* are gitignored

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

kraftaa

These details have not been verified by PyPI

Environment
- Console
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
- Python :: 3
- Rust
Topic
- Software Development
- Utilities

Release history Release notifications | RSS feed

0.1.15

Mar 23, 2026

This version

0.1.14

Mar 23, 2026

0.1.13

Mar 23, 2026

0.1.12

Mar 22, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

querylens-0.1.14-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB view details)

Uploaded Mar 23, 2026 Python 3manylinux: glibc 2.17+ x86-64

querylens-0.1.14-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.3 MB view details)

Uploaded Mar 23, 2026 Python 3manylinux: glibc 2.17+ ARM64

querylens-0.1.14-py3-none-macosx_11_0_arm64.whl (2.2 MB view details)

Uploaded Mar 23, 2026 Python 3macOS 11.0+ ARM64

querylens-0.1.14-py3-none-macosx_10_12_x86_64.whl (2.3 MB view details)

Uploaded Mar 23, 2026 Python 3macOS 10.12+ x86-64

File details

Details for the file querylens-0.1.14-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: querylens-0.1.14-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Mar 23, 2026
Size: 2.4 MB
Tags: Python 3, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for querylens-0.1.14-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`301c28aa30f9a8605e720dd5f8438dfba80ba6a36c4987e97b74e9dd72ec6db1`
MD5	`d626d86c77c99a967313c8d65b9b3c86`
BLAKE2b-256	`f1e1c023caec562757c2b07769a9c4612ea69a93fe119954d82df8e21d8ff3b8`

See more details on using hashes here.

Provenance

The following attestation bundles were made for querylens-0.1.14-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on kraftaa/querylens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: querylens-0.1.14-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Subject digest: 301c28aa30f9a8605e720dd5f8438dfba80ba6a36c4987e97b74e9dd72ec6db1
- Sigstore transparency entry: 1162868827
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: kraftaa/querylens@eb7278908efd2e4da128f797e621c117bfeafa80
- Branch / Tag: refs/tags/v0.1.14
- Owner: https://github.com/kraftaa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@eb7278908efd2e4da128f797e621c117bfeafa80
- Trigger Event: push

File details

Details for the file querylens-0.1.14-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

Download URL: querylens-0.1.14-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Upload date: Mar 23, 2026
Size: 2.3 MB
Tags: Python 3, manylinux: glibc 2.17+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for querylens-0.1.14-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm	Hash digest
SHA256	`92c775ec22f358b81b14ff668b61297522afbe17d31c91f3b504e2a7af2302a7`
MD5	`c2876926a097320892ca76faf2b0a237`
BLAKE2b-256	`6a76f2eb779d1ac62a717c4943e74dc87f85b78ac583bbbeebeed8af98e016d4`

See more details on using hashes here.

Provenance

The following attestation bundles were made for querylens-0.1.14-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on kraftaa/querylens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: querylens-0.1.14-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Subject digest: 92c775ec22f358b81b14ff668b61297522afbe17d31c91f3b504e2a7af2302a7
- Sigstore transparency entry: 1162868973
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: kraftaa/querylens@eb7278908efd2e4da128f797e621c117bfeafa80
- Branch / Tag: refs/tags/v0.1.14
- Owner: https://github.com/kraftaa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@eb7278908efd2e4da128f797e621c117bfeafa80
- Trigger Event: push

File details

Details for the file querylens-0.1.14-py3-none-macosx_11_0_arm64.whl.

File metadata

Download URL: querylens-0.1.14-py3-none-macosx_11_0_arm64.whl
Upload date: Mar 23, 2026
Size: 2.2 MB
Tags: Python 3, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for querylens-0.1.14-py3-none-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`9d5baff6e0673b997cc9f62d62022b1e7f75e532a045ac3ead9d66aa0898fcd9`
MD5	`36753ce35d8b166c69ab43e22e818c90`
BLAKE2b-256	`7604ef81263f000598dd58c4bfc021e13ab36e2490d21cac9d0fa3804277d6be`

See more details on using hashes here.

Provenance

The following attestation bundles were made for querylens-0.1.14-py3-none-macosx_11_0_arm64.whl:

Publisher: release.yml on kraftaa/querylens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: querylens-0.1.14-py3-none-macosx_11_0_arm64.whl
- Subject digest: 9d5baff6e0673b997cc9f62d62022b1e7f75e532a045ac3ead9d66aa0898fcd9
- Sigstore transparency entry: 1162868899
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: kraftaa/querylens@eb7278908efd2e4da128f797e621c117bfeafa80
- Branch / Tag: refs/tags/v0.1.14
- Owner: https://github.com/kraftaa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@eb7278908efd2e4da128f797e621c117bfeafa80
- Trigger Event: push

File details

Details for the file querylens-0.1.14-py3-none-macosx_10_12_x86_64.whl.

File metadata

Download URL: querylens-0.1.14-py3-none-macosx_10_12_x86_64.whl
Upload date: Mar 23, 2026
Size: 2.3 MB
Tags: Python 3, macOS 10.12+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for querylens-0.1.14-py3-none-macosx_10_12_x86_64.whl
Algorithm	Hash digest
SHA256	`56c67e070ab70a3d2e449c82d0d75a46b2756127b4379dd1e09b24968e6f8634`
MD5	`756bb6c11bd8303145b57d277cde3726`
BLAKE2b-256	`c9d65be87a46d5c74bf8f050ae89ff18585011f2b184f78a8633d2e8bbf56114`

See more details on using hashes here.

Provenance

The following attestation bundles were made for querylens-0.1.14-py3-none-macosx_10_12_x86_64.whl:

Publisher: release.yml on kraftaa/querylens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: querylens-0.1.14-py3-none-macosx_10_12_x86_64.whl
- Subject digest: 56c67e070ab70a3d2e449c82d0d75a46b2756127b4379dd1e09b24968e6f8634
- Sigstore transparency entry: 1162868737
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: kraftaa/querylens@eb7278908efd2e4da128f797e621c117bfeafa80
- Branch / Tag: refs/tags/v0.1.14
- Owner: https://github.com/kraftaa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@eb7278908efd2e4da128f797e621c117bfeafa80
- Trigger Event: push

querylens 0.1.14

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

querylens

Quick Demo

Install

Features

Why This Exists

Detect Risky SQL Patterns

How Is This Different From dbt?

Distribution

GitHub Releases (prebuilt binaries)

Homebrew tap

Usage

Subcommands

PR review mode

Repo scan summary

Query Risk Scanner

Query explanation

Block dangerous queries

Simulate safer preview query

Column-specific lineage

Main Analyze Command (LLM + static)

Config

CI Usage

Inline rule suppression

Examples Folder

Project Layout

Troubleshooting

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distributions

Built Distributions

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance