A pydoclint-style metadata-quality linter for VGI workers.

Maintainers

query-farm

Vector Gateway Interface

vgi-lint

A pydoclint-style metadata-quality linter for VGI workers. It attaches to an arbitrary VGI worker, reads everything the worker contributes through DuckDB system tables, and reports quality findings: missing descriptions, undocumented columns or functions, absent or malformed example queries, untagged objects, and more. Each run also produces a quality score, supports per-data-version baselines, and can emit machine-readable output for coding agents.

It works with any VGI worker regardless of implementation language (Python, Go, Rust, Java, TypeScript, …): it treats the worker as a black box and inspects only what surfaces post-attach.

Install / run

uv sync                      # haybarn is RC-only; prerelease = "allow" is set
uv run vgi-lint --help

Quick start

# Lint a local subprocess worker
uv run vgi-lint 'uv run volcano_worker.py'

# Lint a no-auth HTTP worker
uv run vgi-lint http://localhost:9009

# Machine output for a coding agent / CI
uv run vgi-lint http://localhost:9009 --format agent
uv run vgi-lint http://localhost:9009 --format json

In a worker's own repo, add a [tool.vgi-lint-check] block (see vgi-lint init) with a location, then just run vgi-lint with no arguments.

v1 supports local subprocess and no-auth HTTP workers. Authenticated (OAuth) workers are not yet supported.

What it checks

Object coverage: the catalog itself, schemas, tables, views, columns, scalar/aggregate functions, macros, settings, pragmas, and constraints. Rule families:

Family	Codes	Examples
Catalog	VGI0xx	catalog description, `vgi.description_llm`/`_md`, `source_url` (the worker's "listing")
Descriptions	VGI1xx	schema/table/view comment, `vgi.description_llm`, `vgi.description_md`
Discoverability	VGI12x	duplicate/short/echoed descriptions, release freshness, example richness, units (opt-in)
Content	VGI17x	`vgi.description_md` is valid Markdown; description links/images & source URLs resolve (no 404)
Columns	VGI2xx	column-comment coverage (tables and views), comment-not-echo
Functions	VGI3xx	description (+ quality), documented parameters, named arguments, examples
Tags	VGI4xx	required tag keys (opt-in), reserved-tag validity
Examples	VGI5xx	`vgi.example_queries` present, valid JSON, complete entries, catalog-qualified
Settings	VGI6xx	setting descriptions
Pragmas	VGI7xx	pragma descriptions
Constraints	VGI8xx	foreign-key/PK/check validity — references must point at real tables & columns
Structure	VGI11x	schema object-count cap (opt-in)
Execution	VGI9xx	example queries & CHECK constraints bind/execute (opt-in, `--execute`)

See RULES.md for the full per-rule reference (codes, default severities, and what each checks). Run vgi-lint rules to list them from your installed version, or vgi-lint explain VGI112 for one.

Link checking is on by default (VGI171): URLs and images in descriptions, and source_url/vgi.source_url repo links, are resolved over HTTP and flagged if they 404. Only definitive client errors (4xx) are reported — timeouts, DNS failures, 5xx, and access-gated codes are skipped so CI isn't flaky. Disable with --no-check-links (or run fully offline).
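
The reporting rule described above can be sketched as a small classifier. This is an illustrative sketch, not the linter's actual code: the function name `should_report` is hypothetical, and the set of "access-gated" status codes is an assumption, since this README does not list them.

```python
from typing import Optional

# Assumption: the README does not say which statuses count as "access-gated".
# 401, 403, 407 and 429 are illustrative guesses only.
ACCESS_GATED = {401, 403, 407, 429}


def should_report(status: Optional[int]) -> bool:
    """Return True only for a definitive client error.

    `status` is the HTTP status code, or None when no response arrived
    at all (timeout, DNS failure, refused connection).
    """
    if status is None:
        return False  # network trouble is skipped so CI stays stable
    if status in ACCESS_GATED:
        return False  # the link may be fine for an authorized reader
    return 400 <= status < 500  # 5xx and everything else is skipped
```

Under these assumptions a 404 or 410 is reported, while a 503, a 403, or a timeout is not.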

Reserved tags

VGI workers attach metadata via tags; vgi-lint recognizes these reserved keys (set them on the catalog, a schema, a table/view, or — where noted — a function):

Tag	Purpose
`vgi.description_llm`	Concise description aimed at LLMs/agents (tool selection)
`vgi.description_md`	Markdown description for human docs / listing pages
`vgi.example_queries`	JSON list of `{"description","sql"}` example queries
`vgi.title`	Human/marketing display name (vs. the machine name)
`vgi.keywords`	Comma-separated search keywords / synonyms
`vgi.source_url`	Link to where the object is implemented (repo/file)
`vgi.author`	Author / maintainer attribution (catalog)
`vgi.copyright`	Copyright notice (catalog)
`vgi.license`	License name or SPDX identifier (catalog)

vgi.description_llm/_md are required on the catalog and every schema (the catalog is the worker's listing; schemas are its sections). They're optional on tables, views, and functions (opt-in to require, but validated when set — e.g. minimum length, must differ). The catalog source_url is required; titles, keywords, and per-object source links are opt-in but validated when set; author/copyright/license are encouraged (info). Tune any of this via config.
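
As a rough illustration of what "valid JSON, complete entries" could mean for `vgi.example_queries`, the sketch below parses a tag value and lists its problems. The function name and the problem messages are hypothetical, not the linter's real output.

```python
import json

# The two keys named in the reserved-tags table above.
REQUIRED_KEYS = ("description", "sql")


def check_example_queries(tag_value: str) -> list[str]:
    """Return a list of human-readable problems; empty means the value looks fine."""
    try:
        entries = json.loads(tag_value)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(entries, list):
        return ["expected a JSON list of objects"]

    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            # A key counts as complete only when it holds a non-blank string.
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {i}: missing or empty {key!r}")
    return problems
```

A value such as `[{"description": "All rows", "sql": "SELECT 1"}]` passes, while an entry without a `description` yields one problem.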

Data versions

A VGI worker can publish multiple data versions whose metadata differs. The tool can lint one or all of them and compare quality across versions:

uv run vgi-lint versions <location>            # list published versions
uv run vgi-lint <location> --data-version 2.0.0
uv run vgi-lint <location> --all-data-versions # per-version report + comparison

Baselines (grandfathering)

Adopt the linter on an existing worker without a wall of failures: record current findings as a baseline, then fail CI only on new findings. Baselines are per data version (<prefix>.<version>.json).

uv run vgi-lint <location> --baseline vgi-lint-baseline --update-baseline
uv run vgi-lint <location> --baseline vgi-lint-baseline --fail-on warning
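
The idea behind grandfathering can be sketched in a few lines. The file naming follows the `<prefix>.<version>.json` pattern stated above; everything else (the finding shape with `code` and `object` keys, and both function names) is an assumption made for illustration, not the tool's actual baseline format.

```python
from pathlib import Path


def baseline_path(prefix: str, data_version: str) -> Path:
    """Build the per-data-version baseline file name: <prefix>.<version>.json."""
    return Path(f"{prefix}.{data_version}.json")


def new_findings(current: list[dict], baseline: list[dict]) -> list[dict]:
    """Keep only findings that were not already recorded in the baseline.

    Assumption: a finding is identified by its rule code plus the object
    it was raised on.
    """
    known = {(f["code"], f["object"]) for f in baseline}
    return [f for f in current if (f["code"], f["object"]) not in known]
```

With a baseline in place, only the result of `new_findings` would count toward failing CI.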

Configuration

Configuration lives in a [tool.vgi-lint-check] table in pyproject.toml, or in a dedicated vgi-lint.toml:

[tool.vgi-lint-check]
location = "uv run worker.py"
select = ["ALL"]
ignore = ["VGI113"]
fail_on = "error"

[tool.vgi-lint-check.severity]
VGI201 = "error"

[tool.vgi-lint-check.options]
column_comment_min_ratio = 0.8
# Required tags are opt-in (empty by default) — set them if your workers have a
# tagging convention you want enforced:
# required_schema_tags = ["provider", "domain"]

[tool.vgi-lint-check.per-object]
"volcanos.hans.*" = { ignore = ["VGI112"] }

Precedence: defaults < pyproject.toml < vgi-lint.toml < CLI flags.
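
The precedence chain reads left to right, with later sources winning. A minimal sketch follows, assuming each layer is a flat dictionary and that a later layer replaces a key outright (whether the real tool merges lists such as `ignore` instead is not stated here). The `resolve` helper and the sample values are hypothetical.

```python
def resolve(*layers: dict) -> dict:
    """Merge config layers; each later layer overrides keys from earlier ones."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical values for each source, lowest precedence first.
defaults = {"fail_on": "error", "select": ["ALL"]}
pyproject = {"fail_on": "warning", "ignore": ["VGI113"]}
vgi_lint_toml = {"ignore": ["VGI112"]}
cli_flags = {"fail_on": "never"}

effective = resolve(defaults, pyproject, vgi_lint_toml, cli_flags)
```

Here `fail_on` ends up as the CLI value, `ignore` as the vgi-lint.toml value, and `select` falls through from the defaults.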

Exit codes

Code	Meaning
`0`	Clean, or every finding is below `--fail-on`
`1`	Config or tool error
`2`	Findings at or above `--fail-on` (only regressions count when a baseline is set)
`3`	Connection error
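
How the threshold separates exit code 0 from 2 can be sketched as follows. The severity ordering (info, then warning, then error) and the `never` value are taken from the fail-on values listed for the GitHub Action; that the CLI behaves identically is an assumption, and `exit_code` is a hypothetical helper that ignores codes 1 and 3.

```python
# Assumed ordering: info is the least severe, error the most.
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}


def exit_code(severities: list[str], fail_on: str = "error") -> int:
    """Return 2 when any finding reaches the fail_on threshold, else 0."""
    if fail_on == "never":
        return 0  # findings are reported but never fail the run
    threshold = SEVERITY_RANK[fail_on]
    reached = any(SEVERITY_RANK[s] >= threshold for s in severities)
    return 2 if reached else 0
```

When a baseline is set, the list passed in would hold only the new findings.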

Security / trust boundary

A subprocess LOCATION is executed as a command to launch the worker (the vgi extension spawns it). Treat location like any shell command: never pass an attacker-controlled value, and in CI never derive it from untrusted input (e.g. a fork PR title/branch). Prefer a fixed path or HTTP URL you control.
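
One way to enforce this in a CI wrapper is an explicit allow-list checked before the linter is invoked. The sketch below is not part of vgi-lint; the function name, the allowed command, and the allowed host are all hypothetical placeholders.

```python
from urllib.parse import urlparse

# Hypothetical allow-lists: fixed commands and HTTP hosts you control.
ALLOWED_COMMANDS = {"uv run worker.py"}
ALLOWED_HOSTS = {"localhost"}


def is_trusted_location(location: str) -> bool:
    """Accept only an exact known command or an HTTP(S) URL on a known host."""
    if location in ALLOWED_COMMANDS:
        return True  # exact match only, so appended shell syntax is rejected
    parsed = urlparse(location)
    return parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS
```

Exact matching is deliberate: a value that merely starts with a trusted command is still rejected.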

GitHub Action (reusable)

This repo ships a composite action so a worker repo can lint itself in CI with a single step — it installs uv, runs the linter (the signed vgi community extension is installed automatically), gates on fail-on, and posts the findings to the job summary. Build the worker first, then point the action at it:

# .github/workflows/ci.yml — inside a job that has already built the worker
      - name: VGI metadata quality
        uses: Query-farm/vgi-lint-check@v1
        with:
          location: "$PWD/target/release/units-worker"   # binary, command, or HTTP URL
          fail-on: warning                                 # info | warning | error | never

Gate releases harder than everyday CI — e.g. fail-on: warning on push/PR while the worker's quality is being raised, and fail-on: error (plus execute: true) in the publish workflow:

      - uses: Query-farm/vgi-lint-check@v1
        with:
          location: "$PWD/target/release/units-worker"
          fail-on: error
          execute: true        # also run example queries / CHECK constraints (VGI9xx)

Key inputs: location (required), fail-on (default error), version (pin the linter, e.g. 0.2.0), working-directory, data-version / all-data-versions, baseline, execute, format (terminal|json|agent|jsonl), config, args, summary. The action's exit-code is exposed as an output. The action ref @v1 tracks the latest v1.x of the action; pin to a tag or SHA for full reproducibility.

Development

uv run pytest               # unit tests (offline)
uv run pytest --run-live    # also run live tests against real workers
uv build                    # build sdist + wheel into dist/

Releasing (GitHub Actions → PyPI)

Publishing is automated via GitHub Actions using PyPI Trusted Publishing (OIDC — no API token secret to store):

.github/workflows/ci.yml runs the offline test suite (Python 3.11–3.13) and a smoke build on every push/PR.
.github/workflows/publish.yml builds, validates (twine check), and uploads to PyPI when a GitHub Release is published. It first checks that the release tag matches the version in pyproject.toml.

One-time setup on PyPI (Trusted Publisher), under the project's Publishing settings (use a "pending publisher" before the first release):

Field	Value
Owner	`Query-farm`
Repository	`vgi-lint-check`
Workflow	`publish.yml`
Environment	`pypi`

Also create a GitHub Environment named pypi in the repo settings (it gates the publish job and is referenced for the OIDC claim).

To cut a release:

# bump version in pyproject.toml, commit, then tag + create the release
git tag v0.1.0 && git push origin v0.1.0
gh release create v0.1.0 --generate-notes

The release publishing event triggers the workflow. (Prefer a token instead of OIDC? Replace the publish job's trusted-publishing step with pypa/gh-action-pypi-publish configured with password: ${{ secrets.PYPI_API_TOKEN }} and add that repository secret.)

Release history

0.48.0

Jun 26, 2026

0.47.0

Jun 26, 2026

0.46.0

Jun 26, 2026

0.45.0

Jun 25, 2026

0.44.0

Jun 25, 2026

0.43.0

Jun 25, 2026

0.42.0

Jun 25, 2026

0.41.0

Jun 25, 2026

0.40.0

Jun 25, 2026

0.39.0

Jun 25, 2026

0.38.0

Jun 25, 2026

0.37.0

Jun 25, 2026

0.36.0

Jun 25, 2026

0.35.0

Jun 25, 2026

0.34.0

Jun 25, 2026

0.33.0

Jun 25, 2026

0.32.0

Jun 25, 2026

0.31.0

Jun 25, 2026

0.30.0

Jun 25, 2026

0.29.0

Jun 25, 2026

0.28.0

Jun 25, 2026

0.27.0

Jun 25, 2026

0.26.0

Jun 24, 2026

0.25.0

Jun 24, 2026

0.24.0

Jun 24, 2026

0.23.0

Jun 24, 2026

0.22.0

Jun 24, 2026

0.21.0

Jun 24, 2026

0.20.0

Jun 24, 2026

0.19.0

Jun 24, 2026

0.18.0

Jun 24, 2026

0.17.0

Jun 24, 2026

0.16.0

Jun 24, 2026

0.15.0

Jun 24, 2026

0.14.0

Jun 24, 2026

0.13.0

Jun 24, 2026

0.12.0

Jun 24, 2026

0.11.0

Jun 24, 2026

0.10.0

Jun 24, 2026

0.9.0

Jun 24, 2026

0.8.0

Jun 24, 2026

0.7.0

Jun 24, 2026

This version

0.6.0

Jun 24, 2026

0.5.2

Jun 24, 2026

0.5.1

Jun 24, 2026

0.5.0

Jun 24, 2026

0.4.0

Jun 24, 2026

0.3.0

Jun 24, 2026

0.2.1

Jun 24, 2026

0.2.0

Jun 24, 2026

0.1.0

Jun 24, 2026

Download files

Source Distribution

vgi_lint_check-0.6.0.tar.gz (53.6 kB)

Uploaded Jun 24, 2026 Source

Built Distribution

vgi_lint_check-0.6.0-py3-none-any.whl (73.0 kB)

Uploaded Jun 24, 2026 Python 3

File details

Details for the file vgi_lint_check-0.6.0.tar.gz.

File metadata

Download URL: vgi_lint_check-0.6.0.tar.gz
Upload date: Jun 24, 2026
Size: 53.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vgi_lint_check-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`6da9ebf098ccf9f980d5519492c312d2905c9458c293e14a6e97b2dc36f0808c`
MD5	`c8181f37d834542d6ed0cf8a910051d6`
BLAKE2b-256	`b35bca1c06afdc4f3e116f0c9455fcaaf458802e450ecf9a7adddc7be78208f5`
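
A downloaded file can be checked against the SHA256 digest in the table above with the standard library. This is a minimal sketch: the helper names are hypothetical, and the local path passed in is wherever the file was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for vgi_lint_check-0.6.0.tar.gz.
EXPECTED_SHA256 = "6da9ebf098ccf9f980d5519492c312d2905c9458c293e14a6e97b2dc36f0808c"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `verify(Path("vgi_lint_check-0.6.0.tar.gz"))` should return True for an intact download of the source distribution.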

Provenance

The following attestation bundles were made for vgi_lint_check-0.6.0.tar.gz:

Publisher: publish.yml on Query-farm/vgi-lint-check

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vgi_lint_check-0.6.0.tar.gz
- Subject digest: 6da9ebf098ccf9f980d5519492c312d2905c9458c293e14a6e97b2dc36f0808c
- Sigstore transparency entry: 1939766264
- Sigstore integration time: Jun 24, 2026
Source repository:
- Permalink: Query-farm/vgi-lint-check@bb37d70a57a6e54d81f60fe4e1b5ab3926d83fa5
- Branch / Tag: refs/tags/v0.6.0
- Owner: https://github.com/Query-farm
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bb37d70a57a6e54d81f60fe4e1b5ab3926d83fa5
- Trigger Event: release

File details

Details for the file vgi_lint_check-0.6.0-py3-none-any.whl.

File metadata

Download URL: vgi_lint_check-0.6.0-py3-none-any.whl
Upload date: Jun 24, 2026
Size: 73.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vgi_lint_check-0.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3733ac2ed5e122361d11d07c73178a6cb85498e435ed317908c6d2503439e73d`
MD5	`0e613f6c45f32a0a8dcfa822e2306e4c`
BLAKE2b-256	`cf72537bdd04da61f104bcec9044dfa68b8f979e1887476743fcbd7268d89387`

Provenance

The following attestation bundles were made for vgi_lint_check-0.6.0-py3-none-any.whl:

Publisher: publish.yml on Query-farm/vgi-lint-check

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vgi_lint_check-0.6.0-py3-none-any.whl
- Subject digest: 3733ac2ed5e122361d11d07c73178a6cb85498e435ed317908c6d2503439e73d
- Sigstore transparency entry: 1939766345
- Sigstore integration time: Jun 24, 2026
Source repository:
- Permalink: Query-farm/vgi-lint-check@bb37d70a57a6e54d81f60fe4e1b5ab3926d83fa5
- Branch / Tag: refs/tags/v0.6.0
- Owner: https://github.com/Query-farm
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bb37d70a57a6e54d81f60fe4e1b5ab3926d83fa5
- Trigger Event: release

vgi-lint-check 0.6.0
