Schema introspection and resolution toolkit for SQL agents
Project description
schemadex
Schema introspection and resolution toolkit for SQL agents.
schemadex is a Rust core with Python bindings that turns a live database into something an LLM agent can actually consume: a token-budgeted, sample-aware, fuzzy-resolvable view of every table and column, cached on disk so the agent doesn't re-pay the introspection cost on every turn.
Install
pip install schemadex # core + all backends
pip install "schemadex[langchain,langgraph]" # with framework adapters
Rust (git, not crates.io):
schemadex-core = { git = "https://github.com/AmiyaMandal1/schemadex", features = ["postgres", "sqlite", "duckdb_backend"] }
Documentation
Full docs: https://amiyamandal1.github.io/schemadex/
10-line example
from schemadex import SchemaCache
cache = SchemaCache.from_url("postgres://localhost/mydb")
for name in cache.list_tables():
print(name)
result = cache.resolve("public.orders", "customer_idd")
print(result.matched, result.confidence) # 'customer_id', 0.98
prompt, tokens = cache.describe_for_agent(max_tokens=1500, hint="orders by region")
Benchmarks
Synthetic adversarial corpus (benches/agent-success/): 38 column-name hallucinations modeled after real LLM SQL agent failures — case flips, missing/added underscores, plural/singular drift, semantic near-misses.
Two harnesses, same corpus:
| Harness | Baseline | Treatment | Δ |
|---|---|---|---|
| Literal lookup (no LLM) | 0.0% | 94.7% | +94.7 pp |
qwen2.5-coder:3b via Ollama (real LLM) |
97.4% | 100.0% | +2.6 pp |
Same model in both halves of the LLM row — the difference is describe_for_agent ranking + resolve_column post-correction. The single miss in the LLM baseline is "What is the orderid on shipments?" — the model picked id instead of order_id; resolve_column fixed it.
Median per-record latency: literal harness ~7 µs / 3 µs (schemadex overhead is in the noise); LLM harness ~285 ms (LLM-bound on an M2 Max).
Reproduce:
# Literal (no LLM)
python benches/agent-success/synthetic_corpus.py
python benches/agent-success/run_synthetic.py
# Real LLM via Ollama
ollama serve &
ollama pull qwen2.5-coder:3b
python benches/agent-success/run_ollama.py --model qwen2.5-coder:3b --n 0
This is a micro-benchmark of the resolution path, not an end-to-end LLM agent comparison on BIRD/Spider — those harnesses are scaffolded in baseline.py / treatment.py but require an API key + corpus download. See docs/benchmark.md for methodology.
Why
LLM SQL agents fail in the same three ways over and over:
- They hallucinate column names because they got the schema in a thousand-token blob and forgot half of it.
- They retry the same broken query because they don't know
'No Delay'is the sentinel value covering 80% ofdelay_code. - They re-introspect on every step because there's nowhere obvious to cache the schema.
schemadex fixes all three:
- Resolution:
resolve_column(table, candidate)returns a confidence + alternatives instead of letting the agent guess. - Sampling:
sample_values=Truecollects top-K, percentiles, and flags any value over 40% frequency as a sentinel. - Cache: introspect once, persist to
~/.cache/schemadex/<db>/, refresh on DDL change. On a local 50-table SQLite, warm reads are ~47× cold (cargo bench --bench cache_refresh); on remote Postgres the ratio grows since cold is network-bound.
Backends
| Backend | Feature flag | Status | Credentials |
|---|---|---|---|
| Postgres | postgres |
✅ | URL only |
| SQLite | sqlite |
✅ | URL only |
| DuckDB | duckdb_backend |
✅ | URL only |
| MySQL | mysql |
✅ | URL only |
| BigQuery | bigquery |
✅ | GOOGLE_APPLICATION_CREDENTIALS or ADC (gcloud auth application-default login) |
| Snowflake | snowflake |
✅ | SNOWFLAKE_USER, SNOWFLAKE_ACCOUNT, SNOWFLAKE_PRIVATE_KEY_PATH (key-pair JWT); optional SNOWFLAKE_WAREHOUSE, SNOWFLAKE_ROLE |
| MSSQL | mssql |
✅ | URL (mssql://user:pass@host:port/database; append ?encrypt=true for TLS) |
MCP server
schemadex ships an MCP server. Install with pip install "schemadex[mcp]" and wire it into Claude Code by adding to ~/.claude/mcp.json:
{
"mcpServers": {
"schemadex": {
"command": "schemadex-mcp",
"args": ["--url", "sqlite:///path/to/db.sqlite"]
}
}
}
The agent then has list_tables, describe_for_agent, resolve_column, and run_sql tools without any extra wiring.
Layout
schemadex/
├── crates/
│ ├── schemadex-core/ pure-Rust introspection + cache + resolve
│ └── schemadex-py/ PyO3 bindings (built as `schemadex._native`)
├── python/schemadex/ pure-Python public surface
├── examples/ langchain, langgraph, quickstart
├── benches/agent-success/ benchmark harness (see docs/benchmark.md)
└── docs/ architecture + benchmark methodology
Project status
Pre-1.0. API is still in motion. See road_map.md for milestone tracking.
License
Licensed under either of MIT or Apache-2.0 at your option.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file schemadex-1.0.0.tar.gz.
File metadata
- Download URL: schemadex-1.0.0.tar.gz
- Upload date:
- Size: 160.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.14 {"installer":{"name":"uv","version":"0.11.14","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8bf1740912da03f94b9036c27ee4981b7fd59de3a84903ee504bcd09ed23847b
|
|
| MD5 |
940e61be9dcdc918bc24e38deab20d9a
|
|
| BLAKE2b-256 |
f3933a70913e3773e0335d0a148c53327b2700d3debc98a1c6b3640434353e96
|
File details
Details for the file schemadex-1.0.0-cp39-abi3-win_amd64.whl.
File metadata
- Download URL: schemadex-1.0.0-cp39-abi3-win_amd64.whl
- Upload date:
- Size: 18.7 MB
- Tags: CPython 3.9+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.14 {"installer":{"name":"uv","version":"0.11.14","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a077d3ed3e4d24cecb4a6d7f3ae3ac7d217220aad47ef546c4c264b37c6e159b
|
|
| MD5 |
a54626746547aa3f16b8c833e84c0428
|
|
| BLAKE2b-256 |
61ca435d5e11d1525de5f81a6551ce5bffc2b48c4ad26da364bd5c00d05294da
|
File details
Details for the file schemadex-1.0.0-cp39-abi3-musllinux_1_2_x86_64.whl.
File metadata
- Download URL: schemadex-1.0.0-cp39-abi3-musllinux_1_2_x86_64.whl
- Upload date:
- Size: 34.6 MB
- Tags: CPython 3.9+, musllinux: musl 1.2+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.14 {"installer":{"name":"uv","version":"0.11.14","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c110093f1111a420d85bef974afe97594cb60957e2e6a113ef46b5f4064dd14e
|
|
| MD5 |
b65b3ad904548cb1bbd02e1100c89446
|
|
| BLAKE2b-256 |
9ec14b57d84d76aa997496e8b5d72ef97276134de40ab7c2793922a58f280404
|
File details
Details for the file schemadex-1.0.0-cp39-abi3-manylinux_2_28_x86_64.whl.
File metadata
- Download URL: schemadex-1.0.0-cp39-abi3-manylinux_2_28_x86_64.whl
- Upload date:
- Size: 28.1 MB
- Tags: CPython 3.9+, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.14 {"installer":{"name":"uv","version":"0.11.14","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e046da09bab2d49e80a04fa121dbda2d80e3f77413b223b9ff9a04ca6d229536
|
|
| MD5 |
cf4e3cede13f64485f32b4db268a2355
|
|
| BLAKE2b-256 |
4c2af34121abbd6a5a5266ccf1ecc18460a5da84b278952ac1c98a0941cad2a1
|
File details
Details for the file schemadex-1.0.0-cp39-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: schemadex-1.0.0-cp39-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 21.0 MB
- Tags: CPython 3.9+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.14 {"installer":{"name":"uv","version":"0.11.14","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c5d3df6927f1ba559fc0399ba6d12abe42c7f6a896cd57e932512c1721fc5a5c
|
|
| MD5 |
d17211d405757b3960ed2d1b7b1570eb
|
|
| BLAKE2b-256 |
73f1864a92e733d33e6dd8d59b3a49b8ce5b9912b5e38bf405adf20e1fa25839
|
File details
Details for the file schemadex-1.0.0-cp39-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: schemadex-1.0.0-cp39-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 22.6 MB
- Tags: CPython 3.9+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.14 {"installer":{"name":"uv","version":"0.11.14","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
38df7e7f570437de0bf1f335a995ec3c0a15efcd1a5fa765e75645ee3e21c727
|
|
| MD5 |
e577950a6e7e9d8e97e3984a04641c53
|
|
| BLAKE2b-256 |
40a496f23c848eed407f118c2d5d95c89a75e91be2b4a1d36cdddcfdc7154205
|