`dbread`

Read-only database MCP proxy for AI — safe `SELECT` access with 5-layer defense

Why · Quickstart · Tools · Security Model · Docs

🤔 Why

Handing a raw database connection string to an AI is like handing a stranger your car keys. They probably won't crash it, but you wouldn't bet the car on it.

dbread sits between your AI and your databases and enforces read-only access through five independent layers: if one layer has a bug, the next one still blocks the write.

⚠️ Security note — do not skip. Layer 0 (a read-only DB user, step 2b) is the only non-bypassable guarantee. Layers 1–4 reduce blast radius and make attacks loud — they are not substitutes. If you point dbread at a DB where the configured user can write, a single sqlglot parser gap (past, present, or future) can let a write through. See Known Limitations.

⚡ Quickstart (2 minutes, no clone needed)

1. Install as a tool

# From PyPI (recommended):
uv tool install "dbread[postgres]"          # extras: postgres, mysql, mssql, oracle, duckdb, clickhouse

# OR straight from GitHub (no PyPI needed):
uv tool install "git+https://github.com/tvtdev94/dbread[postgres]"

2. Scaffold config (one command)

dbread init

Creates ~/.dbread/config.yaml, ~/.dbread/.env, and ~/.dbread/sample.db (a tiny read-only SQLite demo so everything works immediately). Prints the exact claude mcp add line to paste in step 4. Skip to step 4 if you only want the demo; otherwise edit config.yaml / .env first (step 3).

2b. Create a read-only DB user (when pointing at a real DB)

See docs/setup-db-readonly.md — copy-paste SQL snippets for PostgreSQL / MySQL / MSSQL / Oracle / SQLite / DuckDB / ClickHouse, plus compat notes for CockroachDB · Timescale · Aurora · SingleStore · PlanetScale · Yugabyte.
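For SQLite there is no database user to restrict, so one way to get a Layer 0 style guarantee is to open the file in read-only mode. The sketch below is an illustration only and may differ from what docs/setup-db-readonly.md recommends; `open_read_only` and `demo` are hypothetical names, while `mode=ro` is a standard SQLite URI parameter.

```python
import os
import sqlite3
import tempfile


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an existing SQLite file so that any write attempt fails."""
    # uri=True makes sqlite3 interpret the "file:...?mode=ro" URI form.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def demo() -> str:
    """Create a throwaway database, reopen it read-only, then try to write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.db")

        # Seed the file through an ordinary read-write connection.
        rw = sqlite3.connect(path)
        rw.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
        rw.execute("INSERT INTO orders (status) VALUES ('paid')")
        rw.commit()
        rw.close()

        ro = open_read_only(path)
        try:
            count = ro.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
            try:
                ro.execute("DELETE FROM orders")
                outcome = "write succeeded"
            except sqlite3.OperationalError:
                # SQLite itself refuses the write on a read-only handle.
                outcome = "write blocked"
        finally:
            ro.close()
        return f"{count} row(s) readable, {outcome}"


if __name__ == "__main__":
    print(demo())
```

The point of the exercise is that the refusal comes from the database engine, not from any SQL parsing in front of it.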

3. Edit `config.yaml` + `.env`

# ~/.dbread/config.yaml
connections:
  mydb:
    url_env: MYDB_URL
    dialect: postgres
    rate_limit_per_min: 60
    statement_timeout_s: 30
    max_rows: 1000
audit:
  path: ~/.dbread/audit.jsonl   # ~ expansion supported
  rotate_mb: 50                  # rotates current → .1 → .2 → .3 (oldest dropped)
  timezone: UTC                  # IANA name; default UTC
  redact_literals: false         # true → SQL literals become "?" in log (PII hardening)

# ~/.dbread/.env
MYDB_URL=postgresql+psycopg2://ai_readonly:password@host:5432/mydb
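The `url_env` key keeps credentials out of `config.yaml` by naming an environment variable instead of holding the URL itself. A minimal sketch of that indirection, assuming simple `KEY=VALUE` lines in `.env`; `parse_dotenv` and `resolve_url` are hypothetical helpers, and the rule that an inline `url` wins over `url_env` is an assumption, not documented dbread behavior.

```python
import os
from typing import Mapping, Optional


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        values[key.strip()] = value.strip()
    return values


def resolve_url(conn: Mapping[str, str],
                env: Optional[Mapping[str, str]] = None) -> str:
    """Return the connection URL from an inline `url` or from `url_env`."""
    env = os.environ if env is None else env
    if "url" in conn:  # assumption: an inline URL takes precedence
        return conn["url"]
    name = conn["url_env"]
    if name not in env:
        raise KeyError(f"environment variable {name} is not set")
    return env[name]


if __name__ == "__main__":
    dotenv = "MYDB_URL=postgresql+psycopg2://ai_readonly:password@host:5432/mydb\n"
    print(resolve_url({"url_env": "MYDB_URL", "dialect": "postgres"},
                      parse_dotenv(dotenv)))
```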

4. Register with Claude Code

claude mcp add --scope user dbread \
  --env DBREAD_CONFIG=/path/to/config.yaml \
  -- dbread

Or without install (one-shot via uvx):

claude mcp add --scope user dbread \
  --env DBREAD_CONFIG=/path/to/config.yaml \
  -- uvx --from "dbread[postgres]" dbread

5. Use it

Restart Claude Code → /mcp → dbread appears. Ask Claude: "list connections in dbread, then count rows per status in the orders table."

Alternative: clone the repo (for development)

git clone https://github.com/tvtdev94/dbread && cd dbread
uv sync --extra postgres --extra dev
cp config.example.yaml config.yaml && cp .env.example .env
claude mcp add --scope user dbread -- uv --directory "$(pwd)" run dbread

Ask Claude: "List connections in dbread, then count rows per status in the orders table."

🏗️ Architecture

Data flow for a query call:

sequenceDiagram
    participant AI as Claude
    participant T as tools.query
    participant G as SqlGuard
    participant R as RateLimiter
    participant D as Database
    participant A as Audit

    AI->>T: query(sql, connection)
    T->>G: validate(sql, dialect)
    alt SQL is DML/DDL
        G-->>T: rejected
        T->>A: log(rejected, reason)
        T-->>AI: ❌ sql_guard error
    else SQL is SELECT
        G-->>T: ✓ plus inject LIMIT N
        T->>R: acquire(connection)
        alt Rate limit hit
            R-->>T: denied
            T->>A: log(rejected, rate_limit)
            T-->>AI: ❌ rate_limit_exceeded
        else Rate limit OK
            R-->>T: ✓
            T->>D: execute(sql)
            D-->>T: rows
            T->>A: log(ok, rows, ms)
            T-->>AI: ✅ rows JSON
        end
    end
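The sequence above can be sketched in a few lines of Python. This illustrates the ordering only (guard, then rate limit, then execute, with an audit entry on every path) and is not the code in tools.py: `QueryPipeline`, `toy_validate`, and the dictionary shapes are hypothetical, and the string-prefix check is a deliberately naive stand-in for the sqlglot AST validation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def toy_validate(sql: str, max_rows: int = 1000) -> str:
    """Naive stand-in for the guard. Do NOT use string checks as a real guard."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        raise ValueError("multi_statement_not_allowed")
    if not stripped.upper().startswith("SELECT"):
        raise ValueError("node_rejected")
    return f"{stripped} LIMIT {max_rows}"


@dataclass
class QueryPipeline:
    validate: Callable[[str], str]   # returns limited SQL or raises ValueError
    acquire: Callable[[], bool]      # True when the rate limiter grants a slot
    execute: Callable[[str], list]   # runs the SQL and returns rows
    audit: List[dict] = field(default_factory=list)

    def query(self, sql: str) -> dict:
        # Step 1: guard. Rejections are audited and never reach the database.
        try:
            limited = self.validate(sql)
        except ValueError as exc:
            self.audit.append({"sql": sql, "status": "rejected", "reason": str(exc)})
            return {"error": "sql_guard", "reason": str(exc)}
        # Step 2: rate limit.
        if not self.acquire():
            self.audit.append({"sql": sql, "status": "rejected", "reason": "rate_limit"})
            return {"error": "rate_limit_exceeded"}
        # Step 3: execute the limited statement, then audit the outcome.
        rows = self.execute(limited)
        self.audit.append({"sql": limited, "status": "ok", "rows": len(rows)})
        return {"rows": rows}


if __name__ == "__main__":
    pipeline = QueryPipeline(toy_validate, lambda: True, lambda sql: [(1,)])
    print(pipeline.query("DELETE FROM users"))
    print(pipeline.query("SELECT 1"))
```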

🧰 Tools

| Tool | Purpose | Input |
|---|---|---|
| `list_connections` | Configured connections + dialects | none |
| `list_tables` | Tables in a connection | `connection`, `schema?` |
| `describe_table` | Columns, types, nullability, PKs, indexes | `connection`, `table`, `schema?` |
| `query` | Run `SELECT`/`WITH`/`EXPLAIN`/`SHOW`. Auto-limited. Rate-limited. Audited. | `connection`, `sql`, `max_rows?` |
| `explain` | Query execution plan | `connection`, `sql` |
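MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request over stdio. The sketch below builds such a message for the `query` tool; `tool_call` is a hypothetical helper, the connection name `mydb` and the SQL are placeholders, and a real client would also perform the MCP `initialize` handshake before sending it.

```python
import json


def tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request that calls one MCP tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(tool_call(1, "query", {
        "connection": "mydb",  # placeholder connection name
        "sql": "SELECT status, COUNT(*) FROM orders GROUP BY status",
        "max_rows": 100,
    }))
```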

🛡️ Security Model

| Layer | Mechanism | What it rejects |
|---|---|---|
| 0 | DB user with `GRANT SELECT` only | All writes (mandatory, non-bypassable) |
| 1 | `sqlglot` AST validation | `INSERT` · `UPDATE` · `DELETE` · `MERGE` · `CREATE` · `ALTER` · `DROP` · `TRUNCATE` · `GRANT` · `REVOKE` · multi-statement (`SELECT 1; DROP...`) · PG CTE-DML trick (`WITH d AS (DELETE...) SELECT...`) · time-based DoS (`pg_sleep*`, `sleep`, `benchmark`, MSSQL `WAITFOR DELAY/TIME`) · function blacklist (`pg_read_file`, `xp_cmdshell`, `load_file`, `dblink_exec`, ClickHouse `url`/`s3`/`remote`, DuckDB `read_csv`/`read_parquet`, …) |
| 2 | Rate limit + `statement_timeout` | Runaway loops · long-running queries |
| 3 | Auto-inject `LIMIT N` | Oversized result sets |
| 4 | JSONL audit log (`fsync` each write, 3-backup rotate, opt-in PII redact) | (detection, not prevention; grep-friendly forensics) |

💡 Principle: Never rely on a single layer. Layer 0 is the guarantee; Layers 1–4 make attacks loud and rare.

Full threat model: docs/security-threat-model.md (STRIDE analysis).
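Layer 2's rate limit is described in the project layout as a thread-safe token bucket per connection. A minimal sketch of that technique follows; the class and its parameters are hypothetical and are not taken from rate_limiter.py. The injectable `clock` exists only to make the behavior easy to test.

```python
import threading
import time


class TokenBucket:
    """Allow up to `per_minute` calls in a burst, refilled continuously."""

    def __init__(self, per_minute: int, clock=time.monotonic):
        self.capacity = float(per_minute)
        self.refill_per_s = per_minute / 60.0
        self.tokens = self.capacity      # start with a full bucket
        self.clock = clock
        self.updated = clock()
        self.lock = threading.Lock()     # serialize concurrent acquire() calls

    def acquire(self) -> bool:
        """Take one token if available; return False when the limit is hit."""
        with self.lock:
            now = self.clock()
            elapsed = now - self.updated
            self.updated = now
            # Refill for the time that has passed, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + elapsed * self.refill_per_s)
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False


if __name__ == "__main__":
    bucket = TokenBucket(per_minute=60)
    granted = sum(bucket.acquire() for _ in range(100))
    print(f"{granted} of 100 immediate calls granted")
```

A bucket like this lives in one process's memory, so separate processes would each get their own allowance.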

⚡ Overhead

dbread adds guard + limit-injection work on every query. Rough p95 per call, measured in-process (no DB round-trip):

| Workload | guard.validate | guard.inject_limit | total overhead |
|---|---|---|---|
| `SELECT 1` | ~0.17 ms | ~0.44 ms | ~0.6 ms |
| Realistic WHERE + ORDER BY | ~0.65 ms | ~1.28 ms | ~1.9 ms |
| 5-CTE 10-table join | ~3.1 ms | ~4.8 ms | ~7.9 ms |

Rate-limit acquire: ~1 µs. Run uv run python scripts/benchmark_overhead.py on your box. Full methodology: docs/benchmarks.md.
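For readers who want to see how a per-call p95 figure of this kind can be obtained, here is a small timing harness. It is a sketch under stated assumptions, not the contents of scripts/benchmark_overhead.py: `p95_ms` is a hypothetical helper, and the function being timed is a placeholder for the guard call you want to measure.

```python
import statistics
import time


def p95_ms(func, *args, runs: int = 200) -> float:
    """Time `func(*args)` repeatedly and return the 95th percentile in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(samples, n=20)[-1]


if __name__ == "__main__":
    # Placeholder workload; swap in the function under test.
    print(f"p95: {p95_ms(str.upper, 'select 1'):.4f} ms")
```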

📋 Example Prompts

💬 "List connections in dbread."
💬 "Describe the schema of the orders table in analytics_prod."
💬 "Top 10 customers by lifetime value — use dbread."
💬 "Run EXPLAIN on: SELECT ... ORDER BY created_at"

💬 "Update user 1 to 'hacked'."
   → ❌ sql_guard: node_rejected: Update

💬 "WITH d AS (DELETE FROM users RETURNING *) SELECT * FROM d"
   → ❌ sql_guard: node_rejected: Delete   (PG CTE-DML blocked)

💬 "SELECT 1; DROP TABLE users;"
   → ❌ sql_guard: multi_statement_not_allowed

📜 Audit Log

Every call lands in audit.jsonl: one JSON object per line, append-only, fsync'd on each write (survives kill -9), auto-rotated at 50 MB through a 3-backup chain (.1 → .2 → .3).

{"ts":"2026-04-22T12:30:12+00:00","conn":"analytics","sql":"SELECT * FROM users LIMIT 100","rows":100,"ms":42,"status":"ok"}
{"ts":"2026-04-22T12:30:15+00:00","conn":"analytics","sql":"DELETE FROM users","rows":0,"ms":0,"status":"rejected","reason":"node_rejected: Delete"}
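The append, fsync, and rotate behavior described above can be sketched with the standard library. This is an illustration of the technique, not the code in audit.py; `rotate` and `append_entry` are hypothetical names, and checking the size before each write is an assumption about when rotation triggers.

```python
import json
import os


def rotate(path: str, backups: int = 3) -> None:
    """Shift path -> .1 -> .2 -> .3, dropping the oldest backup."""
    oldest = f"{path}.{backups}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Move the highest-numbered backups first so nothing is overwritten.
    for i in range(backups - 1, 0, -1):
        src = f"{path}.{i}"
        if os.path.exists(src):
            os.replace(src, f"{path}.{i + 1}")
    if os.path.exists(path):
        os.replace(path, f"{path}.1")


def append_entry(path: str, entry: dict,
                 rotate_bytes: int = 50 * 1024 * 1024) -> None:
    """Append one JSON line and force it to disk before returning."""
    if os.path.exists(path) and os.path.getsize(path) >= rotate_bytes:
        rotate(path)
    line = json.dumps(entry, separators=(",", ":")) + "\n"
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line)
        fh.flush()              # push Python's buffer to the OS
        os.fsync(fh.fileno())   # ask the OS to flush to stable storage
```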

Default timezone is UTC; override with audit.timezone: Asia/Bangkok (IANA). Enable audit.redact_literals: true to rewrite SQL literals to ? before logging — handy when prompts may contain PII.
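To show what literal redaction does to a logged statement, here is a regex-based approximation. It is an assumption-laden sketch, not dbread's implementation (which may well work on the parsed AST instead); `redact_literals` here is a hypothetical function that handles only single-quoted strings and plain numbers.

```python
import re

# Single-quoted SQL strings, including doubled '' escapes inside them.
_STRING = re.compile(r"'(?:[^']|'')*'")
# Standalone integers and decimals; digits inside identifiers (t1) are skipped.
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def redact_literals(sql: str) -> str:
    """Replace string and numeric literals with "?" before logging."""
    # Strings go first so that digits inside quoted text vanish with them.
    return _NUMBER.sub("?", _STRING.sub("?", sql))


if __name__ == "__main__":
    print(redact_literals(
        "SELECT * FROM users WHERE email = 'a@b.com' AND age > 30"
    ))
```

Note that this approximation also rewrites the number in a `LIMIT` clause, and it knows nothing about dialect-specific quoting.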

jq 'select(.status=="rejected")' audit.jsonl     # just rejections
jq 'select(.ms > 1000)' audit.jsonl              # slow queries
jq -s 'group_by(.status)|map({s:.[0].status,n:length})' audit.jsonl   # counts

No jq? Use the built-in analyzer:

dbread audit                     # summary: counts, top slow, top rejected
dbread audit --since 1h          # last hour only
dbread audit --conn analytics    # filter by connection
dbread audit --slow 1000         # queries >= 1000 ms
dbread audit --rejected          # only rejections, grouped by reason
dbread audit --tail              # follow new entries (like tail -f)

Rotated backups (.1 · .2 · .3) are aggregated automatically.
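Because the log is plain JSONL, the same summaries are easy to compute in a script when neither jq nor the CLI fits. The sketch below is hypothetical helper code (`load_entries` and `summarize` are not part of dbread) and relies only on the fields visible in the sample entries above.

```python
import json
from collections import Counter


def load_entries(lines):
    """Parse JSONL text lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize(entries, slow_ms=1000):
    """Count entries by status and rejection reason, and list slow queries."""
    return {
        "by_status": dict(Counter(e["status"] for e in entries)),
        "rejected_reasons": dict(Counter(
            e.get("reason", "") for e in entries if e["status"] == "rejected"
        )),
        "slow": [e["sql"] for e in entries if e.get("ms", 0) >= slow_ms],
    }


if __name__ == "__main__":
    sample = [
        '{"ts":"2026-04-22T12:30:12+00:00","conn":"analytics","sql":"SELECT * FROM users LIMIT 100","rows":100,"ms":42,"status":"ok"}',
        '{"ts":"2026-04-22T12:30:15+00:00","conn":"analytics","sql":"DELETE FROM users","rows":0,"ms":0,"status":"rejected","reason":"node_rejected: Delete"}',
    ]
    print(summarize(load_entries(sample)))
```

To include rotated backups, pass the lines of each file in turn, for example through `itertools.chain`.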

🗂️ Config

config.yaml (gitignored — safe to edit with real values):

connections:
  analytics_prod:
    url_env: ANALYTICS_PROD_URL        # credentials from .env
    dialect: postgres
    rate_limit_per_min: 60
    statement_timeout_s: 30
    max_rows: 1000

  local_mysql:
    url: mysql+pymysql://readonly:pw@localhost/shop
    dialect: mysql
    rate_limit_per_min: 120
    statement_timeout_s: 15
    max_rows: 500

  local_duckdb:
    url: duckdb:///./analytics.duckdb?access_mode=read_only
    dialect: duckdb
    rate_limit_per_min: 200
    statement_timeout_s: 30
    max_rows: 5000

  clickhouse_prod:
    url_env: CLICKHOUSE_URL            # clickhouse+http://readonly:pw@host:8123/db
    dialect: clickhouse
    rate_limit_per_min: 60
    statement_timeout_s: 30
    max_rows: 1000

audit:
  path: ~/.dbread/audit.jsonl         # ~ expansion supported
  rotate_mb: 50                        # rotate chain: current → .1 → .2 → .3
  timezone: UTC                        # IANA; default UTC
  redact_literals: false               # true → SQL literals → "?"

Supported dialects: postgres · mysql · mssql · sqlite · oracle · duckdb · clickhouse.

Compat (no new dialect): CockroachDB, TimescaleDB, Aurora PG (use postgres) · Aurora MySQL, SingleStore, PlanetScale (use mysql). See docs/setup-db-readonly.md.

🧪 Testing

uv sync --extra dev
uv run pytest                          # 122 passing
uv run pytest --cov=dbread             # coverage report (92% overall, 85% server.py)
uv run ruff check src/                 # lint

# Integration tests with real PG + MySQL + ClickHouse (needs Docker):
(cd tests/integration && docker compose up -d)   # subshell: stay in the repo root
uv run pytest tests/integration/ -v

~110 unit tests cover config, connections, audit (fsync/tz/redact/rotate), SQL guard (57 evasion cases incl. WAITFOR & sleep variants), rate limiter, tools.
4 subprocess smoke tests drive server.py via real stdio JSON-RPC.
4 SQLite + 4 DuckDB E2E tests always run (no Docker).
4 PG + 4 MySQL + ClickHouse E2E tests skip gracefully without Docker.
CI runs on GitHub Actions matrix: Python 3.11/3.12 × Ubuntu/Windows.

⚠️ Known Limitations

Honesty pass — what dbread does not do:

sqlglot is best-effort, dialect-dependent. Coverage is strong for Postgres / MySQL / SQLite; medium for MSSQL / Oracle / ClickHouse / DuckDB. See the dialect coverage table. Function blacklists are deny-lists; new dialect features arrive between releases.
Rate limit is single-process, in-memory. Multiple dbread processes (multi-user install) don't share buckets. global_rate_limit_per_min caps one process only.
Audit is reactive, not preventive. JSONL + dbread audit help you notice; they don't block.
No query cost estimator. Layers 2 and 3 provide statement_timeout and LIMIT N, but an expensive index-less scan that finishes in time still runs.
Pre-1.0 project. Real-world adversarial testing accumulates over time. Treat dbread as one layer of defense, not the whole perimeter.

📚 Docs

| Document | What's in it |
|---|---|
| `docs/setup-db-readonly.md` | Copy-paste SQL for Layer 0 DB user on PG / MySQL / MSSQL / Oracle / SQLite / DuckDB / ClickHouse |
| `docs/architecture.md` | Component diagram · 5-layer details · data flow · design decisions |
| `docs/security-threat-model.md` | Full STRIDE analysis · residual risks · response plan |
| `docs/manual-smoke-test.md` | Step-by-step checklist for verifying integration with Claude Code |

🧱 Project Layout

src/dbread/
├── server.py         # MCP stdio entry — registers 5 tools
├── tools.py          # tool handlers (guard → limit → rate → exec → audit)
├── sql_guard.py      # sqlglot AST validator + LIMIT injection
├── rate_limiter.py   # thread-safe token bucket per connection
├── connections.py    # SQLAlchemy engine manager (lazy, per-dialect)
├── config.py         # pydantic Settings (YAML + env)
└── audit.py          # append-only JSONL with size rotation

Every source file is under 200 LOC — designed to be readable end-to-end.

🙏 Credits

Built with mcp · sqlglot · SQLAlchemy 2.x · pydantic · uv.

Made with ❤️ for developers who want AI productivity without giving up database safety.
