Deterministic, validation-first Text-to-SQL system for business databases

These details have not been verified by PyPI

Project links

Homepage

Project description

AetherDialect — validation-first Text-to-SQL for PostgreSQL and Databricks

aetherdialect turns analytical questions into read-only SELECT pipelines: a structured intent representation, multi-stage validation (including dialect EXPLAIN), template reuse from accepted answers, and bounded learning from rejections. The language model fills bounded slots in that intent; it does not author unconstrained SQL.

Why this exists

Teams need answers from relational data without shipping opaque generated SQL. AetherDialect targets analysts and integrators who want a repeatable path from question to result: the same question can return cached SQL with no model round-trip, schema drift surfaces as an explicit migration stop instead of silent breakage, and every generated statement is checked against the catalog and engine before it runs.

Install

pip install aetherdialect
pip install "aetherdialect[postgresql]"
pip install "aetherdialect[databricks]"
pip install "aetherdialect[postgresql,databricks]"

Requires Python 3.10 or newer. Configure the LLM and database via a TOML config_file (recommended) and/or process environment; the full key list lives in the API reference.

Quick start

from aetherdialect import SchemaContext, Text2SQL

t2s = Text2SQL(
    SchemaContext(),
    artifacts_dir="./my_run",
    config_file="./aetherdialect.toml",
)
t2s.run_interactive()

run_interactive prompts once per invocation; call it again for another question. For programmatic UIs, use Text2SQL.session() or Text2SQL.asession() and drive SessionStep objects — see the Integrator guide.

dry_run_warmup exercises a newline-delimited seed question file through validation and execution without persisting templates; see the User guide — Seed warmup.

What makes this different

Constant-learning cache: exact q_norm reuse returns SQL with zero LLM calls; near-paraphrases (token Levenshtein at most 2) reuse the same template with one bounded LLM call that only extracts parameters. (How it works)
Schema overrides are a JSON file you read, edit, and version. Every override (descriptions, roles, sensitivity, added or suppressed foreign keys, primary key endorsements) is replayed on every cache invalidation. (API reference)
Migration is never silent. When the catalog changes structurally, the engine writes a schema_migration_map.json skeleton and stops. You decide the action; it resumes. (User guide)
Generated SQL passes through four validation layers (intent JSON, dialect AST, schema/catalog alignment, dialect EXPLAIN). The LLM never emits raw SQL; it fills bounded slots in a structured intent IR. (Security)
Reader / writer split is built in. Many readers can ask questions; the engine drains write_queue.jsonl at the start of every writer-mode turn under the artifacts lock so learning persists without readers touching the partitioned template store files. (Integrator guide)

Documentation

Doc	When to read it
User guide	Install, first run, asking questions, notes, overrides, migration, seed warmup and dry-run warmup, pitfalls.
Integrator guide	Embedding patterns, sessions, multi-turn relay, threading, reader/writer split and queue, audit and diagnostics, cache reset.
API reference	Types, `config_file` TOML schema, methods, schema overrides JSON, diagnostic codes, exceptions.
How it works	Architecture diagrams, schema build, engine storage, question pipeline, migration, overrides, validation, learning model, configuration, observability, warmup/QSim, offline-mock design pointer.
Offline testing and mock LLM (design)	Planned mock LLM provider and fixture workflow for hermetic tests; links to `dev_workspace/mock.txt` (not on PyPI).
Security	Threat model, LLM context inventory, on-disk inventory, sensitivity model, deny lists, raw-SQL impossibility, EXPLAIN gate, network.
Support matrix	Per-engine table, IR-unsupported constructs and reformulations.

License

See LICENSE.

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

This version

0.1.7

May 20, 2026

0.1.6

May 6, 2026

0.1.5

May 5, 2026

0.1.4

May 5, 2026

0.1.3

Apr 30, 2026

0.1.2

Apr 10, 2026

0.1.1

Apr 6, 2026

0.1.0

Mar 30, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aetherdialect-0.1.7.tar.gz (958.3 kB view details)

Uploaded May 20, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

aetherdialect-0.1.7-py3-none-any.whl (659.6 kB view details)

Uploaded May 20, 2026 Python 3

File details

Details for the file aetherdialect-0.1.7.tar.gz.

File metadata

Download URL: aetherdialect-0.1.7.tar.gz
Upload date: May 20, 2026
Size: 958.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for aetherdialect-0.1.7.tar.gz
Algorithm	Hash digest
SHA256	`a58fafd3fe3972bf655ef19f56e4f2c1209ca6117778e8350938f4649154c2de`
MD5	`a248328482b6e5dff6f02175b219eebb`
BLAKE2b-256	`477cbc09f953bd39cb7acccce11ed9f5f5ed415237fc1c959ddc924eb4f3f3e1`

See more details on using hashes here.

File details

Details for the file aetherdialect-0.1.7-py3-none-any.whl.

File metadata

Download URL: aetherdialect-0.1.7-py3-none-any.whl
Upload date: May 20, 2026
Size: 659.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for aetherdialect-0.1.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`328edc535d502a23084637db5a75ef403c6b1e7f8bb79e68e3065858c426f915`
MD5	`87457f0eaa330ba7f6079275636dc0aa`
BLAKE2b-256	`854a3186065bee1c9ebc247f7afe86555248801e010eb23998af43d418a012c2`

See more details on using hashes here.

aetherdialect 0.1.7

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

AetherDialect — validation-first Text-to-SQL for PostgreSQL and Databricks

Why this exists

Install

Quick start

What makes this different

Documentation

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes