AST-driven SQL pattern extraction and index recommendation engine.

These details have not been verified by PyPI

Project links

Project description

AuraIndex

AuraIndex is an AST-driven index recommendation engine for databases. It statically scans application code, extracts SQL and ORM query patterns, parses them into a SQL AST, aggregates workload-style usage signals, and proposes guarded index recommendations.

Why this project

Traditional index tuning tools depend on live database telemetry. AuraIndex targets earlier design stages by mining query intent from source code:

Language AST frontends extract query intent from code.
A shared SQL AST backend normalizes patterns across sources.
Heuristics generate index suggestions with explicit confidence and review flags.

This architecture makes it practical to support multiple languages over time while keeping one recommendation core.

Current scope

Tree-sitter WASM extraction for Python, JavaScript/TypeScript, and PHP
- raw SQL call detection (execute, executemany, query, raw, text)
- SQL-like query builder chain lifting (where/filter/join/order/group patterns)
SQL file extraction (.sql)
SQL AST analysis with sqlglot
Index heuristics:
- single-column filter keys
- join keys
- composite filter keys
- where+order composite patterns
Impact model (enabled by default):
- read benefit vs write amplification scoring
- net impact adjustment on suggestion ranking
Schema-aware checks (enabled by default):
- existing index detection from SQL DDL (CREATE INDEX)
- exact/prefix overlap detection (including unique/partial overlaps)
- automatic suppression of already-covered suggestions
Migration artifact generation (enabled by default):
- up/down SQL script for suggested indexes
Optional EXPLAIN validation:
- sqlite built-in runner (EXPLAIN QUERY PLAN)
- custom external command runner for postgres/mysql
Engine-aware index type hints (enabled by default):
- btree / hash / brin / gin heuristics
- typed DDL variant rendering (when applicable)
Schema drift watch (enabled by default):
- possibly unused existing index candidates
- missing index candidate summary
Query optimization advisories (enabled by default):
- SELECT * detection
- ORDER BY without LIMIT
- leading wildcard LIKE warnings
Guardrails:
- deduplication
- confidence scoring
- per-table cap on suggestions
- manual-review flags
Unified Tree-sitter WASM frontend (web-tree-sitter) for supported languages

Architecture

Extractors (language-specific frontend)
- WasmTreeSitterExtractor invokes a Node script using Tree-sitter WASM for Python/JS/TS/PHP.
- extract_sql_file loads SQL statements directly from .sql files.
SQL AST analyzer (shared backend)
- Parses candidate SQL with sqlglot.
- Extracts tables, predicate columns, join columns, ordering/grouping columns.
- Builds query-level feature records.
Workload aggregation
- Counts column participation by role (WHERE, JOIN, ORDER BY, GROUP BY).
- Tracks composite filter patterns and where+order pair frequency.
Index suggestion engine
- Applies heuristic scoring with confidence.
- Applies impact scoring (read gain vs write cost).
- Performs schema-aware overlap checks against discovered existing indexes.
- Emits dialect-aware CREATE INDEX statements.
- Adds ORM snippet equivalents for SQLAlchemy and Django.

Installation

cd AuraIndex
python -m venv .venv
.venv\Scripts\activate
pip install -e .[dev]
npm install

For live database introspection (Postgres/MySQL drivers):

pip install -e .[dev,db]

After publishing to PyPI, install the package with:

pip install AuraIndex

Quick start

auraindex --path . --dialect postgres --min-support 2 --show-top

Outputs:

By default, AuraIndex writes reports to a temp folder: %TEMP%\auraindex-report-*
On completion, AuraIndex automatically opens the interactive HTML report in your default browser.
Artifacts:
- auraindex_report.json: extraction, parsing, workload, and suggestion details
- auraindex_indexes.sql: executable SQL index script (DDL = Data Definition Language)
- auraindex_migration.sql: up/down migration SQL (create + rollback)
- auraindex_orm_snippets.md: ORM equivalents
- auraindex_report.md: detailed human-readable report with ranking and evidence
- auraindex_report.html: interactive report with filters, copy buttons, and evidence drill-down

Useful CLI flags

--report-dir D:\reports\scan-01: write outputs to a fixed directory.
--json-out custom.json, --ddl-out custom.sql, --migration-out migration.sql, --orm-out custom.md, --md-out details.md, --html-out ui.html
--show-top: print ranked recommendations in terminal.
--limit-suggestions 25: optionally limit ranked suggestions shown in markdown/HTML/terminal; by default AuraIndex shows all suggestions.
--strict-parse: exit code 2 when parse failures are found.
--disable-wasm: scan only .sql files (debug mode).
--disable-impact-scoring: turn off net impact rank adjustment.
--disable-schema-checks: turn off existing-index overlap suppression.
--disable-index-type-hints: turn off engine-aware index type recommendation metadata.
--disable-drift-watch: turn off schema drift watch output.
--disable-query-advisories: turn off query optimization advisories.
--max-query-advisories 150: cap advisory rows in report metadata.
--enable-db-introspection: enable live DB schema/index introspection.
--db-dialect postgres|mysql|sqlite: live DB dialect (defaults to --dialect if omitted).
--db-host, --db-port, --db-user, --db-password / --db-password-env, --db-name
--db-path (sqlite), --db-schema (postgres schema filter), --db-timeout-seconds
--disable-migration-output: skip writing migration SQL artifact.
--explain-validate --explain-db-path D:\db.sqlite: run sqlite EXPLAIN validation.
--explain-validate --explain-command "psql ... -c \"EXPLAIN {sql}\"": run custom EXPLAIN command.

Use help anytime:

auraindex --help

WASM grammar files (required for source-code scanning)

npm install tree-sitter-python tree-sitter-javascript tree-sitter-typescript tree-sitter-php
auraindex --path .

AuraIndex auto-detects grammars from local node_modules.

Default excluded directories include generated assets and build outputs:

.git, node_modules, __pycache__, .venv, venv
build, dist, staticfiles, .next

This avoids parse noise from minified vendor bundles in production/static folders.

Cardinality hints (optional)

You can provide a JSON file to bias recommendations:

{
  "users.email": "high",
  "orders.status": "low"
}

Use:

auraindex --path . --cardinality-hints cardinality_hints.json

Hints are guardrails only; they do not replace runtime telemetry.

Example report fields

auraindex_report.json includes:

summary: scanned files, candidate count, parse success/failures, suggestion count
candidates: extracted SQL with file, line, function, source_kind
workload: per-table and per-column usage metrics
suggestions: index DDL, confidence, evidence, impact metadata, schema checks, and manual-review flags
detailed_index_evidence: per-index support counts, matched query excerpts, and source locations

How confidence is computed

Confidence is a bounded heuristic score in [0.50, 0.98], not a probabilistic guarantee.

Join-key suggestions start higher (~0.80) and rise with repeated JOIN evidence.
Filter-key suggestions start lower (~0.74) and rise with repeated WHERE evidence.
Composite suggestions start higher (~0.82-0.84) because repeated multi-column patterns are stronger signals.
Cardinality hints can nudge confidence (high raises, low lowers).
Manual-review flag is set for lower-confidence suggestions and wider composites.

In short: more repeated, structurally strong evidence => higher confidence.

What `--min-support` means

--min-support is the minimum repetition threshold before AuraIndex recommends an index.

Example:

--min-support 1: include one-off patterns (higher recall, more noise)
--min-support 3: include only repeated patterns (cleaner, more conservative)

Use lower values for exploration, higher values for production-oriented recommendations.

Composite index handling

AuraIndex explicitly models composite opportunities and prunes redundant suggestions:

Detects repeated multi-column WHERE patterns
Detects WHERE + ORDER BY patterns for left-prefix composites
Suppresses redundant single-column suggestions when a stronger composite prefix exists
Limits per-table output via --max-indexes-per-table

Impact scoring model (default ON)

AuraIndex estimates whether an index is likely to be worth it:

Read benefit: weighted by WHERE/JOIN/ORDER/GROUP usage and support evidence.
Write cost: estimated from table write ratio and index width.
Net impact = read benefit - write cost.

This net impact adjusts ranking (score_adjustment) and is shown in JSON/Markdown/HTML reports.

Disable with --disable-impact-scoring when you want raw heuristic-only ranking.

Schema-aware checks (default ON)

AuraIndex scans discovered SQL DDL and extracts existing indexes, then checks each suggestion for:

exact match (skip_exact_existing)
covered by existing left-prefix (skip_covered_by_existing_prefix)
narrower overlap requiring review (review_overlaps_existing_prefix)
unique/partial overlap signals

Suggestions already covered by existing indexes are skipped by default to reduce noise.

Disable with --disable-schema-checks if you want full raw suggestions.

With --enable-db-introspection, AuraIndex also pulls existing indexes directly from live DB metadata:

PostgreSQL: pg_indexes
MySQL: INFORMATION_SCHEMA.STATISTICS
SQLite: PRAGMA index_list + PRAGMA index_info

This significantly improves practical overlap detection when schema is not fully represented in code migrations.

Migration SQL (up/down)

AuraIndex emits auraindex_migration.sql with:

Up: suggested CREATE INDEX ...
Down: corresponding rollback (DROP INDEX ...)

Disable file generation with --disable-migration-output.

Optional EXPLAIN validation

AuraIndex can annotate suggestions with execution-plan checks:

SQLite mode: --explain-validate --explain-db-path <path>
External mode: --explain-validate --explain-command "<command with {sql}>"

Results are included in report metadata and rendered in Markdown/HTML.

Engine-aware index type hints (default ON)

AuraIndex recommends an index access method per suggestion (dialect-aware):

btree as baseline default
hash for equality-heavy single-column patterns (low write ratio)
brin for time-correlated columns in append-style workloads
gin for JSON/metadata-like columns
ivfflat hint for vector-like columns (manual operator-class tuning required)

The report includes rationale and typed DDL variant when safely renderable.

Disable with --disable-index-type-hints.

Schema drift watch (default ON)

AuraIndex emits drift-oriented metadata:

possibly unused existing indexes (non-unique indexes with no observed column usage)
missing index candidates from current scan

Rendered in Markdown/HTML under Schema Drift Watch.

Disable with --disable-drift-watch.

When live DB introspection is enabled, drift watch uses live index inventory as the primary signal.

Query optimization advisories (default ON)

AuraIndex includes static query-level improvement hints:

SELECT * projection warning
ORDER BY without LIMIT warning
leading wildcard LIKE '%x' warning

Rendered in Markdown/HTML under Query Optimization Advisories.

Disable with --disable-query-advisories.

Latest end-to-end validation snapshot

Run command:

auraindex --path D:\SuccessPilot --dialect postgres --min-support 2 --report-dir D:\AutoCure\AuraIndex\reports\successpilot-strong-adds

Observed output snapshot (latest run):

scanned files: 1002
query candidates: 62
parsed queries: 62
parse failures: 0
suggested indexes: 17
schema checks: existing index discovery and overlap checks enabled by default

Generated artifacts:

auraindex_report.json
auraindex_indexes.sql
auraindex_migration.sql
auraindex_orm_snippets.md
auraindex_report.md
auraindex_report.html

Live DB mode examples:

# PostgreSQL
auraindex --path . --dialect postgres --enable-db-introspection --db-dialect postgres --db-host localhost --db-port 5432 --db-user app --db-password-env DB_PASSWORD --db-name appdb --db-schema public

# MySQL
auraindex --path . --dialect mysql --enable-db-introspection --db-dialect mysql --db-host localhost --db-port 3306 --db-user app --db-password-env DB_PASSWORD --db-name appdb

# SQLite
auraindex --path . --dialect sqlite --enable-db-introspection --db-dialect sqlite --db-path D:\data\app.db

Per-service verification (SuccessPilot microservices):

ai-worker: fail=0
analytics: fail=0
backend: fail=0 (after excluding staticfiles generated assets)
frontend: fail=0 (after excluding build artifacts)
frontend_new: fail=0
live-updates: fail=0
onboarding: fail=0

Limitations

Static analysis cannot observe runtime frequency, parameter distributions, or write pressure.
ORM-to-SQL reconstruction is approximate for complex query builders.
EXPLAIN validation is optional and environment-dependent (DB connectivity and compatible SQL needed).
Suggestions are intentionally conservative and should still be reviewed before production rollout.

Testing

pytest -q

Build and publish (PyPI)

python -m pip install -U build twine
python -m build
twine check dist/*
# twine upload dist/*

Release checklist:

Package name available on PyPI.
Version bumped (pyproject.toml).
Long description renders (twine check passes).
LICENSE and README.md included in source distribution.

Roadmap

Add richer ORM semantic lifting (SQLAlchemy Core/ORM constructs).
Add log-based dynamic signal fusion.
Add adaptive suggestion ranking using execution metrics.
Expand language frontends while retaining the shared SQL AST backend.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.1.4

Mar 15, 2026

0.1.3

Mar 15, 2026

0.1.1

Mar 15, 2026

This version

0.1.0

Mar 14, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

auraindex-0.1.0.tar.gz (54.3 kB view details)

Uploaded Mar 14, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

auraindex-0.1.0-py3-none-any.whl (47.4 kB view details)

Uploaded Mar 14, 2026 Python 3

File details

Details for the file auraindex-0.1.0.tar.gz.

File metadata

Download URL: auraindex-0.1.0.tar.gz
Upload date: Mar 14, 2026
Size: 54.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for auraindex-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`97a1e5b24e306490f161a6f3340d36e95043a6a6fe84a3178f0d3b94f6cd9291`
MD5	`e807cd24d1cdc08ca0b433ad75ed40c1`
BLAKE2b-256	`355efa4979100cba1e27ccb2db0a11ac1a3505fc5da29ebb4606d3fcee507368`

See more details on using hashes here.

File details

Details for the file auraindex-0.1.0-py3-none-any.whl.

File metadata

Download URL: auraindex-0.1.0-py3-none-any.whl
Upload date: Mar 14, 2026
Size: 47.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for auraindex-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5bbba991a3a4e621f83e675526ff5135a83473f56144eb54c16b5616a3494390`
MD5	`58e3fa9080eef3504add4fa9db0eeade`
BLAKE2b-256	`a31f1d0de676876bc1942c424c992b319ea7a818d33fa754ced8c537d16a0a68`

See more details on using hashes here.

AuraIndex 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

AuraIndex

Why this project

Current scope

Architecture

Installation

Quick start

Useful CLI flags

WASM grammar files (required for source-code scanning)

Cardinality hints (optional)

Example report fields

How confidence is computed

What --min-support means

Composite index handling

Impact scoring model (default ON)

Schema-aware checks (default ON)

Migration SQL (up/down)

Optional EXPLAIN validation

Engine-aware index type hints (default ON)

Schema drift watch (default ON)

Query optimization advisories (default ON)

Latest end-to-end validation snapshot

Limitations

Testing

Build and publish (PyPI)

Roadmap

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

What `--min-support` means