Tree-sitter grammar for SQLite's SQL dialect plus dot-commands, faithful to upstream parse.y.
tree-sitter-sqlite3
A tree-sitter grammar for SQLite's SQL dialect plus the sqlite3 CLI
dot-commands. Translated from upstream
parse.y and
tokenize.c —
every production, precedence rule, and %fallback mirrored.
Tracks sqlite 3.47.0. Bindings: c, go, node, python, rust, swift.
Validated on every push by seven harnesses totalling ~80 000 SQL
inputs — including a differential against libsqlite3 3.47.0 over
38 043 fragments extracted from sqlite's own test/*.test, plus
libFuzzer + ASAN. Zero unallowlisted "sqlite-accepts / we-reject"
divergences. See Validation.
Coverage
Full DML / DDL / CTEs (incl. recursive) / window functions /
compound SELECT / upsert / RETURNING / generated columns /
STRICT / WITHOUT ROWID / dot-commands / ATTACH / PRAGMA /
VACUUM / REINDEX / ANALYZE / EXPLAIN / SAVEPOINT / transactions.
Syntax added through sqlite 3.44 included (aggregate-arg ORDER BY,
RIGHT/FULL JOIN, UPDATE FROM, vector-form SET (a,b)=(...),
VACUUM INTO <expr>, NULLS FIRST/LAST, count(DISTINCT)).
Queries: highlights.scm, locals.scm, tags.scm.
Validation
CI runs seven harnesses on every push (~80 000 inputs total):
| harness | inputs | bar |
|---|---|---|
| tree-sitter test (hand-written corpus) | 147 | 100 % |
| upstream-corpus (sqlite's own test/*.test) | 38 043 | ≥ 99.5 % |
| differential vs libsqlite3 3.47.0 | 38 043 | 0 unallowlisted SS-AR (sqlite-accepts / we-reject) |
| grammar-coverage (every named node type hit) | 100 types | 100 % |
| snapshot regression (byte-exact s-exprs) | 147 | byte-exact |
| extras-placement (comments between every adjacent token pair) | 1 220 | 100 % |
| roundtrip property (range / leaf-concat / monotonicity) | 147 | 100 % |
Plus libFuzzer + ASAN on the parser .so and a mutation fuzzer
against libsqlite3.
An external scanner (src/scanner.c) handles lexer-level strictness
(malformed blob/numeric literals, number-fused-to-identifier).
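The differential row in the table reduces to a filtered set difference. A minimal sketch follows, assuming Python's stdlib sqlite3 module as the libsqlite3 oracle (it links whatever sqlite version your Python ships, not necessarily 3.47.0) and a hypothetical `grammar_accepts` callable standing in for the tree-sitter parse check. The function names and the allowlist shape are illustrative, not taken from this repository.

```python
import sqlite3
from typing import Callable, Iterable, List, Set


def sqlite_accepts(sql: str) -> bool:
    """True if the linked libsqlite3 can prepare a single statement.

    Prefixing EXPLAIN makes sqlite compile the statement without running it.
    The database is empty, so statements that name tables fail here and are
    simply never counted as divergences by this sketch.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("EXPLAIN " + sql)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # sqlite3.Warning covers the multi-statement case on older Pythons.
        return False
    finally:
        conn.close()


def unallowlisted_divergences(
    fragments: Iterable[str],
    grammar_accepts: Callable[[str], bool],  # hypothetical parse check
    allowlist: Set[str],
) -> List[str]:
    """Fragments sqlite prepares but the grammar rejects, minus the allowlist."""
    return [
        sql
        for sql in fragments
        if sql not in allowlist
        and sqlite_accepts(sql)
        and not grammar_accepts(sql)
    ]
```

A CI job built on this idea would fail whenever the returned list is non-empty.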
Scope
Syntactic only. Mirrors tokenize.c + parse.y, not the
semantic-validation layer that runs during
sqlite3_prepare_v2's code-gen. ~78 inputs we accept get rejected
by sqlite at prepare time (build-flag-dependent productions, parse-time
semantic checks); see docs/allowlists.md
for the taxonomy. Layer your own semantic checks on top.
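One way to layer a semantic check on top is to ask sqlite itself to prepare whatever the grammar accepted. A minimal sketch, assuming Python's stdlib sqlite3; `prepare_error` and its `schema` parameter are hypothetical names, not part of this package.

```python
import sqlite3
from typing import Optional


def prepare_error(sql: str, schema: str = "") -> Optional[str]:
    """Return sqlite's prepare-time error for one statement, or None if it compiles.

    `schema` is an optional script (CREATE TABLE ...) run first so that name
    resolution has something to resolve against.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement without executing it.
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


schema = "CREATE TABLE t(a);"
print(prepare_error("SELECT a FROM t", schema))  # compiles against the schema
print(prepare_error("SELECT b FROM t", schema))  # well-formed, but unknown column
```

Both statements are syntactically valid, so a purely syntactic grammar should accept them; only the second is rejected once names are resolved against the schema.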
Build
Inside the dev container:
docker compose build
docker compose run --rm dev tree-sitter generate
docker compose run --rm dev tree-sitter test
Or with a host tree-sitter-cli@0.25: parser.c is already checked in,
so tree-sitter test works directly from a fresh clone.
Upstream tracking
Vendored under vendor/ with sha256 pins and an update runbook
(vendor/README.md):
- parse.y: productions, precedence, %fallback.
- tokenize.c: character classes, literal forms, comments.
- mkkeywordhash.c: canonical keyword list + masks.
- shell.c: dot-command list (sourced separately from parse.y).
Update loop: bump vendor → diff parse.y → mirror in grammar.js
→ add fixtures → tree-sitter generate → commit src/.
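Checking vendored files against their sha256 pins can be sketched as follows. The pin format used here (a dict mapping relative path to hex digest) and the function names are assumptions for illustration; the actual layout and procedure are whatever vendor/README.md describes.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hex sha256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_pins(root: Path, pins: Dict[str, str]) -> List[str]:
    """Relative paths that are missing or whose digest differs from the pin."""
    bad = []
    for rel, expected in pins.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected.lower():
            bad.append(rel)
    return bad
```

An empty result means every pinned file is present and unchanged; after a vendor bump, the non-empty result lists which files need their pins refreshed.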
Translation notes
- %fallback: _identifier as choice(identifier, ...keyword_tokens).
- %wildcard ANY: ambiguity resolved via conflicts.
- %ifdef SQLITE_OMIT_*: always parse the un-OMIT form.
- Lemon semantic actions (C blocks): not translated; downstream consumers do semantic validation.
License
CC0-1.0 (mirrors SQLite's public-domain stance). Vendored sqlite
sources under vendor/ are themselves public-domain per
https://www.sqlite.org/copyright.html.