Ephemeral SQL index over a local directory

dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Handled by .github/workflows/publish.yml (invoked from minor-release.yml / patch-release.yml). For each target triple, the build job runs cargo build on the Rust CLI with --features cli, stages the binary into python/dirsql/_binary/, and runs maturin build, which picks the binary up via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then published to PyPI through trusted publishing.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    print(results)
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())

Multiple Tables and Joins

# Runs inside an async function; root, json, DirSQL, and Table are set up as in Quick Start.
db = DirSQL(
    root,
    tables=[
        Table(
            ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
            glob="posts/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
        Table(
            ddl="CREATE TABLE authors (id TEXT, name TEXT)",
            glob="authors/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
    ],
)
await db.ready()

results = await db.query("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
""")

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.
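The lambdas used elsewhere on this page can also be written as a named function, which is easier to test on its own. The following is a minimal sketch of such a function for the comments table from the Quick Start; the name extract_comments and the blank-line handling are illustrative choices, not part of dirsql.

```python
import json
import os


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one JSONL file into rows for a comments(id, body, author) table."""
    # Use the parent directory name as the row id, as in the Quick Start.
    comment_id = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines instead of failing in json.loads
        record = json.loads(line)
        rows.append(
            {
                "id": comment_id,
                "body": record["body"],
                "author": record["author"],
            }
        )
    return rows


# The function needs no database, so it can be exercised directly.
sample = '{"body": "looks good", "author": "alice"}\n\n{"body": "needs work", "author": "bob"}\n'
rows = extract_comments("comments/abc/index.jsonl", sample)
print(rows)
```

A function like this would be passed as extract=extract_comments in a Table definition.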

DirSQL(root=None, *, tables=None, ignore=None, config=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is sync and returns immediately; scanning runs in a background thread.

At least one of root or config must be supplied. When both root and config are passed (or config declares [dirsql].root), the explicit root wins and a warning is emitted on stderr.

  • root (str | None): Path to the directory to index. Optional when config supplies one.
  • tables (list[Table] | None): Programmatic table definitions. Appended to any tables in the config file.
  • ignore (list[str] | None): Glob patterns for paths to skip. Appended to any [dirsql].ignore patterns in the config file.
  • config (str | None): Optional path to a .dirsql.toml file. Its [[table]] entries, [dirsql].ignore, and optional [dirsql].root are merged into the constructor's inputs.

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.
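To illustrate what "list of dicts keyed by column name" with hidden tracking columns looks like, here is a small sketch using Python's standard sqlite3 module. It does not use dirsql at all: query_visible is a hypothetical helper, and _dirsql_file is a made-up tracking column name chosen only to match the _dirsql_ prefix.

```python
import sqlite3


def query_visible(conn: sqlite3.Connection, sql: str) -> list[dict]:
    """Run sql and return dict rows without columns prefixed _dirsql_."""
    cursor = conn.execute(sql)
    names = [column[0] for column in cursor.description]
    return [
        {
            name: value
            for name, value in zip(names, row)
            if not name.startswith("_dirsql_")
        }
        for row in cursor.fetchall()
    ]


conn = sqlite3.connect(":memory:")
# _dirsql_file is an invented tracking column used only for this illustration.
conn.execute("CREATE TABLE comments (id TEXT, body TEXT, _dirsql_file TEXT)")
conn.execute(
    "INSERT INTO comments VALUES ('abc', 'looks good', 'comments/abc/index.jsonl')"
)
visible = query_visible(conn, "SELECT * FROM comments")
print(visible)
```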

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.
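An application usually wants to consume events in the background while it keeps doing other work, and to stop cleanly on shutdown. The sketch below shows one common asyncio pattern for that, assuming the iterator does not end on its own. fake_watch is a stand-in for db.watch() so the example runs without dirsql; its events are SimpleNamespace objects rather than real RowEvent instances.

```python
import asyncio
from types import SimpleNamespace


async def fake_watch():
    """Stand-in for db.watch(): yields two events, then idles forever."""
    for name in ("a.json", "b.json"):
        yield SimpleNamespace(action="insert", file_path=name)
    await asyncio.Event().wait()  # never set; simulates a quiet filesystem


async def consume(events, seen):
    """Record the file path of every event until the task is cancelled."""
    async for event in events:
        seen.append(event.file_path)


async def main():
    seen = []
    task = asyncio.create_task(consume(fake_watch(), seen))
    await asyncio.sleep(0.05)  # the application does its other work here
    task.cancel()  # stop watching on shutdown
    try:
        await task
    except asyncio.CancelledError:
        pass  # cancellation is the expected way for the consumer to end
    return seen


seen = asyncio.run(main())
print(seen)
```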

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
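The row-diffing step can be pictured with a small pure-Python sketch. It compares the previous and new row lists for one file position by position; that matching strategy is an assumption made for illustration, since this page does not say how the Rust core pairs old and new rows. diff_rows is a hypothetical name.

```python
def diff_rows(old_rows: list[dict], new_rows: list[dict]) -> list[tuple]:
    """Compare two row lists position by position.

    Returns (action, row, old_row) tuples. Positional matching is an
    assumption of this sketch, not a description of dirsql's algorithm.
    """
    events = []
    for index in range(max(len(old_rows), len(new_rows))):
        old = old_rows[index] if index < len(old_rows) else None
        new = new_rows[index] if index < len(new_rows) else None
        if old is None:
            events.append(("insert", new, None))
        elif new is None:
            events.append(("delete", old, None))
        elif old != new:
            events.append(("update", new, old))
        # identical rows produce no event
    return events


before = [
    {"author": "alice", "body": "looks good"},
    {"author": "bob", "body": "needs work"},
]
after = [{"author": "alice", "body": "looks great"}]
events = diff_rows(before, after)
for event in events:
    print(event)
```

Here the first row changed and the second disappeared, so the sketch reports one update followed by one delete.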

License

MIT

Download files

Download the file for your platform.

Source Distribution

  • dirsql-0.2.6.tar.gz (108.0 kB): Source

Built Distributions

  • dirsql-0.2.6-cp313-cp313-win_amd64.whl (5.2 MB): CPython 3.13, Windows x86-64
  • dirsql-0.2.6-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.13, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.6-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.13, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.6-cp313-cp313-macosx_11_0_arm64.whl (5.5 MB): CPython 3.13, macOS 11.0+ ARM64
  • dirsql-0.2.6-cp313-cp313-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.13, macOS 10.12+ x86-64
  • dirsql-0.2.6-cp312-cp312-win_amd64.whl (5.2 MB): CPython 3.12, Windows x86-64
  • dirsql-0.2.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.6-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.12, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.6-cp312-cp312-macosx_11_0_arm64.whl (5.5 MB): CPython 3.12, macOS 11.0+ ARM64
  • dirsql-0.2.6-cp312-cp312-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.12, macOS 10.12+ x86-64
  • dirsql-0.2.6-cp311-cp311-win_amd64.whl (5.2 MB): CPython 3.11, Windows x86-64
  • dirsql-0.2.6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.6-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.11, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.6-cp311-cp311-macosx_11_0_arm64.whl (5.5 MB): CPython 3.11, macOS 11.0+ ARM64
  • dirsql-0.2.6-cp311-cp311-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.11, macOS 10.12+ x86-64
  • dirsql-0.2.6-cp310-cp310-win_amd64.whl (5.2 MB): CPython 3.10, Windows x86-64
  • dirsql-0.2.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.6-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.10, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.6-cp310-cp310-macosx_11_0_arm64.whl (5.5 MB): CPython 3.10, macOS 11.0+ ARM64
  • dirsql-0.2.6-cp310-cp310-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.10, macOS 10.12+ x86-64

