Ephemeral SQL index over a local directory

dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Handled by .github/workflows/publish.yml (invoked from minor-release.yml / patch-release.yml). For each target triple, the build job:

  1. Builds the Rust CLI with cargo build --features cli.
  2. Stages the binary into python/dirsql/_binary/.
  3. Runs maturin build, which picks the binary up via the [tool.maturin] include rule in pyproject.toml.

The wheels and sdist are then trusted-published to PyPI.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())
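
The extract lambda in the Quick Start can also be written as a named function, which makes it possible to unit-test the parsing without constructing a DirSQL instance. The sketch below assumes the same (path, content) signature and row shape as above; the name extract_comments and the blank-line handling are illustrative additions, not part of dirsql.

```python
import json
import os


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one index.jsonl file into rows for the comments table."""
    # The parent directory name (e.g. "abc") becomes the row id.
    thread_id = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        rows.append(
            {"id": thread_id, "body": record["body"], "author": record["author"]}
        )
    return rows


sample = (
    '{"body": "looks good", "author": "alice"}\n'
    "\n"
    '{"body": "needs work", "author": "bob"}\n'
)
rows = extract_comments(os.path.join("comments", "abc", "index.jsonl"), sample)
print(rows)
```

Such a function would then be passed as extract=extract_comments in place of the lambda.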

Multiple Tables and Joins

import asyncio
import json
from dirsql import DirSQL, Table

async def main(root):
    # Each JSON file holds one object, which becomes one row
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
                glob="posts/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
            Table(
                ddl="CREATE TABLE authors (id TEXT, name TEXT)",
                glob="authors/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Join across the two tables
    results = await db.query("""
        SELECT posts.title, authors.name
        FROM posts JOIN authors ON posts.author_id = authors.id
    """)
    print(results)

asyncio.run(main("/path/to/data"))

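Because the index is an in-memory SQLite database, the shape of the join result can be previewed with the standard-library sqlite3 module alone. This is only a sketch of the SQL semantics, not dirsql's own code path; the sample rows are made up to stand in for the contents of posts/*.json and authors/*.json.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.execute("CREATE TABLE posts (title TEXT, author_id TEXT)")
conn.execute("CREATE TABLE authors (id TEXT, name TEXT)")

# Hypothetical sample rows, one per JSON file.
conn.execute("INSERT INTO posts VALUES ('Hello world', 'a1')")
conn.execute("INSERT INTO authors VALUES ('a1', 'Alice')")

cursor = conn.execute(
    """
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
    """
)
results = [dict(row) for row in cursor]
print(results)
```

Each result is a dict keyed by column name, matching the list-of-dicts shape that query() is documented to return.
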
Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.

DirSQL(root=None, *, tables=None, ignore=None, config=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is sync and returns immediately; scanning runs in a background thread.

At least one of root or config must be supplied. If root is passed explicitly and config also declares [dirsql].root, the explicit root wins and a warning is emitted on stderr.

  • root (str | None): Path to the directory to index. Optional when config supplies one.
  • tables (list[Table] | None): Programmatic table definitions. Appended to any tables in the config file.
  • ignore (list[str] | None): Glob patterns for paths to skip. Appended to any [dirsql].ignore patterns in the config file.
  • config (str | None): Optional path to a .dirsql.toml file. Its [[table]] entries, [dirsql].ignore, and optional [dirsql].root are merged into the constructor's inputs.

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.
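
The pattern behind a sync constructor plus an awaitable ready() can be sketched with a background thread and asyncio.to_thread. The class below is a stand-in written for illustration (BackgroundInit is a hypothetical name, and dirsql's internals may differ); it shows why repeated calls are safe and how an error from the background work resurfaces in the caller.

```python
import asyncio
import threading


class BackgroundInit:
    """Starts work in a thread at construction; ready() waits for it."""

    def __init__(self, work):
        self._error = None
        self._done = threading.Event()
        threading.Thread(target=self._run, args=(work,), daemon=True).start()

    def _run(self, work):
        try:
            work()
        except Exception as exc:  # remember the failure for ready()
            self._error = exc
        finally:
            self._done.set()

    async def ready(self):
        # Wait off the event loop; returns at once if already finished.
        await asyncio.to_thread(self._done.wait)
        if self._error is not None:
            raise self._error


def failing_scan():
    raise ValueError("bad extract")


async def demo():
    ok = BackgroundInit(lambda: None)
    await ok.ready()
    await ok.ready()  # second call is a no-op
    try:
        await BackgroundInit(failing_scan).ready()
    except ValueError as exc:
        return str(exc)
    return None


outcome = asyncio.run(demo())
print(outcome)
```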

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.
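
One way to picture the filtering of internal columns is a small wrapper over sqlite3 that builds dicts from the cursor's column names and drops any that start with the _dirsql_ prefix. This is a sketch under assumptions: query_visible is a hypothetical helper, and _dirsql_file is a made-up column name used only to have something to hide.

```python
import sqlite3

INTERNAL_PREFIX = "_dirsql_"


def query_visible(conn, sql):
    """Run a query and return dict rows without internal tracking columns."""
    cursor = conn.execute(sql)
    names = [col[0] for col in cursor.description]
    return [
        {
            name: value
            for name, value in zip(names, row)
            if not name.startswith(INTERNAL_PREFIX)
        }
        for row in cursor
    ]


conn = sqlite3.connect(":memory:")
# "_dirsql_file" is an invented tracking column for illustration only.
conn.execute("CREATE TABLE items (name TEXT, _dirsql_file TEXT)")
conn.execute("INSERT INTO items VALUES ('widget', 'items/widget.json')")

visible = query_visible(conn, "SELECT * FROM items")
print(visible)
```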

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.
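
A handler for these events usually branches on action and reads only the fields that are populated for that action. The sketch below uses a stand-in dataclass, FakeRowEvent, that mirrors the documented fields so the handler can be exercised without a running watcher; both FakeRowEvent and describe are hypothetical names, not part of dirsql.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRowEvent:
    """Stand-in with the same fields as the documented RowEvent."""
    table: str
    action: str
    row: Optional[dict] = None
    old_row: Optional[dict] = None
    error: Optional[str] = None
    file_path: Optional[str] = None


def describe(event) -> str:
    """Format one event as a single log line."""
    if event.action == "insert":
        return f"+ {event.table}: {event.row}"
    if event.action == "update":
        return f"~ {event.table}: {event.old_row} -> {event.row}"
    if event.action == "delete":
        return f"- {event.table}: {event.row}"
    # Remaining documented action is "error".
    return f"! {event.file_path}: {event.error}"


events = [
    FakeRowEvent("items", "insert", row={"name": "a"}),
    FakeRowEvent("items", "update", row={"name": "b"}, old_row={"name": "a"}),
    FakeRowEvent("items", "delete", row={"name": "b"}),
    FakeRowEvent("items", "error", error="invalid JSON", file_path="x.json"),
]
lines = [describe(e) for e in events]
print("\n".join(lines))
```

Inside an async for event in db.watch() loop, the same describe(event) call would apply to real events, since it only reads the documented attributes.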

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
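
The row-diffing step can be illustrated in a few lines of Python. How dirsql pairs old and new rows is not described here, so this sketch assumes the simplest strategy, pairing rows by their position within the file; diff_rows is a hypothetical helper and the real Rust implementation may work differently.

```python
def diff_rows(table, old_rows, new_rows):
    """Compare a file's previous rows with its new rows, position by position."""
    events = []
    # Rows present in both versions: emit an update only if they differ.
    for old, new in zip(old_rows, new_rows):
        if old != new:
            events.append(
                {"table": table, "action": "update", "row": new, "old_row": old}
            )
    # Extra rows at the end of the new version are inserts.
    for new in new_rows[len(old_rows):]:
        events.append({"table": table, "action": "insert", "row": new})
    # Rows that only existed in the old version are deletes.
    for old in old_rows[len(new_rows):]:
        events.append({"table": table, "action": "delete", "row": old})
    return events


before = [{"name": "a"}, {"name": "b"}]
after = [{"name": "a"}, {"name": "c"}, {"name": "d"}]
changes = diff_rows("items", before, after)
removed = diff_rows("items", [{"name": "a"}], [])
print(changes)
print(removed)
```

The unchanged first row produces no event, which is the point of diffing at row level rather than reloading the whole file.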

License

MIT

Download files

Source Distribution

  • dirsql-0.2.7.tar.gz (214.7 kB): Source

Built Distributions

  • dirsql-0.2.7-cp313-cp313-win_amd64.whl (5.2 MB): CPython 3.13, Windows x86-64
  • dirsql-0.2.7-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.13, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.7-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.13, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.7-cp313-cp313-macosx_11_0_arm64.whl (5.5 MB): CPython 3.13, macOS 11.0+ ARM64
  • dirsql-0.2.7-cp313-cp313-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.13, macOS 10.12+ x86-64
  • dirsql-0.2.7-cp312-cp312-win_amd64.whl (5.2 MB): CPython 3.12, Windows x86-64
  • dirsql-0.2.7-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.7-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.12, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.7-cp312-cp312-macosx_11_0_arm64.whl (5.5 MB): CPython 3.12, macOS 11.0+ ARM64
  • dirsql-0.2.7-cp312-cp312-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.12, macOS 10.12+ x86-64
  • dirsql-0.2.7-cp311-cp311-win_amd64.whl (5.2 MB): CPython 3.11, Windows x86-64
  • dirsql-0.2.7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.7-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.11, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.7-cp311-cp311-macosx_11_0_arm64.whl (5.5 MB): CPython 3.11, macOS 11.0+ ARM64
  • dirsql-0.2.7-cp311-cp311-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.11, macOS 10.12+ x86-64
  • dirsql-0.2.7-cp310-cp310-win_amd64.whl (5.2 MB): CPython 3.10, Windows x86-64
  • dirsql-0.2.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • dirsql-0.2.7-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.10, manylinux: glibc 2.17+ ARM64
  • dirsql-0.2.7-cp310-cp310-macosx_11_0_arm64.whl (5.5 MB): CPython 3.10, macOS 11.0+ ARM64
  • dirsql-0.2.7-cp310-cp310-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.10, macOS 10.12+ x86-64

