Ephemeral SQL index over a local directory

dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Publishing is handled by .github/workflows/publish.yml (invoked from minor-release.yml / patch-release.yml). For each target triple, the build job runs cargo build on the Rust CLI with --features cli, stages the resulting binary into python/dirsql/_binary/, and runs maturin build, which picks the binary up via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then trusted-published to PyPI.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())

Multiple Tables and Joins

# Runs inside an async function, with root set up as in Quick Start.
db = DirSQL(
    root,
    tables=[
        Table(
            ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
            glob="posts/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
        Table(
            ddl="CREATE TABLE authors (id TEXT, name TEXT)",
            glob="authors/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
    ],
)
await db.ready()

results = await db.query("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
""")

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.
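
The lambdas used in the examples can also be written as named functions, which makes them easier to test without touching the filesystem. The following is a minimal sketch of an extract function for the JSONL layout from the Quick Start. The name extract_comments and the decision to skip blank lines are illustrative choices, not part of dirsql.

```python
import json
import os


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one JSONL file into rows for the comments table."""
    # The parent directory name becomes the id column.
    comment_id = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines instead of failing in json.loads
        record = json.loads(line)
        rows.append(
            {"id": comment_id, "body": record["body"], "author": record["author"]}
        )
    return rows


sample = (
    '{"body": "looks good", "author": "alice"}\n'
    "\n"
    '{"body": "needs work", "author": "bob"}\n'
)
rows = extract_comments("comments/abc/index.jsonl", sample)
print(rows)
```

A function like this would be passed as extract=extract_comments in the Table definition.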

DirSQL(root=None, *, tables=None, ignore=None, config=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is sync and returns immediately; scanning runs in a background thread.

At least one of root or config must be supplied. When both root and config are passed (or config declares [dirsql].root), the explicit root wins and a warning is emitted on stderr.

  • root (str | None): Path to the directory to index. Optional when config supplies one.
  • tables (list[Table] | None): Programmatic table definitions. Appended to any tables in the config file.
  • ignore (list[str] | None): Glob patterns for paths to skip. Appended to any [dirsql].ignore patterns in the config file.
  • config (str | None): Optional path to a .dirsql.toml file. Its [[table]] entries, [dirsql].ignore, and optional [dirsql].root are merged into the constructor's inputs.
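
The merge rules above can be pictured with plain Python data. This is a sketch, not dirsql's implementation: merge_inputs is a hypothetical helper, the config file is shown as an already-parsed dict, and the keys inside each [[table]] entry are omitted because they are not described here. It also assumes one reading of the warning rule, namely that the warning fires when an explicit root and a config root are both present.

```python
import sys


def merge_inputs(root=None, tables=None, ignore=None, config=None):
    """Hypothetical helper mirroring the documented merge rules."""
    config = config or {}
    section = config.get("dirsql", {})
    config_root = section.get("root")

    if root is None and config_root is None:
        raise ValueError("no root supplied by the caller or the config")
    if root is not None and config_root is not None:
        # Assumed reading of the docs: the explicit root wins, with a warning.
        print("warning: explicit root overrides [dirsql].root", file=sys.stderr)

    return {
        "root": root if root is not None else config_root,
        # Programmatic entries are appended after the config file's entries.
        "tables": list(config.get("table", [])) + list(tables or []),
        "ignore": list(section.get("ignore", [])) + list(ignore or []),
    }


# Parsed form of a config file such as:
#   [dirsql]
#   root = "data"
#   ignore = ["**/.git/**"]
parsed_config = {"dirsql": {"root": "data", "ignore": ["**/.git/**"]}}

merged = merge_inputs(root="other", ignore=["**/drafts/**"], config=parsed_config)
print(merged)
```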

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.
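
The combination of a sync constructor, a background scan, and an awaitable ready() can be sketched with the standard library alone. BackgroundInit and slow_scan below are stand-ins invented for this illustration. They show the general pattern, not how dirsql is actually written.

```python
import asyncio
import threading
import time


class BackgroundInit:
    """Stand-in class: sync constructor, work in a thread, async ready()."""

    def __init__(self, work):
        self._error = None
        self.result = None
        self._thread = threading.Thread(target=self._run, args=(work,), daemon=True)
        self._thread.start()  # returns immediately; the work continues in the background

    def _run(self, work):
        try:
            self.result = work()
        except Exception as exc:
            self._error = exc  # remembered so ready() can re-raise it

    async def ready(self):
        # Joining a finished thread returns at once, so repeated calls are safe.
        await asyncio.to_thread(self._thread.join)
        if self._error is not None:
            raise self._error


def slow_scan():
    time.sleep(0.05)  # pretend to walk a directory
    return 3  # pretend three files were indexed


async def main():
    index = BackgroundInit(slow_scan)
    await index.ready()
    await index.ready()  # second call is a no-op
    return index.result


scanned = asyncio.run(main())
print(scanned)
```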

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.
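
To make the result shape concrete, here is a small sketch using Python's built-in sqlite3 module. It illustrates the documented behaviour (dicts keyed by column name, with _dirsql_* columns left out) and is not dirsql's own code. The helper query_as_dicts and the column name _dirsql_file are made up for the example.

```python
import sqlite3


def query_as_dicts(conn, sql, hidden_prefix="_dirsql_"):
    """Run sql and return rows as dicts, dropping columns with the hidden prefix."""
    cursor = conn.execute(sql)
    names = [column[0] for column in cursor.description]
    return [
        {
            name: value
            for name, value in zip(names, row)
            if not name.startswith(hidden_prefix)
        }
        for row in cursor.fetchall()
    ]


conn = sqlite3.connect(":memory:")
# _dirsql_file is a hypothetical tracking column, used only for this illustration.
conn.execute(
    "CREATE TABLE comments (id TEXT, body TEXT, author TEXT, _dirsql_file TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [
        ("abc", "looks good", "alice", "comments/abc/index.jsonl"),
        ("abc", "needs work", "bob", "comments/abc/index.jsonl"),
    ],
)

rows = query_as_dicts(conn, "SELECT * FROM comments WHERE author = 'alice'")
print(rows)
```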

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.
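
One way to consume these events is to dispatch on action and keep a local copy of each table in step. The sketch below uses a stand-in dataclass, FakeRowEvent, that mirrors the documented fields so the handler can run without a watcher. Both FakeRowEvent and apply_event are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRowEvent:
    """Stand-in with the documented RowEvent fields, for illustration only."""

    table: str
    action: str
    row: Optional[dict] = None
    old_row: Optional[dict] = None
    error: Optional[str] = None
    file_path: Optional[str] = None


def apply_event(mirror: dict, event) -> None:
    """Keep a per-table list of rows in step with incoming events."""
    rows = mirror.setdefault(event.table, [])
    if event.action == "insert":
        rows.append(event.row)
    elif event.action == "update":
        rows[rows.index(event.old_row)] = event.row  # swap old row for new
    elif event.action == "delete":
        rows.remove(event.row)
    elif event.action == "error":
        print(f"error in {event.file_path}: {event.error}")


mirror = {}
events = [
    FakeRowEvent("items", "insert", row={"name": "a"}, file_path="a.json"),
    FakeRowEvent("items", "insert", row={"name": "b"}, file_path="b.json"),
    FakeRowEvent(
        "items", "update", row={"name": "a2"}, old_row={"name": "a"}, file_path="a.json"
    ),
    FakeRowEvent("items", "delete", row={"name": "b"}, file_path="b.json"),
    FakeRowEvent("items", "error", error="invalid JSON", file_path="c.json"),
]
for event in events:
    apply_event(mirror, event)
print(mirror)
```

With a real database, the same handler could be called from inside the async for loop over db.watch().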

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
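
The row-diffing step can be sketched in a few lines of Python. This is only an illustration of the idea. It assumes rows from the old and new versions of a file are paired by position, which may not be how the Rust core matches them, and diff_rows is a hypothetical name.

```python
def diff_rows(old_rows, new_rows):
    """Return (action, row, old_row) tuples; rows are paired by position (an assumption)."""
    events = []
    for index in range(max(len(old_rows), len(new_rows))):
        old = old_rows[index] if index < len(old_rows) else None
        new = new_rows[index] if index < len(new_rows) else None
        if old is None:
            events.append(("insert", new, None))  # row exists only in the new file
        elif new is None:
            events.append(("delete", old, None))  # row exists only in the old file
        elif old != new:
            events.append(("update", new, old))  # same position, different content
    return events


before = [
    {"body": "looks good", "author": "alice"},
    {"body": "needs work", "author": "bob"},
]
after = [
    {"body": "looks great", "author": "alice"},
    {"body": "needs work", "author": "bob"},
    {"body": "agreed", "author": "carol"},
]

changes = diff_rows(before, after)
for change in changes:
    print(change)
```

Unchanged rows produce no event, which is what keeps the output granular: editing one line of a file yields one update rather than a delete and re-insert of every row.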

License

MIT

Download files

Source Distribution

  • dirsql-0.2.5.tar.gz (108.0 kB): Source

Built Distributions

  • dirsql-0.2.5-cp313-cp313-win_amd64.whl (5.2 MB): CPython 3.13, Windows x86-64
  • dirsql-0.2.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.13, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.5-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.13, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.5-cp313-cp313-macosx_11_0_arm64.whl (5.5 MB): CPython 3.13, macOS 11.0+ ARM64
  • dirsql-0.2.5-cp313-cp313-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.13, macOS 10.12+ x86-64
  • dirsql-0.2.5-cp312-cp312-win_amd64.whl (5.2 MB): CPython 3.12, Windows x86-64
  • dirsql-0.2.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.12, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.5-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.12, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.5-cp312-cp312-macosx_11_0_arm64.whl (5.5 MB): CPython 3.12, macOS 11.0+ ARM64
  • dirsql-0.2.5-cp312-cp312-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.12, macOS 10.12+ x86-64
  • dirsql-0.2.5-cp311-cp311-win_amd64.whl (5.2 MB): CPython 3.11, Windows x86-64
  • dirsql-0.2.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.11, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.11, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.5-cp311-cp311-macosx_11_0_arm64.whl (5.5 MB): CPython 3.11, macOS 11.0+ ARM64
  • dirsql-0.2.5-cp311-cp311-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.11, macOS 10.12+ x86-64
  • dirsql-0.2.5-cp310-cp310-win_amd64.whl (5.2 MB): CPython 3.10, Windows x86-64
  • dirsql-0.2.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB): CPython 3.10, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.5-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.10, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.5-cp310-cp310-macosx_11_0_arm64.whl (5.5 MB): CPython 3.10, macOS 11.0+ ARM64
  • dirsql-0.2.5-cp310-cp310-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.10, macOS 10.12+ x86-64