Ephemeral SQL index over a local directory

dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Publishing is handled by .github/workflows/publish.yml (invoked from minor-release.yml / patch-release.yml). For each target triple, the build job builds the Rust CLI with cargo (--features cli), stages the binary into python/dirsql/_binary/, and runs maturin build, which picks the binary up via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then trusted-published to PyPI.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    if line.strip()  # skip blank lines so json.loads never sees an empty string
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())

Multiple Tables and Joins

# Fragment: run inside an async function, with root and the imports from the Quick Start.
db = DirSQL(
    root,
    tables=[
        Table(
            ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
            glob="posts/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
        Table(
            ddl="CREATE TABLE authors (id TEXT, name TEXT)",
            glob="authors/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
    ],
)
await db.ready()

results = await db.query("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
""")

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.
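
For anything beyond a one-liner, a named function can be easier to read and test than a lambda. The sketch below is one way to write the Quick Start extractor as a plain function; the name extract_comments is only illustrative, and it assumes each non-blank line of the file is a JSON object with body and author keys. It has no dependency on dirsql, so it can be unit-tested on its own.

```python
import json
import os


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one JSONL file into row dicts for the `comments` table."""
    # The parent directory name ("abc" in comments/abc/index.jsonl)
    # becomes the row id, mirroring the Quick Start example.
    thread_id = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        rows.append(
            {"id": thread_id, "body": record["body"], "author": record["author"]}
        )
    return rows
```

It would then be passed as extract=extract_comments in place of the lambda.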

DirSQL(root, *, tables, ignore=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is synchronous and returns immediately; scanning runs in a background thread, so call await db.ready() before querying.

  • root (str): Path to the directory to index.
  • tables (list[Table]): Table definitions.
  • ignore (list[str] | None): Glob patterns for paths to skip.

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.
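
To illustrate the result shape only, here is a rough sketch using the standard-library sqlite3 module rather than dirsql itself. The helper query_as_dicts and the column name _dirsql_file are made up for the example; the actual internal column names and implementation are not documented here.

```python
import sqlite3


def query_as_dicts(conn: sqlite3.Connection, sql: str) -> list[dict]:
    """Run a query and return rows as dicts, hiding _dirsql_* columns."""
    cursor = conn.execute(sql)
    if cursor.description is None:  # statement produced no result set
        return []
    columns = [col[0] for col in cursor.description]
    return [
        {name: value for name, value in zip(columns, row)
         if not name.startswith("_dirsql_")}
        for row in cursor.fetchall()
    ]


# Hypothetical table with one tracking column, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id TEXT, author TEXT, _dirsql_file TEXT)")
conn.execute(
    "INSERT INTO comments VALUES ('abc', 'alice', 'comments/abc/index.jsonl')"
)
rows = query_as_dicts(conn, "SELECT * FROM comments")
```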

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.

DirSQL.from_config(path) -> DirSQL

Create a DirSQL instance from a .dirsql.toml config file. Returns immediately; scanning runs in the background. Call await db.ready() before querying.

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
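
The row-diffing step can be pictured with a small pure-Python sketch. This is not the Rust implementation: it assumes rows from the same file are compared by position, which is only one plausible strategy, and diff_rows is an illustrative name.

```python
def diff_rows(old: list[dict], new: list[dict]) -> list[tuple]:
    """Compare rows by position; return (action, row, old_row) tuples."""
    events = []
    for i in range(max(len(old), len(new))):
        if i >= len(old):
            # Row exists only in the new version of the file.
            events.append(("insert", new[i], None))
        elif i >= len(new):
            # Row existed before but is gone now.
            events.append(("delete", old[i], None))
        elif old[i] != new[i]:
            # Same position, different content.
            events.append(("update", new[i], old[i]))
    return events
```

Unchanged rows produce no event, so rewriting a file with identical content yields an empty list under this scheme.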

License

MIT
