Ephemeral SQL index over a local directory

dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Publishing is handled by .github/workflows/publish.yml, which is invoked from minor-release.yml and patch-release.yml. For each target triple, the build job runs cargo build on the Rust CLI with --features cli, stages the binary into python/dirsql/_binary/, and runs maturin build, which picks the binary up via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then published to PyPI via Trusted Publishing.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())

Multiple Tables and Joins

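# Fragment: runs inside an async function, with root, json, DirSQL, and Table set up as in Quick Start.
db = DirSQL(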
    root,
    tables=[
        Table(
            ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
            glob="posts/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
        Table(
            ddl="CREATE TABLE authors (id TEXT, name TEXT)",
            glob="authors/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
    ],
)
await db.ready()

results = await db.query("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
""")

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        if event.action == "error":
            # Error events carry a message instead of a row
            print(f"error in {event.file_path}: {event.error}")
        else:
            print(f"{event.action} on {event.table}: {event.row}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.
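
As a sketch of the extract contract described above, the following standalone function turns one JSONL file into row dicts. The name extract_comments and the decision to skip blank lines are assumptions made for illustration; the only thing taken from the API is the (relative_path, file_content) -> list[dict] shape.

```python
import json
import os


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one JSONL file into rows for an (id, body, author) table."""
    # Use the parent directory name as the row id,
    # e.g. "comments/abc/index.jsonl" -> "abc".
    thread_id = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # tolerate blank lines, which json.loads would reject
        record = json.loads(line)
        rows.append(
            {"id": thread_id, "body": record["body"], "author": record["author"]}
        )
    return rows


sample = (
    '{"body": "looks good", "author": "alice"}\n'
    "\n"
    '{"body": "needs work", "author": "bob"}\n'
)
print(extract_comments("comments/abc/index.jsonl", sample))
```

A named function like this can be passed as extract= in place of the lambda used in Quick Start, which makes it easier to unit test without touching the filesystem.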

DirSQL(root, *, tables, ignore=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is sync and returns immediately; scanning runs in a background thread.

  • root (str): Path to the directory to index.
  • tables (list[Table]): Table definitions.
  • ignore (list[str] | None): Glob patterns for paths to skip.
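
The "sync constructor, background scan, awaitable ready()" shape can be illustrated with the standard library alone. This is a minimal sketch of the pattern, not dirsql's implementation: BackgroundIndex and slow_scan are hypothetical names, and the real scan runs in the Rust core.

```python
import asyncio
import threading
import time


class BackgroundIndex:
    """Hypothetical stand-in: starts work in a thread, exposes an awaitable ready()."""

    def __init__(self, scan):
        self._done = threading.Event()
        self._error = None
        self.rows = []
        # The constructor returns right away; the scan runs on a daemon thread.
        self._thread = threading.Thread(target=self._run, args=(scan,), daemon=True)
        self._thread.start()

    def _run(self, scan):
        try:
            self.rows = scan()
        except Exception as exc:  # remember the failure so ready() can re-raise it
            self._error = exc
        finally:
            self._done.set()

    async def ready(self):
        # Wait on the Event in a worker thread so the event loop is not blocked.
        await asyncio.to_thread(self._done.wait)
        if self._error is not None:
            raise self._error


def slow_scan():
    time.sleep(0.05)  # pretend to walk a directory
    return [{"name": "a"}, {"name": "b"}]


async def main():
    index = BackgroundIndex(slow_scan)
    await index.ready()
    await index.ready()  # idempotent: the second call returns immediately
    return index.rows


print(asyncio.run(main()))
```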

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.
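
To make the result shape concrete, here is a small sketch using Python's built-in sqlite3 module that returns rows as dicts keyed by column name and drops columns with a reserved prefix. The helper query_as_dicts and the column name _dirsql_file are hypothetical; only the _dirsql_ prefix comes from the description above.

```python
import sqlite3


def query_as_dicts(
    conn: sqlite3.Connection, sql: str, hidden_prefix: str = "_dirsql_"
) -> list[dict]:
    """Run a SELECT and return dict rows, hiding columns that start with hidden_prefix."""
    cursor = conn.execute(sql)
    # cursor.description holds one 7-tuple per result column; item 0 is the name.
    columns = [description[0] for description in cursor.description]
    return [
        {
            name: value
            for name, value in zip(columns, row)
            if not name.startswith(hidden_prefix)
        }
        for row in cursor.fetchall()
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, _dirsql_file TEXT)")
conn.execute("INSERT INTO items VALUES ('widget', 'items/widget.json')")
print(query_as_dicts(conn, "SELECT * FROM items"))
```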

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.
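
One way an async iterator can defer its work until first iteration is an async generator that starts a producer thread inside its body. The sketch below shows that general pattern under stated assumptions: watch_events and fake_changes are hypothetical names, the producer here is finite, and dirsql's own watcher is implemented in Rust and may work differently.

```python
import asyncio
import threading


async def watch_events(produce):
    """Async generator; the producer thread starts only when iteration begins."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    sentinel = object()

    def worker():
        # Hand each event to the event loop thread in a thread-safe way.
        for event in produce():
            loop.call_soon_threadsafe(queue.put_nowait, event)
        loop.call_soon_threadsafe(queue.put_nowait, sentinel)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        event = await queue.get()
        if event is sentinel:
            break
        yield event


def fake_changes():
    # Stand-in for a blocking filesystem watcher.
    yield from ["insert", "update", "delete"]


async def main():
    return [event async for event in watch_events(fake_changes)]


print(asyncio.run(main()))
```

Because the generator body does not run until the first async for step, creating the iterator costs nothing if it is never consumed.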

DirSQL.from_config(path) -> DirSQL

Create a DirSQL instance from a .dirsql.toml config file. Returns immediately; scanning runs in the background. Call await db.ready() before querying.

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.
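
The sketch below shows one way to dispatch on the action field. RowEventStub is a hypothetical dataclass that mirrors the documented fields so the example runs without the native extension; describe only reads attributes, so it should work equally on objects yielded by watch(), assuming they expose the fields listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowEventStub:
    """Hypothetical stand-in mirroring the documented RowEvent fields."""

    table: str
    action: str
    row: Optional[dict] = None
    old_row: Optional[dict] = None
    error: Optional[str] = None
    file_path: Optional[str] = None


def describe(event) -> str:
    """Render a one-line summary of a row-level change event."""
    if event.action == "insert":
        return f"+ {event.table}: {event.row}"
    if event.action == "update":
        # Show only the columns whose value differs from the previous row.
        changed = {k: v for k, v in event.row.items() if event.old_row.get(k) != v}
        return f"~ {event.table}: {changed}"
    if event.action == "delete":
        return f"- {event.table}: {event.row}"
    if event.action == "error":
        return f"! {event.file_path}: {event.error}"
    raise ValueError(f"unknown action: {event.action}")


update = RowEventStub(
    table="items",
    action="update",
    row={"name": "b", "qty": 1},
    old_row={"name": "a", "qty": 1},
)
print(describe(update))
```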

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS, ReadDirectoryChangesW on Windows) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
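
The row diffing step can be sketched in a few lines of Python. How dirsql pairs old and new rows is not specified here; this sketch assumes rows are matched by their position within the file, and diff_rows is a hypothetical helper rather than part of the library.

```python
def diff_rows(old_rows: list[dict], new_rows: list[dict]) -> list[tuple]:
    """Compare two row lists by position; return (action, row, old_row) tuples."""
    events = []
    for index in range(max(len(old_rows), len(new_rows))):
        old = old_rows[index] if index < len(old_rows) else None
        new = new_rows[index] if index < len(new_rows) else None
        if old is None:
            events.append(("insert", new, None))  # row only in the new version
        elif new is None:
            events.append(("delete", old, None))  # row only in the old version
        elif old != new:
            events.append(("update", new, old))  # same position, different content
        # identical rows produce no event
    return events


before = [{"author": "alice", "body": "looks good"}, {"author": "bob", "body": "needs work"}]
after = [{"author": "alice", "body": "looks great"}]
for event in diff_rows(before, after):
    print(event)
```

Position-based matching is the simplest choice but reports a shifted row as an update; a key-based diff would need a stable identifier in each row.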

License

MIT

Download files

Source Distribution

  • dirsql-0.1.11.tar.gz (103.2 kB): Source

Built Distributions

  • dirsql-0.1.11-cp313-cp313-win_amd64.whl (5.1 MB): CPython 3.13, Windows x86-64
  • dirsql-0.1.11-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.13, manylinux: glibc 2.17+ x86-64
  • dirsql-0.1.11-cp313-cp313-macosx_11_0_arm64.whl (5.5 MB): CPython 3.13, macOS 11.0+ ARM64
  • dirsql-0.1.11-cp313-cp313-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.13, macOS 10.12+ x86-64
  • dirsql-0.1.11-cp312-cp312-win_amd64.whl (5.1 MB): CPython 3.12, Windows x86-64
  • dirsql-0.1.11-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • dirsql-0.1.11-cp312-cp312-macosx_11_0_arm64.whl (5.5 MB): CPython 3.12, macOS 11.0+ ARM64
  • dirsql-0.1.11-cp312-cp312-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.12, macOS 10.12+ x86-64
  • dirsql-0.1.11-cp311-cp311-win_amd64.whl (5.1 MB): CPython 3.11, Windows x86-64
  • dirsql-0.1.11-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • dirsql-0.1.11-cp311-cp311-macosx_11_0_arm64.whl (5.5 MB): CPython 3.11, macOS 11.0+ ARM64
  • dirsql-0.1.11-cp311-cp311-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.11, macOS 10.12+ x86-64
  • dirsql-0.1.11-cp310-cp310-win_amd64.whl (5.1 MB): CPython 3.10, Windows x86-64
  • dirsql-0.1.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • dirsql-0.1.11-cp310-cp310-macosx_11_0_arm64.whl (5.5 MB): CPython 3.10, macOS 11.0+ ARM64
  • dirsql-0.1.11-cp310-cp310-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.10, macOS 10.12+ x86-64
