Ephemeral SQL index over a local directory

Project description

dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Handled by .github/workflows/publish.yml (invoked from minor-release.yml / patch-release.yml). For each target triple, the build job runs cargo build for the Rust CLI with --features cli, stages the binary into dirsql/_binary/, and runs maturin build, which picks up the binary via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then published to PyPI via trusted publishing.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in open(path, encoding="utf-8").read().splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())

Multiple Tables and Joins

import asyncio
import json
from dirsql import DirSQL, Table

def read_json(path):
    # Each file holds one JSON object, which becomes one row
    with open(path, encoding="utf-8") as f:
        return [json.load(f)]

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
                glob="posts/*.json",
                extract=read_json,
            ),
            Table(
                ddl="CREATE TABLE authors (id TEXT, name TEXT)",
                glob="authors/*.json",
                extract=read_json,
            ),
        ],
    )
    await db.ready()

    # Join rows that came from two different sets of files
    results = await db.query("""
        SELECT posts.title, authors.name
        FROM posts JOIN authors ON posts.author_id = authors.id
    """)

asyncio.run(main())

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path: [json.loads(open(path, encoding="utf-8").read())],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str], list[dict]]): A function receiving the matched file's absolute filesystem path and returning a list of row dicts. dirsql does not read file contents; a callback that needs the file body reads it itself (e.g. open(path, encoding="utf-8").read()). Each dict's keys must match the DDL column names.
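
An extract callback does not have to be a lambda. The sketch below shows a named function with the same shape (path in, list of row dicts out) for CSV files. The function name extract_csv and the choice of CSV are illustrative assumptions, not something dirsql ships.

```python
import csv


def extract_csv(path: str) -> list[dict]:
    """Read a CSV file with a header row and return one dict per data row.

    The header names become the dict keys, so they need to match the
    column names declared in the table's DDL.
    """
    with open(path, encoding="utf-8", newline="") as f:
        return [dict(row) for row in csv.DictReader(f)]
```

It could then be passed as extract=extract_csv to a Table whose DDL declares the same columns as the CSV header. Note that csv.DictReader returns every value as a string.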

DirSQL(root=None, *, tables=None, ignore=None, config=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is synchronous and returns immediately; the initial scan runs in a background thread.

At least one of root or config must be supplied. When both root and config are passed (or config declares [dirsql].root), the explicit root wins and a warning is emitted on stderr.

  • root (str | None): Path to the directory to index. Optional when config supplies one.
  • tables (list[Table] | None): Programmatic table definitions. Appended to any tables in the config file.
  • ignore (list[str] | None): Glob patterns for paths to skip. Appended to any [dirsql].ignore patterns in the config file.
  • config (str | None): Optional path to a .dirsql.toml file. Its [[table]] entries, [dirsql].ignore, and optional [dirsql].root are merged into the constructor's inputs.
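
The merge rules above can be modelled in a few lines of plain Python. This is a sketch of the documented behavior, not dirsql's implementation: merge_inputs is a hypothetical helper, it takes an already-parsed config dict instead of a file path, the ValueError is an assumed exception type, and it reads the precedence rule as "warn when an explicit root overrides a root declared in the config".

```python
import sys


def merge_inputs(root=None, tables=None, ignore=None, config=None):
    """Model how constructor arguments combine with a parsed config dict."""
    config = config or {}
    section = config.get("dirsql", {})
    config_root = section.get("root")

    if root is None and config_root is None:
        # Assumption: the real exception type is not documented here.
        raise ValueError("supply root, or a config that declares [dirsql].root")

    if root is not None and config_root is not None:
        # The explicit argument wins, and the override is reported on stderr.
        print("warning: explicit root overrides [dirsql].root", file=sys.stderr)

    return {
        "root": root if root is not None else config_root,
        # Config entries come first; programmatic entries are appended.
        "tables": list(config.get("table", [])) + list(tables or []),
        "ignore": list(section.get("ignore", [])) + list(ignore or []),
    }
```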

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.
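
The combination of a synchronous constructor, a background scan, and an awaitable ready() can be illustrated with the standard library alone. The class below is a hypothetical stand-in written for this explanation, not dirsql's code; it only shows why repeated ready() calls are safe and how an error from the background work resurfaces.

```python
import asyncio
import threading


class BackgroundInit:
    """Start work in a thread at construction; expose an awaitable ready()."""

    def __init__(self, work):
        self.result = None
        self._error = None
        self._done = threading.Event()
        # The constructor returns immediately; the work runs in the background.
        threading.Thread(target=self._run, args=(work,), daemon=True).start()

    def _run(self, work):
        try:
            self.result = work()
        except Exception as exc:
            # Keep the failure so ready() can re-raise it later.
            self._error = exc
        finally:
            self._done.set()

    async def ready(self):
        # Wait off the event loop; once the event is set this returns at once,
        # which is what makes repeated calls harmless.
        await asyncio.to_thread(self._done.wait)
        if self._error is not None:
            raise self._error
```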

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.
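
To make the result shape concrete, here is a sketch using Python's built-in sqlite3 module that turns rows into dicts and drops columns with the _dirsql_ prefix. The helper query_as_dicts and the column name _dirsql_file are hypothetical; the actual tracking columns are internal to dirsql.

```python
import sqlite3


def query_as_dicts(conn: sqlite3.Connection, sql: str) -> list[dict]:
    """Run a SELECT and return dicts keyed by column name, hiding _dirsql_* columns."""
    cursor = conn.execute(sql)
    names = [column[0] for column in cursor.description]
    return [
        {
            name: value
            for name, value in zip(names, row)
            if not name.startswith("_dirsql_")
        }
        for row in cursor.fetchall()
    ]


conn = sqlite3.connect(":memory:")
# _dirsql_file is a made-up tracking column used only for this illustration.
conn.execute("CREATE TABLE comments (id TEXT, body TEXT, _dirsql_file TEXT)")
conn.execute(
    "INSERT INTO comments VALUES ('abc', 'looks good', 'comments/abc/index.jsonl')"
)
rows = query_as_dicts(conn, "SELECT * FROM comments")
print(rows)  # [{'id': 'abc', 'body': 'looks good'}]
```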

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.
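
Starting work "on first iteration" is the normal behavior of a Python async generator: calling it only creates the iterator, and the body runs once iteration begins. The sketch below demonstrates that with a hypothetical LazyWatcher class that replays a fixed list of events; it does not touch the filesystem and is not how dirsql is implemented.

```python
import asyncio


class LazyWatcher:
    """Replay prepared events, recording when iteration actually starts."""

    def __init__(self, pending):
        self._pending = list(pending)
        self.started = False

    async def watch(self):
        # Nothing in this body runs until the first step of `async for`.
        self.started = True
        for event in self._pending:
            await asyncio.sleep(0)  # hand control back to the event loop
            yield event


async def demo():
    watcher = LazyWatcher(["first", "second"])
    stream = watcher.watch()           # creates the iterator only
    started_before = watcher.started   # still False at this point
    seen = [event async for event in stream]
    return started_before, watcher.started, seen
```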

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
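
The row-diffing step can be sketched as a comparison of two row lists for the same file. The version below pairs rows by position, which is an assumption made for illustration; the source does not say how dirsql matches old rows to new ones, and the function name diff_rows is hypothetical.

```python
from itertools import zip_longest


def diff_rows(old_rows: list[dict], new_rows: list[dict]) -> list[dict]:
    """Compare two row lists position by position and describe the changes."""
    events = []
    for old, new in zip_longest(old_rows, new_rows):
        if old is None:
            # The new list is longer: this row did not exist before.
            events.append({"action": "insert", "row": new})
        elif new is None:
            # The old list is longer: this row is gone.
            events.append({"action": "delete", "row": old})
        elif old != new:
            events.append({"action": "update", "row": new, "old_row": old})
        # Identical rows produce no event.
    return events
```

Positional pairing keeps the sketch short, but it reports a run of updates when a row is inserted in the middle of a file; matching on a key column would avoid that at the cost of requiring one.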

License

MIT

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dirsql-0.3.7.tar.gz (241.7 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • dirsql-0.3.7-cp313-cp313-win_amd64.whl (5.0 MB): CPython 3.13, Windows x86-64
  • dirsql-0.3.7-cp313-cp313-manylinux_2_34_x86_64.whl (6.0 MB): CPython 3.13, manylinux: glibc 2.34+ x86-64
  • dirsql-0.3.7-cp313-cp313-manylinux_2_34_aarch64.whl (5.9 MB): CPython 3.13, manylinux: glibc 2.34+ ARM64
  • dirsql-0.3.7-cp313-cp313-macosx_11_0_arm64.whl (5.2 MB): CPython 3.13, macOS 11.0+ ARM64
  • dirsql-0.3.7-cp313-cp313-macosx_10_12_x86_64.whl (5.5 MB): CPython 3.13, macOS 10.12+ x86-64
  • dirsql-0.3.7-cp312-cp312-win_amd64.whl (5.0 MB): CPython 3.12, Windows x86-64
  • dirsql-0.3.7-cp312-cp312-manylinux_2_34_x86_64.whl (6.0 MB): CPython 3.12, manylinux: glibc 2.34+ x86-64
  • dirsql-0.3.7-cp312-cp312-manylinux_2_34_aarch64.whl (5.9 MB): CPython 3.12, manylinux: glibc 2.34+ ARM64
  • dirsql-0.3.7-cp312-cp312-macosx_11_0_arm64.whl (5.2 MB): CPython 3.12, macOS 11.0+ ARM64
  • dirsql-0.3.7-cp312-cp312-macosx_10_12_x86_64.whl (5.5 MB): CPython 3.12, macOS 10.12+ x86-64
  • dirsql-0.3.7-cp311-cp311-win_amd64.whl (5.0 MB): CPython 3.11, Windows x86-64
  • dirsql-0.3.7-cp311-cp311-manylinux_2_34_x86_64.whl (6.0 MB): CPython 3.11, manylinux: glibc 2.34+ x86-64
  • dirsql-0.3.7-cp311-cp311-manylinux_2_34_aarch64.whl (5.9 MB): CPython 3.11, manylinux: glibc 2.34+ ARM64
  • dirsql-0.3.7-cp311-cp311-macosx_11_0_arm64.whl (5.2 MB): CPython 3.11, macOS 11.0+ ARM64
  • dirsql-0.3.7-cp311-cp311-macosx_10_12_x86_64.whl (5.5 MB): CPython 3.11, macOS 10.12+ x86-64

File details

Details for the file dirsql-0.3.7.tar.gz.

File metadata

  • Download URL: dirsql-0.3.7.tar.gz
  • Upload date:
  • Size: 241.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dirsql-0.3.7.tar.gz
Algorithm Hash digest
SHA256 38ff4e0893c5fb9df07a93ca5ef343d8b93cbb05060504dc93372303cc85f2ce
MD5 cca4d62b99a9c38da6b36a2066d61f95
BLAKE2b-256 ef9fb2c02e49fdec7bf7eb2e6d19e0669c0b9d452285e765c572f191cb740c0e

See more details on using hashes here.

Provenance

The following attestation bundles were made for dirsql-0.3.7.tar.gz:

Publisher: release.yml on thekevinscott/dirsql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dirsql-0.3.7-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: dirsql-0.3.7-cp313-cp313-win_amd64.whl
  • Upload date:
  • Size: 5.0 MB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dirsql-0.3.7-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 865c282e096909c363e43dc2e760fd8146e51a9a3d7d8f9167a773ed34358e0d
MD5 97cb427573a2226fdba954543f25a7ed
BLAKE2b-256 68173d1312911ac8463ae95b65d0b0aa6e7304aee717369318ee9f17c9420d50

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp313-cp313-win_amd64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp313-cp313-manylinux_2_34_x86_64.whl.

File hashes

Hashes for dirsql-0.3.7-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 6f64393a72ba1490089341f3ed997ea011a0389f767ea25d6150fa70c89a6378
MD5 7984ea10b70035c016395cdaeca984ce
BLAKE2b-256 533616777d5024f69fd7a4c7011ba405a067b190d9c0707a805e9ad5b3b91006

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp313-cp313-manylinux_2_34_x86_64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp313-cp313-manylinux_2_34_aarch64.whl.

File hashes

Hashes for dirsql-0.3.7-cp313-cp313-manylinux_2_34_aarch64.whl
Algorithm Hash digest
SHA256 36a8976ca04bb339344e9be978ee80c86a722391e3efe1083005fb5a239614a5
MD5 9a729a10bf7d6bc6e105ef97d630bee7
BLAKE2b-256 d8ba8e31160ab461ad49e2562b329d505c5e8b897bf1822b401f9f4495e918c8

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp313-cp313-manylinux_2_34_aarch64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for dirsql-0.3.7-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9f62a0e6beb1eb5987d95363111062e2a94d62cd11402000eb65d9e83200c5ff
MD5 98a6b0c1bf0a9ca189cc32b96658854c
BLAKE2b-256 420ae2ef13c7058c550fca4b7dcf5a2556864d9584492afed007f69eadfd430a

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp313-cp313-macosx_10_12_x86_64.whl.

File hashes

Hashes for dirsql-0.3.7-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 57cb2e9525c41fdeddad994b914174a8adb83ff53e79676b98fe1775882da7b8
MD5 f0daa34ca9d7d5b238c14e9368fcc633
BLAKE2b-256 6d183c780c2f6daa8d1655ee243c78fccb819650f6cb2ecca1ebe97284674b1f

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp313-cp313-macosx_10_12_x86_64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: dirsql-0.3.7-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 5.0 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dirsql-0.3.7-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 58e1d1c058f935da10f8f73e0b318f9758c23e964391624f75afd2e66590401d
MD5 589e23a338910a57fb90a6b57d0a40de
BLAKE2b-256 62b0ded8e9be3bcf266651b58ab50182a5360c5b5f7dda3367de8c12c9fdc7a6

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp312-cp312-win_amd64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp312-cp312-manylinux_2_34_x86_64.whl.

File hashes

Hashes for dirsql-0.3.7-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 f6b74e2519ece2698a55120533127f0affa6b461a8e470ccf929cee10689b559
MD5 acb501c44a5682290fb1579de2a25ad2
BLAKE2b-256 0ad55abeb99a00866c57abd868095bc1e9dcadc7943f7c69687b54912d3f52fe

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp312-cp312-manylinux_2_34_x86_64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp312-cp312-manylinux_2_34_aarch64.whl.

File hashes

Hashes for dirsql-0.3.7-cp312-cp312-manylinux_2_34_aarch64.whl
Algorithm Hash digest
SHA256 95e9c403ea16ae8185fe45ba117f1b96970eccb7dad38729e40c3c8c0bad1ad9
MD5 27c628d73cd01691ba4774bfaa843878
BLAKE2b-256 fd0fd23bb61b8bf9b1f6cd427bde3121986949a3a4c348d2c85ef237ec624afd

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp312-cp312-manylinux_2_34_aarch64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for dirsql-0.3.7-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8a76452353d320c988154fd93c3873ef990b9e2653d6e2cc2a354a070bbb8298
MD5 0a9ed1c57d6f879799ce4be6e8cbc55f
BLAKE2b-256 c0a569504f7fe6b33811e9573ac137012709b98af99acb85ee50562c141a9e69

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp312-cp312-macosx_10_12_x86_64.whl.

File hashes

Hashes for dirsql-0.3.7-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 9b95060104c9fdc000922c27be131a4f274921e98b681f5064cbb5e270931030
MD5 c4271a73107ad86834d0892001cdf2c6
BLAKE2b-256 43afc6b2c2ae35344dc2f2f2185819b71e8459c81d328f9e5379448a0e6ec5ed

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp312-cp312-macosx_10_12_x86_64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: dirsql-0.3.7-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 5.0 MB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dirsql-0.3.7-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 6e41de144d7ad75837cf354e90a2beaab48b68e098e014c4b77c0980a0001cbc
MD5 39567dcf37a31516f341a4622fb86c38
BLAKE2b-256 26087a795036dc5db069bc1ef0e32a3bd5dc55dad636cfe6ff9cb57c2991e289

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp311-cp311-win_amd64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp311-cp311-manylinux_2_34_x86_64.whl.

File hashes

Hashes for dirsql-0.3.7-cp311-cp311-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 9fc1eb6b8fe44d3e9a776d52a86d01c384e4ecc1d1a1def997a74454e38db8df
MD5 208e7045a1b3bc1cbedf3f2481f30f07
BLAKE2b-256 7726cf61ff42c5e4de6efae4d3d570348e3d2fdd1d73ffab5e36c4209dee8a7f

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp311-cp311-manylinux_2_34_x86_64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp311-cp311-manylinux_2_34_aarch64.whl.

File hashes

Hashes for dirsql-0.3.7-cp311-cp311-manylinux_2_34_aarch64.whl
Algorithm Hash digest
SHA256 ef1d164635fa003ed4d41136735d61c90d315ca3f4b420667be7a189c422682d
MD5 168d0dcea199fdd12dd77a10cd9e6afc
BLAKE2b-256 0b2280331c7ec42894c36e6606c87fda570a8dc6052848335f34e0d4b4ddb52b

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp311-cp311-manylinux_2_34_aarch64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for dirsql-0.3.7-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1bf05bf372cdb6442bcd12e9ec77e39fb61d2055d53d737ef1e6808bc13a9502
MD5 93ac0d5f7d7060a08bd8cd5871e5fd65
BLAKE2b-256 fdbbf6c635f0454c26a0e31f21e00013cec7ae6ba0a0bff32faefd00659e6e9b

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: release.yml on thekevinscott/dirsql

File details

Details for the file dirsql-0.3.7-cp311-cp311-macosx_10_12_x86_64.whl.

File hashes

Hashes for dirsql-0.3.7-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 934cc21cb055ab00f588cc34d5af9053353de782ccb35cb23d3a61941d3d9094
MD5 91ee413eb3fcd07826f4e49c9ab4b8b3
BLAKE2b-256 2fa97c3698d7af2a8106f44fe7ee82b28d90d4bfd6c567e033a58d0de11f91ab

Provenance

The following attestation bundles were made for dirsql-0.3.7-cp311-cp311-macosx_10_12_x86_64.whl:

Publisher: release.yml on thekevinscott/dirsql
