dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Publishing is handled by .github/workflows/publish.yml (invoked from minor-release.yml / patch-release.yml). For each target triple, the build job runs cargo build on the Rust CLI with --features cli, stages the binary into dirsql/_binary/, and runs maturin build, which picks the binary up via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then published to PyPI via Trusted Publishing.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())
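The extract lambda above can also be written as a named function, which is easier to test without touching the filesystem. The following is a minimal sketch: the name extract_comments and the blank-line handling are illustrative assumptions, not part of dirsql.

```python
import json
import os


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one comments/<id>/index.jsonl file into a list of row dicts."""
    # The parent directory name (e.g. "abc") is used as the row id.
    thread_id = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines, which json.loads would reject
        record = json.loads(line)
        rows.append(
            {"id": thread_id, "body": record["body"], "author": record["author"]}
        )
    return rows


sample = (
    '{"body": "looks good", "author": "alice"}\n'
    "\n"
    '{"body": "needs work", "author": "bob"}\n'
)
rows = extract_comments(os.path.join("comments", "abc", "index.jsonl"), sample)
print(rows)
```

A function like this could be passed as extract=extract_comments in the Table definition shown in the Quick Start.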

Multiple Tables and Joins

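# Fragment: run inside an async function, with root pointing at the data directory
# and json, DirSQL, and Table imported as in Quick Start.
db = DirSQL(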
    root,
    tables=[
        Table(
            ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
            glob="posts/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
        Table(
            ddl="CREATE TABLE authors (id TEXT, name TEXT)",
            glob="authors/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
    ],
)
await db.ready()

results = await db.query("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
""")

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.
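Because each row dict's keys must match the DDL column names, it can help to project parsed data onto the declared columns. The sketch below shows one way to do that. The helper make_json_extract is hypothetical, and filling missing keys with None assumes that dirsql accepts None for a missing value, which this section does not state.

```python
import json
from typing import Callable


def make_json_extract(columns: list[str]) -> Callable[[str, str], list[dict]]:
    """Build an extract function that emits exactly the given columns."""

    def extract(path: str, content: str) -> list[dict]:
        record = json.loads(content)
        # Drop unknown keys and fill missing ones with None, so the row
        # has exactly the columns declared in the DDL.
        return [{column: record.get(column) for column in columns}]

    return extract


extract_post = make_json_extract(["title", "author_id"])
rows = extract_post(
    "posts/hello.json",
    '{"title": "Hello", "author_id": "a1", "draft": true}',
)
print(rows)
```

The extra "draft" key in the sample input is dropped because it is not among the declared columns.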

DirSQL(root=None, *, tables=None, ignore=None, config=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is sync and returns immediately; scanning runs in a background thread.

At least one of root or config must be supplied. When both root and config are passed (or config declares [dirsql].root), the explicit root wins and a warning is emitted on stderr.

  • root (str | None): Path to the directory to index. Optional when config supplies one.
  • tables (list[Table] | None): Programmatic table definitions. Appended to any tables in the config file.
  • ignore (list[str] | None): Glob patterns for paths to skip. Appended to any [dirsql].ignore patterns in the config file.
  • config (str | None): Optional path to a .dirsql.toml file. Its [[table]] entries, [dirsql].ignore, and optional [dirsql].root are merged into the constructor's inputs.

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
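The row diffing in step 3 can be illustrated in a few lines of Python. This sketch pairs old and new rows by their position within the file, which is an assumption made for the example. How the Rust core actually matches rows is not described here. The diff_rows function and its (action, row, old_row) tuples are hypothetical.

```python
def diff_rows(old_rows: list[dict], new_rows: list[dict]) -> list[tuple]:
    """Diff two row lists for one file, pairing rows by position.

    Returns (action, row, old_row) tuples, loosely mirroring RowEvent.
    """
    events = []
    for index in range(max(len(old_rows), len(new_rows))):
        old = old_rows[index] if index < len(old_rows) else None
        new = new_rows[index] if index < len(new_rows) else None
        if old is None:
            events.append(("insert", new, None))
        elif new is None:
            events.append(("delete", old, None))
        elif old != new:
            events.append(("update", new, old))
        # Identical rows produce no event.
    return events


before = [{"name": "apple"}, {"name": "pear"}]
after = [{"name": "apple"}, {"name": "plum"}, {"name": "fig"}]
events = diff_rows(before, after)
for event in events:
    print(event)
```

In this example the unchanged first row yields nothing, the second row yields an update, and the third yields an insert.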

License

MIT

Download files

Source Distribution

  • dirsql-0.3.6.tar.gz (239.1 kB): Source

Built Distributions

  • dirsql-0.3.6-cp313-cp313-win_amd64.whl (5.0 MB): CPython 3.13, Windows x86-64
  • dirsql-0.3.6-cp313-cp313-manylinux_2_34_x86_64.whl (6.0 MB): CPython 3.13, manylinux: glibc 2.34+ x86-64
  • dirsql-0.3.6-cp313-cp313-manylinux_2_34_aarch64.whl (5.9 MB): CPython 3.13, manylinux: glibc 2.34+ ARM64
  • dirsql-0.3.6-cp313-cp313-macosx_11_0_arm64.whl (5.2 MB): CPython 3.13, macOS 11.0+ ARM64
  • dirsql-0.3.6-cp313-cp313-macosx_10_12_x86_64.whl (5.5 MB): CPython 3.13, macOS 10.12+ x86-64
  • dirsql-0.3.6-cp312-cp312-win_amd64.whl (5.0 MB): CPython 3.12, Windows x86-64
  • dirsql-0.3.6-cp312-cp312-manylinux_2_34_x86_64.whl (6.0 MB): CPython 3.12, manylinux: glibc 2.34+ x86-64
  • dirsql-0.3.6-cp312-cp312-manylinux_2_34_aarch64.whl (5.9 MB): CPython 3.12, manylinux: glibc 2.34+ ARM64
  • dirsql-0.3.6-cp312-cp312-macosx_11_0_arm64.whl (5.2 MB): CPython 3.12, macOS 11.0+ ARM64
  • dirsql-0.3.6-cp312-cp312-macosx_10_12_x86_64.whl (5.5 MB): CPython 3.12, macOS 10.12+ x86-64
  • dirsql-0.3.6-cp311-cp311-win_amd64.whl (5.0 MB): CPython 3.11, Windows x86-64
  • dirsql-0.3.6-cp311-cp311-manylinux_2_34_x86_64.whl (6.0 MB): CPython 3.11, manylinux: glibc 2.34+ x86-64
  • dirsql-0.3.6-cp311-cp311-manylinux_2_34_aarch64.whl (5.9 MB): CPython 3.11, manylinux: glibc 2.34+ ARM64
  • dirsql-0.3.6-cp311-cp311-macosx_11_0_arm64.whl (5.2 MB): CPython 3.11, macOS 11.0+ ARM64
  • dirsql-0.3.6-cp311-cp311-macosx_10_12_x86_64.whl (5.5 MB): CPython 3.11, macOS 10.12+ x86-64

File details

Details for the file dirsql-0.3.6.tar.gz.

File metadata

  • Download URL: dirsql-0.3.6.tar.gz
  • Upload date:
  • Size: 239.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dirsql-0.3.6.tar.gz

  • SHA256: 968acb16fe02743cfce578c80d3f950780c340a4db9335d5db0f4a2dced1aefb
  • MD5: 271989fa9b14241221aa4317e8434e07
  • BLAKE2b-256: c2be0bdc1cce624073ad128ef7a794d5ff685bcafdb15e1afff57989b917af4f

See more details on using hashes here.

Provenance

The following attestation bundles were made for dirsql-0.3.6.tar.gz:

Publisher: release.yml on thekevinscott/dirsql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
