Ephemeral SQL index over a local directory

dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Handled by .github/workflows/publish.yml (invoked from minor-release.yml / patch-release.yml). For each target triple, the build job compiles the Rust CLI with cargo build --features cli, stages the binary into python/dirsql/_binary/, and runs maturin build, which picks up the binary via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then uploaded to PyPI via Trusted Publishing.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())

Multiple Tables and Joins

# Runs inside an async function; root, json, DirSQL, and Table are set up as in Quick Start.
db = DirSQL(
    root,
    tables=[
        Table(
            ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
            glob="posts/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
        Table(
            ddl="CREATE TABLE authors (id TEXT, name TEXT)",
            glob="authors/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
    ],
)
await db.ready()

results = await db.query("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
""")
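
The join above assumes that root already holds one JSON object per file under posts/ and authors/. Here is a minimal sketch of such a fixture, using only the standard library. The file names and field values are made up for illustration; the keys are chosen to match the DDL columns.

```python
import json
import os
import tempfile

# Hypothetical fixture data; keys mirror the DDL columns used above.
root = tempfile.mkdtemp()
authors = [{"id": "a1", "name": "Alice"}, {"id": "a2", "name": "Bob"}]
posts = [
    {"title": "Hello", "author_id": "a1"},
    {"title": "Second post", "author_id": "a2"},
]

os.makedirs(os.path.join(root, "authors"), exist_ok=True)
os.makedirs(os.path.join(root, "posts"), exist_ok=True)

# One JSON object per file, so extract can return [json.loads(content)].
for author in authors:
    with open(os.path.join(root, "authors", f"{author['id']}.json"), "w") as f:
        json.dump(author, f)

for i, post in enumerate(posts):
    with open(os.path.join(root, "posts", f"post-{i}.json"), "w") as f:
        json.dump(post, f)
```

With these files in place, each post's author_id has a matching authors.id, so the join should produce one row per post.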

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.
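
An extract callback does not have to be a lambda. The sketch below is a named function with the documented (relative_path, file_content) signature. It parses JSONL, skips blank lines, and keeps only the keys the DDL declares. The name extract_comments and the column set are illustrative, borrowed from the Quick Start schema, and the function can be exercised without dirsql installed.

```python
import json
import os

# Columns from the Quick Start DDL:
# CREATE TABLE comments (id TEXT, body TEXT, author TEXT)
COLUMNS = ("id", "body", "author")


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one JSONL file into row dicts whose keys match COLUMNS."""
    # The parent directory name serves as the row id, as in Quick Start.
    parent = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        record["id"] = parent
        # Drop any keys the table does not declare.
        rows.append({col: record.get(col) for col in COLUMNS})
    return rows
```

It would be passed as Table(ddl=..., glob="comments/**/index.jsonl", extract=extract_comments).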

DirSQL(root, *, tables, ignore=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is synchronous and returns immediately; scanning runs in a background thread.

  • root (str): Path to the directory to index.
  • tables (list[Table]): Table definitions.
  • ignore (list[str] | None): Glob patterns for paths to skip.

await DirSQL.ready()

Waits for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during initialization.
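
The pattern of a synchronous constructor paired with an awaitable ready() can be modeled in a few lines of standard-library Python. The class below is a toy stand-in, not dirsql's implementation: BackgroundInit, its scan argument, and the use of threading.Event are all assumptions made for illustration.

```python
import asyncio
import threading


class BackgroundInit:
    """Toy model: sync constructor, scan in a thread, awaitable ready()."""

    def __init__(self, scan):
        self._error = None
        self._done = threading.Event()
        # Start work immediately; the constructor itself does not block.
        self._thread = threading.Thread(target=self._run, args=(scan,), daemon=True)
        self._thread.start()

    def _run(self, scan):
        try:
            scan()
        except Exception as exc:  # remember the failure for ready()
            self._error = exc
        finally:
            self._done.set()

    async def ready(self):
        # Wait off the event loop; later calls return as soon as the event is set.
        await asyncio.to_thread(self._done.wait)
        if self._error is not None:
            raise self._error


async def demo():
    ok = BackgroundInit(lambda: None)
    await ok.ready()
    await ok.ready()  # calling again is harmless

    def failing_scan():
        raise ValueError("bad file")

    bad = BackgroundInit(failing_scan)
    try:
        await bad.ready()
    except ValueError as exc:
        return str(exc)


result = asyncio.run(demo())
```

In this model the exception raised inside the scan thread surfaces at the await site, which mirrors the documented behavior of ready().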

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.
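
To make the return shape concrete, here is a sketch that uses Python's standard sqlite3 module rather than dirsql itself. It builds an in-memory table with one bookkeeping column, then returns rows as dicts with any column whose name starts with _dirsql_ removed. The column name _dirsql_file_path and the helper query_as_dicts are hypothetical; how dirsql actually names and filters its tracking columns is not specified here.

```python
import sqlite3


def query_as_dicts(conn: sqlite3.Connection, sql: str) -> list[dict]:
    """Run sql and return rows as dicts, hiding _dirsql_* columns."""
    cursor = conn.execute(sql)
    names = [col[0] for col in cursor.description]
    return [
        {
            name: value
            for name, value in zip(names, row)
            if not name.startswith("_dirsql_")
        }
        for row in cursor.fetchall()
    ]


conn = sqlite3.connect(":memory:")
# _dirsql_file_path is a made-up tracking column for this illustration.
conn.execute(
    "CREATE TABLE comments (id TEXT, body TEXT, author TEXT, _dirsql_file_path TEXT)"
)
conn.execute(
    "INSERT INTO comments VALUES "
    "('abc', 'looks good', 'alice', 'comments/abc/index.jsonl')"
)

rows = query_as_dicts(conn, "SELECT * FROM comments WHERE author = 'alice'")
print(rows)
```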

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.

DirSQL.from_config(path) -> DirSQL

Create a DirSQL instance from a .dirsql.toml config file. Returns immediately; scanning runs in the background. Call await db.ready() before querying.

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.
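
One way to consume these events is to keep a local view in sync. The sketch below uses a stand-in dataclass, FakeRowEvent, with the same field names as RowEvent so that it runs without a watcher. Both apply_event and the list-based view are illustrative choices, not part of dirsql.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRowEvent:
    """Stand-in with the fields documented for RowEvent."""
    table: str
    action: str
    row: Optional[dict] = None
    old_row: Optional[dict] = None
    error: Optional[str] = None
    file_path: Optional[str] = None


def apply_event(view: dict, event) -> None:
    """Apply one event to view, a dict mapping table name to a list of rows."""
    rows = view.setdefault(event.table, [])
    if event.action == "insert":
        rows.append(event.row)
    elif event.action == "update":
        # old_row is the previous row; swap it for the new one.
        rows[rows.index(event.old_row)] = event.row
    elif event.action == "delete":
        rows.remove(event.row)
    elif event.action == "error":
        print(f"error in {event.file_path}: {event.error}")


view = {}
apply_event(view, FakeRowEvent("items", "insert", row={"name": "a"}))
apply_event(view, FakeRowEvent("items", "update", row={"name": "b"}, old_row={"name": "a"}))
apply_event(view, FakeRowEvent("items", "insert", row={"name": "c"}))
apply_event(view, FakeRowEvent("items", "delete", row={"name": "c"}))
```

Inside a real loop the same function could be called as apply_event(view, event) for each event yielded by db.watch().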

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
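
Step 3 can be illustrated in plain Python. The sketch below compares the old and new row lists for a single file position by position: rows that changed become updates, extra new rows become inserts, and rows that no longer exist become deletes. The positional strategy and the name diff_rows are assumptions made for illustration; the Rust core may match rows differently.

```python
def diff_rows(old: list[dict], new: list[dict]) -> list[tuple]:
    """Return (action, row, old_row) tuples describing how old becomes new."""
    events = []
    shared = min(len(old), len(new))
    # Rows present at the same position in both lists: update if changed.
    for before, after in zip(old, new):
        if before != after:
            events.append(("update", after, before))
    # Rows only in the new list are inserts.
    for row in new[shared:]:
        events.append(("insert", row, None))
    # Rows only in the old list are deletes.
    for row in old[shared:]:
        events.append(("delete", row, None))
    return events


old = [{"body": "looks good"}, {"body": "needs work"}]
new = [{"body": "looks great"}]
events = diff_rows(old, new)
```

Here events holds one update (the first row changed) and one delete (the second row disappeared), which maps onto the row and old_row fields of RowEvent.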

License

MIT

Download files

Source Distribution

  • dirsql-0.2.1.tar.gz (104.1 kB)

Built Distributions

  • dirsql-0.2.1-cp313-cp313-win_amd64.whl (5.1 MB): CPython 3.13, Windows x86-64
  • dirsql-0.2.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.13, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.1-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.13, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.1-cp313-cp313-macosx_11_0_arm64.whl (5.5 MB): CPython 3.13, macOS 11.0+ ARM64
  • dirsql-0.2.1-cp313-cp313-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.13, macOS 10.12+ x86-64
  • dirsql-0.2.1-cp312-cp312-win_amd64.whl (5.1 MB): CPython 3.12, Windows x86-64
  • dirsql-0.2.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.12, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.12, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.1-cp312-cp312-macosx_11_0_arm64.whl (5.5 MB): CPython 3.12, macOS 11.0+ ARM64
  • dirsql-0.2.1-cp312-cp312-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.12, macOS 10.12+ x86-64
  • dirsql-0.2.1-cp311-cp311-win_amd64.whl (5.1 MB): CPython 3.11, Windows x86-64
  • dirsql-0.2.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.11, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.11, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.1-cp311-cp311-macosx_11_0_arm64.whl (5.5 MB): CPython 3.11, macOS 11.0+ ARM64
  • dirsql-0.2.1-cp311-cp311-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.11, macOS 10.12+ x86-64
  • dirsql-0.2.1-cp310-cp310-win_amd64.whl (5.1 MB): CPython 3.10, Windows x86-64
  • dirsql-0.2.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.1 MB): CPython 3.10, manylinux (glibc 2.17+) x86-64
  • dirsql-0.2.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.10, manylinux (glibc 2.17+) ARM64
  • dirsql-0.2.1-cp310-cp310-macosx_11_0_arm64.whl (5.5 MB): CPython 3.10, macOS 11.0+ ARM64
  • dirsql-0.2.1-cp310-cp310-macosx_10_12_x86_64.whl (5.7 MB): CPython 3.10, macOS 10.12+ x86-64
