
Ephemeral SQL index over a local directory


dirsql (Python SDK)

Ephemeral SQL index over a local directory. Watches a filesystem, ingests structured files into an in-memory SQLite database, and exposes a SQL query interface. The database is purely in-memory -- the filesystem is always the source of truth.

Documentation

Installation

pip install dirsql

Requires Python >= 3.12. Ships as a native extension (Rust via PyO3) -- binary wheels are provided for common platforms.

Each wheel also bundles the dirsql HTTP-server CLI as a console script, so pip install dirsql also gives you a dirsql command on $PATH. See the CLI guide.

Publishing (maintainers)

Publishing is handled by .github/workflows/publish.yml, which is invoked from minor-release.yml and patch-release.yml. For each target triple, the build job compiles the Rust CLI with cargo build --features cli, stages the binary into python/dirsql/_binary/, and runs maturin build, which picks the binary up via the [tool.maturin] include rule in pyproject.toml. The wheels and sdist are then published to PyPI via trusted publishing.

Quick Start

import asyncio
import json
import os
import tempfile
from dirsql import DirSQL, Table

async def main():
    # Create some data files
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "comments", "abc"), exist_ok=True)
    os.makedirs(os.path.join(root, "comments", "def"), exist_ok=True)

    with open(os.path.join(root, "comments", "abc", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "looks good", "author": "alice"}) + "\n")
        f.write(json.dumps({"body": "needs work", "author": "bob"}) + "\n")

    with open(os.path.join(root, "comments", "def", "index.jsonl"), "w") as f:
        f.write(json.dumps({"body": "agreed", "author": "carol"}) + "\n")

    # Define a table: DDL, glob pattern, and an extract function
    db = DirSQL(
        root,
        tables=[
            Table(
                ddl="CREATE TABLE comments (id TEXT, body TEXT, author TEXT)",
                glob="comments/**/index.jsonl",
                extract=lambda path, content: [
                    {
                        "id": os.path.basename(os.path.dirname(path)),
                        "body": row["body"],
                        "author": row["author"],
                    }
                    for line in content.splitlines()
                    for row in [json.loads(line)]
                ],
            ),
        ],
    )
    await db.ready()

    # Query with SQL
    results = await db.query("SELECT * FROM comments WHERE author = 'alice'")
    # [{"id": "abc", "body": "looks good", "author": "alice"}]

asyncio.run(main())
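The extract lambda in the Quick Start can also be written as a named function, which leaves room for a docstring and for skipping blank lines. The sketch below is independent of dirsql itself; the name extract_comments and the blank-line handling are illustrative assumptions, not part of the library.

```python
import json
import os


def extract_comments(path: str, content: str) -> list[dict]:
    """Turn one comments/<id>/index.jsonl file into a list of row dicts."""
    # The parent directory name (e.g. "abc") serves as the row id.
    thread_id = os.path.basename(os.path.dirname(path))
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # tolerate blank lines, which json.loads would reject
        record = json.loads(line)
        rows.append(
            {"id": thread_id, "body": record["body"], "author": record["author"]}
        )
    return rows


sample = (
    '{"body": "looks good", "author": "alice"}\n'
    "\n"
    '{"body": "needs work", "author": "bob"}\n'
)
print(extract_comments("comments/abc/index.jsonl", sample))
```

A function like this could be passed as extract=extract_comments in place of the lambda.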

Multiple Tables and Joins

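# Fragment: runs inside an async function, with json imported and
# `root` pointing at the directory to index (as in the Quick Start).
db = DirSQL(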
    root,
    tables=[
        Table(
            ddl="CREATE TABLE posts (title TEXT, author_id TEXT)",
            glob="posts/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
        Table(
            ddl="CREATE TABLE authors (id TEXT, name TEXT)",
            glob="authors/*.json",
            extract=lambda path, content: [json.loads(content)],
        ),
    ],
)
await db.ready()

results = await db.query("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
""")
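To see what the join returns without creating any files, the same schema and query can be tried against Python's built-in sqlite3 module. This is only a stand-in for experimenting with the SQL: it does not use dirsql, and the sample rows ('Hello', 'a1', 'Alice') are made up for illustration.

```python
import sqlite3

# Plain in-memory SQLite database with the two schemas from the example above.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.execute("CREATE TABLE posts (title TEXT, author_id TEXT)")
conn.execute("CREATE TABLE authors (id TEXT, name TEXT)")
conn.execute("INSERT INTO posts VALUES ('Hello', 'a1')")
conn.execute("INSERT INTO authors VALUES ('a1', 'Alice')")

cursor = conn.execute(
    """
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON posts.author_id = authors.id
    """
)
results = [dict(row) for row in cursor]
print(results)
conn.close()
```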

Ignoring Files

Pass ignore patterns to skip files during scanning and watching:

db = DirSQL(
    root,
    ignore=["**/drafts/**", "**/.git/**"],
    tables=[...],
)

Watching for Changes

DirSQL is async by default. The watch() method returns an async iterator of row-level change events.

import asyncio
import json
from dirsql import DirSQL, Table

async def main():
    db = DirSQL(
        "/path/to/data",
        tables=[
            Table(
                ddl="CREATE TABLE items (name TEXT)",
                glob="**/*.json",
                extract=lambda path, content: [json.loads(content)],
            ),
        ],
    )
    await db.ready()

    # Query
    results = await db.query("SELECT * FROM items")

    # Watch for file changes (insert/update/delete/error events)
    async for event in db.watch():
        print(f"{event.action} on {event.table}: {event.row}")
        if event.action == "error":
            print(f"  error: {event.error}")

asyncio.run(main())
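The loop above runs for as long as events keep arriving. When only a bounded number of events is needed (in a test, for example), a small helper can stop after a fixed count. The sketch below works with any async iterator; take and fake_watch are hypothetical names, and fake_watch merely stands in for db.watch() so the example runs without dirsql.

```python
import asyncio
from typing import AsyncIterator, TypeVar

T = TypeVar("T")


async def take(events: AsyncIterator[T], limit: int) -> list[T]:
    """Collect at most `limit` items from an async iterator, then stop."""
    collected: list[T] = []
    if limit <= 0:
        return collected
    async for event in events:
        collected.append(event)
        if len(collected) >= limit:
            break
    return collected


async def fake_watch():
    """Stand-in for db.watch(): yields placeholder strings forever."""
    n = 0
    while True:
        yield f"event-{n}"
        n += 1
        await asyncio.sleep(0)  # give other tasks a chance to run


first_three = asyncio.run(take(fake_watch(), 3))
print(first_three)
```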

API Reference

Table(*, ddl, glob, extract)

Defines how files map to a SQL table.

  • ddl (str): A CREATE TABLE statement defining the schema.
  • glob (str): A glob pattern matched against file paths relative to root.
  • extract (Callable[[str, str], list[dict]]): A function receiving (relative_path, file_content) and returning a list of row dicts. Each dict's keys must match the DDL column names.
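Because row keys have to line up with the DDL columns, it can help to check an extract function's output before handing it to DirSQL. The following sketch uses only the standard sqlite3 module to read the column names out of a CREATE TABLE statement. The helpers ddl_columns and check_rows are hypothetical, and treating "match" as exact set equality is an assumption about how strict dirsql is.

```python
import sqlite3


def ddl_columns(ddl: str) -> list[str]:
    """Return the column names declared by a single CREATE TABLE statement."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(ddl)
        (table_name,) = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchone()
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        info = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
        return [row[1] for row in info]
    finally:
        conn.close()


def check_rows(ddl: str, rows: list[dict]) -> list[str]:
    """Return one message per row whose keys differ from the DDL columns."""
    expected = set(ddl_columns(ddl))
    problems = []
    for index, row in enumerate(rows):
        if set(row) != expected:
            problems.append(
                f"row {index}: expected {sorted(expected)}, got {sorted(row)}"
            )
    return problems


ddl = "CREATE TABLE comments (id TEXT, body TEXT, author TEXT)"
print(ddl_columns(ddl))
print(check_rows(ddl, [{"id": "abc", "body": "looks good"}]))
```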

DirSQL(root=None, *, tables=None, ignore=None, config=None)

Creates an in-memory SQLite database indexed from the directory at root. The constructor is sync and returns immediately; scanning runs in a background thread.

At least one of root or config must be supplied. When both root and config are passed (or config declares [dirsql].root), the explicit root wins and a warning is emitted on stderr.

  • root (str | None): Path to the directory to index. Optional when config supplies one.
  • tables (list[Table] | None): Programmatic table definitions. Appended to any tables in the config file.
  • ignore (list[str] | None): Glob patterns for paths to skip. Appended to any [dirsql].ignore patterns in the config file.
  • config (str | None): Optional path to a .dirsql.toml file. Its [[table]] entries, [dirsql].ignore, and optional [dirsql].root are merged into the constructor's inputs.

await DirSQL.ready()

Wait for the initial scan to complete. Idempotent -- safe to call multiple times. Raises any exception that occurred during init.
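The combination of a synchronous constructor and an awaitable, idempotent ready() can be modelled in plain Python. The toy class below is an illustration of that pattern under stated assumptions, not dirsql's code: BackgroundInit and its work callable are hypothetical, and the real scan happens in the Rust core.

```python
import asyncio
import threading
from concurrent.futures import Future


class BackgroundInit:
    """Toy model: the constructor returns at once, ready() awaits the result."""

    def __init__(self, work):
        self._done: Future = Future()
        # Start the work on a daemon thread so __init__ does not block.
        threading.Thread(target=self._run, args=(work,), daemon=True).start()

    def _run(self, work):
        try:
            self._done.set_result(work())
        except Exception as exc:  # stored now, re-raised by ready()
            self._done.set_exception(exc)

    async def ready(self):
        # wrap_future lets the event loop wait without blocking. Awaiting an
        # already finished future returns (or re-raises) the stored outcome.
        return await asyncio.wrap_future(self._done)


async def demo():
    ok = BackgroundInit(lambda: "scanned")
    first = await ok.ready()
    second = await ok.ready()  # calling it again is harmless

    def fail():
        raise RuntimeError("scan failed")

    bad = BackgroundInit(fail)
    try:
        await bad.ready()
        error = None
    except RuntimeError as exc:
        error = str(exc)
    return first, second, error


outcome = asyncio.run(demo())
print(outcome)
```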

await DirSQL.query(sql) -> list[dict]

Execute a SQL query. Returns a list of dicts keyed by column name. Internal tracking columns (_dirsql_*) are excluded from results.

DirSQL.watch() -> AsyncIterator[RowEvent]

Returns an async iterator that yields RowEvent objects as files change on disk. Starts the filesystem watcher on first iteration.

RowEvent

Emitted by watch() when a file change produces row-level diffs.

  • table (str): The affected table name.
  • action (str): One of "insert", "update", "delete", "error".
  • row (dict | None): The new row (for insert/update) or deleted row (for delete).
  • old_row (dict | None): The previous row (for update only).
  • error (str | None): Error message (for error events).
  • file_path (str | None): The relative file path that triggered the event.
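One way to consume these fields is to mirror events into a local cache. The sketch below uses a stand-in dataclass, FakeRowEvent, that copies the field names listed above so the example runs without dirsql. The apply_event helper and the choice to match rows by equality are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRowEvent:
    """Stand-in carrying the RowEvent fields listed above (illustration only)."""

    table: str
    action: str
    row: Optional[dict] = None
    old_row: Optional[dict] = None
    error: Optional[str] = None
    file_path: Optional[str] = None


def apply_event(tables: dict[str, list[dict]], event) -> None:
    """Mirror one event into a dict mapping table name -> list of rows."""
    if event.action == "error":
        print(f"error in {event.file_path}: {event.error}")
        return
    rows = tables.setdefault(event.table, [])
    if event.action == "insert":
        rows.append(event.row)
    elif event.action == "update":
        # Replace the first row equal to old_row; append if it is missing.
        if event.old_row in rows:
            rows[rows.index(event.old_row)] = event.row
        else:
            rows.append(event.row)
    elif event.action == "delete":
        if event.row in rows:
            rows.remove(event.row)


cache: dict[str, list[dict]] = {}
apply_event(cache, FakeRowEvent("items", "insert", row={"name": "a"}))
apply_event(
    cache, FakeRowEvent("items", "update", row={"name": "b"}, old_row={"name": "a"})
)
print(cache)
```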

How It Works

The Rust core (rusqlite + notify + walkdir) does the heavy lifting:

  1. Startup scan: Walks the directory tree, matches files to tables via glob patterns, calls the user-provided extract function for each file, and inserts rows into an in-memory SQLite database.
  2. File watching: Uses the notify crate (inotify on Linux, FSEvents on macOS) to detect file creates, modifications, and deletions.
  3. Row diffing: When a file changes, the new rows are diffed against the previous rows for that file, producing granular insert/update/delete events.
  4. Python bindings: PyO3 exposes the Rust core as a native Python extension module. The async layer runs blocking operations in a thread pool via asyncio.to_thread.
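The row-diffing step can be sketched in a few lines of Python. How dirsql actually pairs old and new rows is not stated here, so this example assumes a simple positional comparison (row i of the old file against row i of the new one). The function diff_rows and the event dict shape are hypothetical.

```python
def diff_rows(old_rows: list[dict], new_rows: list[dict]) -> list[dict]:
    """Compare two row lists position by position and describe the changes."""
    events = []
    shared = min(len(old_rows), len(new_rows))
    # Rows present at the same position in both versions: update if changed.
    for old, new in zip(old_rows, new_rows):
        if old != new:
            events.append({"action": "update", "row": new, "old_row": old})
    # Extra rows in the new version are inserts.
    for new in new_rows[shared:]:
        events.append({"action": "insert", "row": new, "old_row": None})
    # Rows that only existed in the old version are deletes.
    for old in old_rows[shared:]:
        events.append({"action": "delete", "row": old, "old_row": None})
    return events


before = [{"name": "a"}, {"name": "b"}]
after = [{"name": "a"}, {"name": "c"}, {"name": "d"}]
for event in diff_rows(before, after):
    print(event)
```

A positional diff is cheap, but inserting a row near the top of a file makes every later row look like an update. A key-based diff would avoid that at the cost of needing a stable key per row.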

License

MIT

