Skip to main content

MDQL — a queryable database where every entry is a markdown file

Project description

MDQL

A database where every entry is a markdown file and every change is a readable diff.

MDQL turns folders of markdown files into a schema-validated, queryable database. Frontmatter fields are metadata columns. H2 sections are content columns. The files are the database — there is nothing else. Every file reads like a normal markdown document, but you get full SQL: SELECT, INSERT, UPDATE, DELETE, JOINs across multiple tables, CTEs, views, ORDER BY, aggregation, computed expressions, and CASE WHEN.

Your database lives in git. Every insert, update, and migration is a readable diff. Branching, merging, and rollback come free.

Install

cargo install mdql          # from source via Cargo
brew install mdql-db/tap/mdql  # macOS / Linux via Homebrew
pip install mdql             # Python bindings

Quick start

mdql validate examples/strategies/
# All 100 files valid in table 'strategies'

mdql query examples/strategies/ \
  "SELECT title, composite FROM strategies ORDER BY composite DESC LIMIT 5"
title                                                                composite
-------------------------------------------------------------------  ---------
Bridge Inflow to Destination Chain → DEX Liquidity Pressure                500
DeFi Protocol TVL Step-Change → Governance Token Repricing Lag             500
Lending Protocol Daily Interest Accrual Liquidation Threshold Creep        500
USDC Circle Business-Day Redemption Queue — Weekend Premium Decay          490
Cascading Liquidation Chain — Second-Order Collateral Asset Short          480

Why MDQL

  • Zero infrastructure. No server, no Docker, no connection strings. git clone and you have the database. rm -rf and it's gone.
  • Data review via pull requests. Data changes go through the same PR review process as code. A reviewer reads the diff of an INSERT the way they read a code change.
  • Branch-level isolation. An agent works on a feature branch, inserts and updates entries freely, and the main database is untouched until merge. Multiple agents work in parallel without coordination.
  • No serialization boundary. The storage format is the readable format. An LLM sees a well-structured markdown document, not a JSON blob or SQL dump.
  • Graceful degradation. If you stop using MDQL tomorrow, you still have a folder of valid markdown files. No proprietary format to export from.
  • Section-level content columns. Long-form structured prose — a hypothesis, a methodology, kill criteria — is a first-class queryable column. SELECT Hypothesis FROM strategies WHERE status = 'LIVE'.
  • Every unix tool still works. grep -r "funding" strategies/ works. wc -l strategies/*.md works. diff works.
  • Self-documenting schemas. The schema file is a markdown document. Its body explains the fields, conventions, and rationale. An LLM reading _mdql.md gets both the machine-readable schema and the human context for why fields exist.
  • Schema migrations are diffs. ALTER TABLE RENAME FIELD rewrites every file. The migration shows up as a git diff.
  • Audit trail for free. git blame strategies/bad-debt-socialization-event-token-short.md tells you who changed what and when.

Directory structure

my-project/
  _mdql.md                    # type: database — config + foreign keys
  strategies/
    _mdql.md                  # type: schema — table schema + docs
    bad-debt-socialization-event-token-short.md
    aave-utilization-kink-rate-spike-borrow-unwind-short.md
    ...
  backtests/
    _mdql.md                  # type: schema
    bt-bad-debt-socialization-binance.md
    ...
  src/                        # no _mdql.md — invisible to MDQL
  docs/                       # no _mdql.md — invisible to MDQL

A _mdql.md file marks a directory as part of an MDQL database. The type field in frontmatter determines what it is — database at the root, schema in each table folder. Directories without _mdql.md are ignored, so MDQL coexists with any project structure.

How it works

One folder = one table. One markdown file = one row.

A row file looks like this:

---
title: "Bad Debt Socialization Event — Token Short"
status: HYPOTHESIS
mechanism: 7
categories:
  - defi-protocol
  - lending
created: "2026-04-03T14:22:01"
modified: "2026-04-05T09:15:33"
---

## Hypothesis

When an on-chain lending protocol accumulates bad debt that exceeds
its reserve buffer, the smart contract mints governance tokens...

## Structural Mechanism

The protocol's shortfall module triggers an auction...
  • Frontmatter = metadata columns (title, status, mechanism, ...) — typed, validated, queryable
  • H2 sections = content columns (Hypothesis, Structural Mechanism, ...) — queryable long-form text
  • H1 = decorative only. Not queryable, not stored as a column. Present for human readability in editors and GitHub rendering (standard markdown convention: one H1 per document as the title)
  • Loose body (text not under any H2) = rejected. All content must live under an ## Heading to be queryable. This prevents silent data loss.
  • The path (filename) is the implicit primary key
  • created and modified are reserved datetime fields (ISO 8601, e.g. "2026-04-03T14:22:01"), auto-managed by mdql stamp
  • All columns are queryable with SQL

_mdql.md files

Every MDQL-managed directory has a _mdql.md file. The type field in frontmatter says what kind.

Table schema (type: schema)

---
type: schema
table: strategies
primary_key: path

frontmatter:
  title:
    type: string
    required: true
  mechanism:
    type: int
    required: true
  categories:
    type: string[]
    required: true

h1:
  required: false

sections: {}

rules:
  reject_unknown_frontmatter: true
  reject_unknown_sections: false
  reject_duplicate_sections: true
---

# strategies

Documentation about this table goes here.

Supported types: string, int, float, bool, date, datetime, string[], dict

The dict type stores a YAML mapping. Values can be scalars, lists, or nested dicts. Use dot-access in queries: SELECT params.entry_days FROM strategies.

params:
  threshold: 0.5
  blocked_tokens:
    - ZK
    - W
  enabled: true

Database config (type: database)

---
type: database
name: zunid

foreign_keys:
  - from: backtests.strategy
    to: strategies.path

views:
  - name: live_strategies
    query: "SELECT * FROM strategies WHERE status = 'LIVE'"
  - name: strategy_performance
    query: "SELECT s.title, b.sharpe FROM strategies s JOIN backtests b ON b.strategy = s.path"
---

# zunid

Trading strategy research database.

The markdown body in both cases is documentation — ignored by the engine, useful for humans and LLMs.

Foreign key validation

Foreign keys defined in the database config are validated automatically. No setup required.

At load time: Every call to load_database() checks all FK constraints. If backtests.strategy references a file that does not exist in strategies.path, the error is returned alongside the data. CLI commands (query, validate, repl) print FK warnings to stderr.

In the REPL: A filesystem watcher runs in the background. If you rename or delete a file in another terminal, the REPL detects the change within 500ms and prints any new FK violations.

In the web UI: Same filesystem watcher runs as a background task. FK errors are available at GET /api/fk-errors.

With mdql validate: When pointed at a database directory (not just a single table), reports per-table schema validation summaries followed by FK violations:

mdql validate examples/
Table 'strategies': 100 files valid
Table 'backtests': 18 files valid
Foreign key violations:
  backtests/bt-broken.md: strategy = 'nonexistent.md' not found in strategies

NULL FK values are not violations — a backtest with no strategy set is valid.

Views

Views are named queries that act as virtual tables. Define them with standard SQL, query them like any other table.

Creating and dropping views

# Create a view (persisted in _mdql.md)
mdql query examples/ "CREATE VIEW live AS SELECT * FROM strategies WHERE status = 'LIVE'"

# Query the view like a regular table
mdql query examples/ "SELECT title, mechanism FROM live ORDER BY mechanism DESC"

# Drop a view
mdql query examples/ "DROP VIEW live"

# Views support the full query syntax: GROUP BY, HAVING, aggregate arithmetic
mdql query examples/ "CREATE VIEW positions AS SELECT token, SUM(CASE WHEN side = 'BUY' THEN size ELSE 0 END) - SUM(CASE WHEN side = 'SELL' THEN size ELSE 0 END) as net FROM orders GROUP BY token HAVING net > 0"

Views require a database directory (not a single table folder). They are stored in the views: section of the database-level _mdql.md and re-executed dynamically on each query — no cached data on disk.

Restrictions

  • Read-only. INSERT INTO, UPDATE, and DELETE FROM a view return a clear error.
  • No view-to-view references. A view query cannot reference another view.
  • Name conflicts. A view cannot have the same name as a physical table.

Python API

db = Database("examples/")

# Create and drop views
db.execute("CREATE VIEW live AS SELECT * FROM strategies WHERE status = 'LIVE'")
db.execute("DROP VIEW live")

# Query a view (same as querying a table)
rows, columns = db.query("SELECT * FROM live")

# List view names
db.view_names  # ['live', ...]

Schema display

mdql schema examples/
# ...
# Views:
#   live = SELECT * FROM strategies WHERE status = 'LIVE'

CTEs (Common Table Expressions)

Use WITH ... AS to define temporary named result sets within a query. CTEs are materialized in order, so later CTEs can reference earlier ones.

# Single CTE
mdql query examples/ "WITH live AS (SELECT * FROM strategies WHERE status = 'LIVE') SELECT * FROM live ORDER BY title"

# Multiple CTEs with JOIN
mdql query examples/ "WITH s AS (SELECT * FROM strategies WHERE status = 'LIVE'), b AS (SELECT * FROM backtests WHERE sharpe > 1.0) SELECT s.title, b.sharpe FROM s JOIN b ON b.strategy = s.path"

# Chained CTEs — second CTE references the first
mdql query examples/ "WITH good AS (SELECT * FROM backtests WHERE sharpe > 1.0), matched AS (SELECT s.title, g.sharpe FROM strategies s JOIN good g ON g.strategy = s.path) SELECT * FROM matched"

# CTE with aggregation
mdql query examples/ "WITH counts AS (SELECT status, COUNT(*) AS cnt FROM strategies GROUP BY status) SELECT * FROM counts WHERE cnt > 1"
# Python API
rows, columns = db.query(
    "WITH live AS (SELECT * FROM strategies WHERE status = 'LIVE') "
    "SELECT * FROM live ORDER BY title"
)

CTEs require a database directory. A CTE can shadow a physical table name — the CTE version takes precedence within the query.

Python API

pip install mdql

Database and Table

from mdql import Database, Table

db = Database("examples/")
strategies = db.table("strategies")

SELECT with JOINs

Database.query() runs SQL across all tables in the database, including INNER JOIN and LEFT JOIN.

rows, columns = db.query(
    "SELECT s.title, b.sharpe, b.status "
    "FROM strategies s "
    "JOIN backtests b ON b.strategy = s.path"
)

# LEFT JOIN keeps all left rows, fills NULLs for unmatched
rows, columns = db.query(
    "SELECT s.title, b.sharpe "
    "FROM strategies s "
    "LEFT JOIN backtests b ON b.strategy = s.path"
)

# Compound ON conditions — filter during join
rows, columns = db.query(
    "SELECT s.title, b.sharpe "
    "FROM strategies s "
    "LEFT JOIN backtests b ON b.strategy = s.path AND b.mode = 'PAPER'"
)

Single-table queries

Table.query() runs a SELECT query on one table and returns structured results.

rows, columns = strategies.query(
    "SELECT status, COUNT(*) AS cnt FROM strategies GROUP BY status"
)
# rows: list of dicts
# columns: list of column names

# Computed expressions and CASE WHEN
rows, columns = strategies.query(
    "SELECT title, mechanism * safety score, "
    "CASE WHEN mechanism >= 7 THEN 'high' ELSE 'low' END tier "
    "FROM strategies ORDER BY score DESC"
)

# Conditional aggregation
rows, columns = strategies.query(
    "SELECT SUM(CASE WHEN status = 'LIVE' THEN 1 ELSE 0 END) live_count, "
    "COUNT(*) total FROM strategies"
)

Load rows with filtering

Table.load() returns all rows, optionally filtered by a dict of field values.

# All rows
rows, errors = strategies.load()

# Filtered by dict — equality matching
rows, errors = strategies.load(where={"status": "LIVE"})

# Filtered by SQL WHERE string — full operator support
rows, errors = strategies.load(where="mechanism >= 7 AND status = 'HYPOTHESIS'")
rows, errors = strategies.load(where="categories LIKE '%defi%'")

The where parameter accepts a dict (equality matching) or a SQL WHERE string (supports =, !=, <, >, <=, >=, LIKE, IN, IS NULL, AND, OR). errors contains any schema validation issues found during loading.

INSERT

# Create a new row — filename derived from title
strategies.insert({
    "title": "My New Strategy",
    "status": "HYPOTHESIS",
    "mechanism": 5,
    "implementation": 4,
    "safety": 7,
    "frequency": 3,
    "composite": 420,
    "categories": ["exchange-structure"],
    "pipeline_stage": "Pre-backtest (step 2 of 9)",
})
# Returns: Path to created file (e.g. my-new-strategy.md)
# created/modified timestamps set automatically
# required sections scaffolded as empty ## headings
# validated against schema before writing

# With pre-formatted body (e.g. from Claude output)
strategies.insert(
    {"title": "Another Strategy", "status": "HYPOTHESIS", ...},
    body=raw_markdown,  # placed verbatim after frontmatter
)

# Overwrite existing file, preserve created timestamp
strategies.insert(
    {"title": "Revised Strategy", "status": "BACKTESTING", ...},
    filename="my-new-strategy",
    replace=True,
)

UPDATE

# Partial merge — only the fields you pass are changed
strategies.update("my-new-strategy.md", {"status": "KILLED", "kill_reason": "No edge"})

# Update body only
strategies.update("my-new-strategy.md", {}, body=new_markdown)

Bulk UPDATE

Table.update_many() updates the same fields across multiple files.

updated_paths = strategies.update_many(
    ["file-a.md", "file-b.md", "file-c.md"],
    {"status": "KILLED"},
)
# Returns: list of paths that were updated

DELETE

# Delete a single file
strategies.delete("my-new-strategy.md")

# CASCADE — delete row and all FK-dependent rows (backtests, events, etc.)
db.delete("strategies", "status = 'KILLED'", cascade=True)

# RESTRICT — error if any FK-dependent rows exist
db.delete("strategies", "path = 'alpha.md'", restrict=True)

# Dry-run — preview the cascade/restrict plan without executing
plan = db.delete("strategies", "status = 'KILLED'", cascade=True, dry_run=True)
# Returns dict: {"primary_deletes": [...], "cascade_actions": [...], "restrict_violations": [...]}

# Via SQL
db.execute("DELETE FROM strategies WHERE status = 'KILLED' CASCADE")
db.execute("DELETE FROM strategies WHERE path = 'alpha.md' RESTRICT")

CASCADE walks the FK graph via BFS. Scalar FKs (e.g. backtests.strategy -> strategies.path) trigger dependent row deletion. List FKs (e.g. string[] columns) prune the deleted value from the list without deleting the row. RESTRICT checks for any dependent references and blocks the delete if found.

Rename

# Rename an entry — cascades FK references in other tables
db.rename("strategies", "old-name.md", "new-name.md")

# Rename a table — updates schema + FK config
db.rename_table("strategies", "strats")

Schema operations

table = Table("examples/strategies/")

table.rename_field("Summary", "Overview")     # section or frontmatter
table.drop_field("Details")                   # section or frontmatter
table.merge_fields(["Entry Rules", "Exit Rules"], into="Trading Rules")  # sections only

Validation

errors = strategies.validate()
# Returns: list of validation errors (schema + FK)

All writes are validated against the schema and rolled back on failure. The created timestamp is always preserved on replace and update; modified is always set to today.

CLI commands

mdql --version

Print the installed version.

mdql --version
# mdql 0.5.25

mdql query <folder> "<sql>"

Run SQL against a table or database. Supports SELECT, INSERT INTO, UPDATE SET, DELETE FROM, ALTER TABLE, JOIN, CREATE VIEW, and DROP VIEW.

# Filter and sort
mdql query examples/strategies/ \
  "SELECT title FROM strategies WHERE mechanism > 5 ORDER BY composite DESC LIMIT 5"

# Query section content
mdql query examples/strategies/ \
  "SELECT path, Hypothesis FROM strategies WHERE Hypothesis IS NOT NULL LIMIT 3"

# Category search (LIKE works on arrays)
mdql query examples/strategies/ \
  "SELECT title FROM strategies WHERE categories LIKE '%defi%'"

# Output as JSON
mdql query examples/strategies/ \
  "SELECT title, composite FROM strategies LIMIT 3" --format json

Supported WHERE operators: =, !=, <, >, <=, >=, LIKE, IN, IS NULL, IS NOT NULL, AND, OR

Column names with spaces use backticks: SELECT `Structural Mechanism` FROM strategies

Computed expressions

Arithmetic expressions (+, -, *, /, %) work in SELECT, WHERE, and ORDER BY. Supports parentheses, unary minus, and mixed int/float coercion.

# Computed columns with aliases
mdql query examples/strategies/ \
  "SELECT title, mechanism * safety total_score FROM strategies ORDER BY total_score DESC LIMIT 5"

# Expressions in WHERE
mdql query examples/strategies/ \
  "SELECT title FROM strategies WHERE mechanism + implementation > 10"

# Parenthesized expressions
mdql query examples/strategies/ \
  "SELECT title, (mechanism + implementation) / 2 avg_score FROM strategies"

Integer division truncates (7 / 2 = 3). Division by zero returns NULL. NULL propagates through all arithmetic.

Column aliases

Columns can be aliased with AS or by placing the alias directly after the expression (implicit alias). ORDER BY can reference SELECT aliases.

# Explicit alias with AS
mdql query examples/ \
  "SELECT s.title AS name, b.sharpe AS ratio FROM strategies s JOIN backtests b ON b.strategy = s.path"

# Implicit alias (no AS keyword)
mdql query examples/ \
  "SELECT s.composite comp, b.edge_vs_random edge FROM strategies s JOIN backtests b ON b.strategy = s.path ORDER BY edge DESC"

CASE WHEN

CASE WHEN expressions work anywhere a value is expected — in SELECT, WHERE, ORDER BY, and inside aggregate functions.

# Categorize rows
mdql query examples/strategies/ \
  "SELECT title, CASE WHEN mechanism >= 7 THEN 'high' WHEN mechanism >= 4 THEN 'medium' ELSE 'low' END rating FROM strategies"

# Conditional aggregation
mdql query examples/strategies/ \
  "SELECT COUNT(*) total, SUM(CASE WHEN mechanism >= 7 THEN 1 ELSE 0 END) high_mechanism FROM strategies"

GROUP BY, HAVING, and aggregation

# Count by status
mdql query examples/strategies/ \
  "SELECT status, COUNT(*) cnt FROM strategies GROUP BY status"

# HAVING filters groups after aggregation
mdql query examples/strategies/ \
  "SELECT status, COUNT(*) cnt FROM strategies GROUP BY status HAVING COUNT(*) > 10"

Aggregate arithmetic — combine aggregates with +, -, *, /:

# Net position per token
mdql query examples/ \
  "SELECT token, SUM(CASE WHEN side = 'BUY' THEN size ELSE 0 END) - SUM(CASE WHEN side = 'SELL' THEN size ELSE 0 END) as net FROM orders GROUP BY token"

# Average via SUM/COUNT
mdql query examples/ \
  "SELECT SUM(mechanism) / COUNT(*) as avg_mechanism FROM strategies"

Supported aggregate functions: COUNT(*), COUNT(col), SUM(expr), AVG(expr), MIN(expr), MAX(expr).

Subqueries

Use a subquery in the FROM clause to compute derived values in a single query:

mdql query examples/ \
  "SELECT token, sell_size, buy_size FROM (SELECT token, SUM(CASE WHEN side = 'SELL' THEN size ELSE 0 END) as sell_size, SUM(CASE WHEN side = 'BUY' THEN size ELSE 0 END) as buy_size FROM orders GROUP BY token) WHERE sell_size > buy_size"

Subqueries support the full SELECT syntax including WHERE, GROUP BY, HAVING, and ORDER BY.

Date arithmetic

# Rows created in the last 30 days
mdql query examples/strategies/ \
  "SELECT title, created FROM strategies WHERE created >= CURRENT_DATE - INTERVAL 30 DAYS"

# Days since creation
mdql query examples/strategies/ \
  "SELECT title, DATEDIFF(CURRENT_DATE, created) days_old FROM strategies ORDER BY days_old DESC LIMIT 5"

# Future date calculation
mdql query examples/strategies/ \
  "SELECT title, modified + INTERVAL 7 DAY review_due FROM strategies"
  • CURRENT_DATE — today's date
  • CURRENT_TIMESTAMP — current datetime
  • DATEDIFF(date1, date2) — returns number of days between two dates (date1 - date2)
  • date + INTERVAL N DAY / date - INTERVAL N DAYS — add or subtract days from a date or datetime

JOINs

Point at the database directory (parent of table folders) for cross-table queries. Supports INNER JOIN and LEFT JOIN with two or more tables:

# Two-table JOIN
mdql query examples/ \
  "SELECT s.title, b.sharpe, b.status
   FROM strategies s
   JOIN backtests b ON b.strategy = s.path"

# LEFT JOIN — keeps all left rows, fills NULLs for unmatched right rows
mdql query examples/ \
  "SELECT s.title, b.sharpe
   FROM strategies s
   LEFT JOIN backtests b ON b.strategy = s.path"

# Multi-table JOIN (mix INNER and LEFT)
mdql query my-db/ \
  "SELECT s.title, b.result, c.verdict
   FROM strategies s
   JOIN backtests b ON b.strategy = s.path
   LEFT JOIN critiques c ON c.strategy = s.path"

# Compound ON conditions (AND/OR)
mdql query examples/ \
  "SELECT s.title, b.sharpe
   FROM strategies s
   LEFT JOIN backtests b ON b.strategy = s.path AND b.mode = 'PAPER'"

mdql query examples/ \
  "SELECT s.title, b.sharpe
   FROM strategies s
   JOIN backtests b ON b.strategy = s.path AND b.sharpe > 1.0"

SQL write operations

# INSERT
mdql query examples/strategies/ \
  "INSERT INTO strategies (title, status, mechanism, implementation, safety, frequency, composite, categories, pipeline_stage)
   VALUES ('New Strategy', 'HYPOTHESIS', 5, 4, 7, 3, 420, 'exchange-structure', 'Pre-backtest')"

# UPDATE
mdql query examples/strategies/ \
  "UPDATE strategies SET status = 'KILLED', kill_reason = 'No edge' WHERE path = 'new-strategy.md'"

# DELETE
mdql query examples/strategies/ \
  "DELETE FROM strategies WHERE path = 'new-strategy.md'"

# CASCADE DELETE — removes row and all FK-dependent rows
mdql query examples/ \
  "DELETE FROM strategies WHERE status = 'KILLED' CASCADE"
# DELETE 3 (cascade: 7 deleted, 2 list refs pruned)

# RESTRICT DELETE — errors if dependent rows exist
mdql query examples/ \
  "DELETE FROM strategies WHERE path = 'alpha.md' RESTRICT"

# Dry-run — preview what would happen without executing
mdql query examples/ --dry-run \
  "DELETE FROM strategies WHERE status = 'KILLED' CASCADE"

For string[] columns, pass comma-separated values in a single string: 'funding-rates,defi'.

ALTER TABLE — field migrations

Rename, drop, or merge fields across all files in a table. Works for both frontmatter fields and sections. The schema _mdql.md is updated automatically.

mdql query examples/strategies/ \
  "ALTER TABLE strategies RENAME FIELD 'Summary' TO 'Overview'"
# ALTER TABLE — renamed 'Summary' to 'Overview' in 42 files

mdql query examples/strategies/ \
  "ALTER TABLE strategies DROP FIELD 'Details'"

mdql query examples/strategies/ \
  "ALTER TABLE strategies MERGE FIELDS 'Entry Rules', 'Exit Rules' INTO 'Trading Rules'"

Field names can be single-quoted ('Name'), backtick-quoted (`Name With Spaces`), or bare identifiers.

mdql rename <db-folder> <table> <old-name> <new-name>

Rename a file within a table. Automatically updates all foreign key references in other tables that point to the old filename.

mdql rename examples/ strategies bad-debt-socialization-event-token-short.md bad-debt-token-short.md
# Renamed strategies/bad-debt-socialization-event-token-short.md → bad-debt-token-short.md
# Updated 3 references in backtests

mdql rename-table <db-folder> <old-name> <new-name>

Rename a table directory. Updates the schema table: field and all foreign key references in the database config.

mdql rename-table examples/ strategies strats
# RENAME TABLE strategies → strats

mdql create <folder> --set key=value

Create a new row file. Field types are coerced from the schema (e.g. --set mechanism=5 becomes int).

mdql create examples/strategies/ \
  -s 'title=My New Strategy' \
  -s 'status=HYPOTHESIS' \
  -s 'mechanism=5' \
  -s 'implementation=4' \
  -s 'safety=7' \
  -s 'frequency=3' \
  -s 'composite=420' \
  -s 'categories=exchange-structure' \
  -s 'pipeline_stage=Pre-backtest (step 2 of 9)'

For string[] fields, use comma-separated values: -s 'categories=funding-rates,defi'

mdql validate <folder>

Validate all markdown files against the schema. Works on a single table or a database directory.

mdql validate examples/strategies/
# All 100 files valid in table 'strategies'

Invalid files get clear error messages:

missing-field.md: Missing required frontmatter field 'count'
wrong-type-date.md: Field 'created' expected datetime (ISO 8601), got string 'yesterday'
duplicate-section.md: Duplicate section 'Body' (appears 2 times)
loose-note.md: Body content not under an H2 section is not allowed; wrap in ## heading

Files with body content not under an H2 heading are rejected (the row is excluded from query results). All prose must be wrapped in ## Heading sections to be queryable. This prevents silent data loss where text exists on disk but is invisible to MDQL queries.

When pointed at a database directory, also reports foreign key violations (see Foreign key validation).

Exits non-zero on any validation error, so it works as a CI gate or git pre-commit hook:

# Silent mode for CI — exit code only, no output
mdql validate --quiet

# Explicit --strict flag (same behavior as default, accepted for clarity)
mdql validate --strict examples/

Pre-commit hook (.git/hooks/pre-commit):

#!/bin/sh
mdql validate --quiet

mdql inspect <folder>

Show normalized rows.

mdql inspect examples/strategies/ -f bad-debt-socialization-event-token-short.md --format json

mdql stamp <folder>

Add or update created and modified timestamps in all data files.

mdql stamp examples/strategies/
# Stamped 100 files: 0 created set, 100 modified updated
  • created is set to the current ISO 8601 timestamp if missing, never overwritten
  • modified is always updated to the current ISO 8601 timestamp
  • Both are ISO datetime strings ("YYYY-MM-DDTHH:MM:SS") in frontmatter
  • These fields are reserved — schemas don't need to declare them, and they are never rejected as unknown fields

mdql schema <folder>

Print the effective schema. Works on a single table or the whole database.

mdql schema examples/

mdql repl <folder>

Open an interactive REPL for running queries. Supports tab completion for table names, column names, and SQL keywords.

mdql repl examples/

When pointed at a database directory, runs a background filesystem watcher that prints FK violations to stderr if files change on disk while the REPL is open.

mdql client <folder>

Open a browser-based UI for running queries. Starts a local web server with a query editor.

mdql client examples/

The web server exposes a REST API:

  • POST /api/query — execute SQL
  • GET /api/fk-errors — current foreign key violations (updated by background watcher)

mdql checksums <folder>

Generate or verify sidecar checksums for tamper detection. Each table gets a _checksums.json file tracking xxHash64 of every row file.

# Generate checksums from current file state
mdql checksums examples/strategies/ --regenerate

# Verify files against stored checksums
mdql checksums examples/strategies/ --verify

When checksums exist, MDQL automatically updates them on insert/update/delete. On load, files that don't match their checksum are flagged with _modified_externally: true in the row data.

from mdql.migrate import regenerate_checksums

regenerate_checksums("examples/strategies/")

rows, errors = strategies.load()
modified = [r for r in rows if r.get("_modified_externally")]

Multi-agent setup

MDQL is a single-writer, filesystem-based database. When multiple agents or processes need to read and write the same data, point them all at the same directory. MDQL's flock locking serializes writes automatically.

For multi-agent setups, keep the database in its own directory (and optionally its own git repo for audit trail), separate from application code:

~/repos/
  my-project/         # application code — branched freely
  my-project-db/      # MDQL database — shared by all agents
    _mdql.md
    strategies/
    orders/

MDQL_DATABASE_PATH

Set the MDQL_DATABASE_PATH environment variable so agents and CLI commands find the database without hardcoding paths.

export MDQL_DATABASE_PATH=~/repos/my-project-db

# CLI commands fall back to this when no folder is given
mdql validate
mdql repl
from mdql import Database

# Reads MDQL_DATABASE_PATH when no path is given
db = Database()

An explicit path always takes precedence: Database("/other/path") and mdql validate /other/path ignore the env var.

Pandas integration

pip install mdql[pandas]

One-liner

from mdql.pandas import load_dataframe

df = load_dataframe("examples/strategies/")

Two-step (when you already have rows)

from mdql.loader import load_table
from mdql.pandas import to_dataframe

schema, rows, errors = load_table("examples/strategies/")
df = to_dataframe(rows, schema)

Schema types map to pandas dtypes:

MDQL type pandas dtype
string string
int Int64 (nullable)
float Float64 (nullable)
bool boolean (nullable)
date datetime64[ns]
datetime datetime64[ns]
string[] Python lists
dict Python dicts

Validation errors are handled via the errors parameter: "warn" (default), "raise", or "ignore".

ACID compliance

All write operations are process-safe. Three layers of protection:

Atomic writes. Every file write goes through a temp-file-then-rename path. If the process crashes mid-write, the original file is untouched.

Table locking. Write operations acquire an exclusive fcntl.flock per table. Two processes writing to the same table serialize rather than corrupt each other's files.

Write-ahead journal. Multi-file operations (ALTER TABLE, batch UPDATE/DELETE, stamp) write a journal before making changes. If the process crashes mid-operation, the next Table() construction detects the journal and rolls back all partial changes automatically.

# Safe even if the process is killed mid-way:
table.rename_field("Summary", "Overview")  # touches 100 files + schema
# On crash: next Table("strategies/") auto-recovers from journal

Running tests

# Rust tests
cargo test

# Python tests (requires maturin develop first)
pytest

Project structure

crates/
  mdql-core/        # core library: parser, schema, validator, query engine,
                     # indexes, caching, full-text search, ACID transactions,
                     # FK validation, filesystem watcher
  mdql/             # CLI binary: validate, query, create, inspect, schema,
                     # stamp, rename, repl (with autocomplete), client (web UI)
  mdql-web/         # browser UI: axum REST server + embedded SPA
python/
  src/lib.rs        # PyO3 bindings (Rust → Python)
  mdql/             # Python wrapper package (thin layer over Rust)
tests/              # Python test suite
examples/           # example data (strategies, backtests)

License

AGPL-3.0. Commercial licenses available — see LICENSE.md.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mdql-0.5.28.tar.gz (141.7 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

mdql-0.5.28-cp312-cp312-win_amd64.whl (1.6 MB view details)

Uploaded CPython 3.12Windows x86-64

mdql-0.5.28-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

mdql-0.5.28-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.9 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ ARM64

mdql-0.5.28-cp312-cp312-macosx_11_0_arm64.whl (1.7 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

mdql-0.5.28-cp312-cp312-macosx_10_12_x86_64.whl (1.7 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

File details

Details for the file mdql-0.5.28.tar.gz.

File metadata

  • Download URL: mdql-0.5.28.tar.gz
  • Upload date:
  • Size: 141.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mdql-0.5.28.tar.gz
Algorithm Hash digest
SHA256 6b7aa28002abe3a131866d8459d12a5a5b257767ae249e5a6f8005a4bcb562b0
MD5 bfa45fcb7ab69b09de6e1b898d902ed6
BLAKE2b-256 f3d76765b3d6ec690f1d70a777798ed62ccdd62066caf92cd5fc88954fc9f45a

See more details on using hashes here.

Provenance

The following attestation bundles were made for mdql-0.5.28.tar.gz:

Publisher: release.yml on mdql-db/mdql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mdql-0.5.28-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: mdql-0.5.28-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 1.6 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mdql-0.5.28-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 c734b63d14bc616f99b38c9b58e9496e67eb00fccbcef3f6f0f5f0082f122f22
MD5 bad07b646ac08478f144873341a2100e
BLAKE2b-256 a697f06d3207a56d60dd3bcde31b981f868fcdf0d7fd529ede5e4ee24075ce27

See more details on using hashes here.

Provenance

The following attestation bundles were made for mdql-0.5.28-cp312-cp312-win_amd64.whl:

Publisher: release.yml on mdql-db/mdql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mdql-0.5.28-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for mdql-0.5.28-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 08085b137948cb30beee66c0275bd14d12d8846b4732fa21d4e47c5a59198b1c
MD5 a5a0b28cb1f1cb48a8141b6e5c024f04
BLAKE2b-256 02c97257b095b26ec4679ef6613e89decf357e1e700bdcca07d0d966c6505ca6

See more details on using hashes here.

Provenance

The following attestation bundles were made for mdql-0.5.28-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on mdql-db/mdql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mdql-0.5.28-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for mdql-0.5.28-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 8a5b9e700dc4e70d082f62974bd92d1809142fcd4e9d88698e578bad7b01c663
MD5 091efb1d67e2b2152d7eea9584f109e4
BLAKE2b-256 bf577e15d1682123db5371149c51eabc53617722e91011b34481b4db315dff4a

See more details on using hashes here.

Provenance

The following attestation bundles were made for mdql-0.5.28-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on mdql-db/mdql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mdql-0.5.28-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for mdql-0.5.28-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0f996d6b57ad738b92f11c06470de22af13bceaf8e20b34e61bf8ab8fdce419e
MD5 11d975b61b7ca1785788530b4283f426
BLAKE2b-256 b56669df17c4f6cf80ea3757fdd6b170d14bf8c70dce1d9c6187851db746dc19

See more details on using hashes here.

Provenance

The following attestation bundles were made for mdql-0.5.28-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on mdql-db/mdql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mdql-0.5.28-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for mdql-0.5.28-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 99a1b3a37bf4cfd3cceb1f3652ba8c8662804d86f9e72046345f84630231dd71
MD5 fc7c2de8229d5d87c51c72326fde8f95
BLAKE2b-256 1e82940eabf772cc8989d3f3bf2e36eb51067ceca2c5f1a9c5712ccc1b386e6a

See more details on using hashes here.

Provenance

The following attestation bundles were made for mdql-0.5.28-cp312-cp312-macosx_10_12_x86_64.whl:

Publisher: release.yml on mdql-db/mdql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page