Parser, generator, merger, locator, and renderer for MD-LD (Markdown-Linked Data), a human-friendly RDF authoring format
mdld-parse (Python)
Markdown-Linked Data (MD-LD) — a deterministic, streaming-friendly RDF
authoring format that extends CommonMark with explicit {...} annotations.
A Python implementation of the MD-LD format. The format itself was
designed by @davay42, whose JavaScript implementation, davay42/mdld-parse,
is the companion reference for the syntax.
Quick start
python -m venv .venv
source .venv/bin/activate
pip install -e .
from mdld_parse import parse
text = """
[ex] <http://example.org/>
# Document {=ex:doc .ex:Article label}
[Alice] {?ex:author =ex:alice .prov:Person ex:firstName label}
[Smith] {ex:lastName}
"""
result = parse({'text': text})
for q in result['quads']:
    print(q.subject.value, q.predicate.value, q.object.value)
The quads list contains RDF/JS-shaped Quad objects (Python dataclasses
exposing subject, predicate, object, graph); result['origin'] carries
the lean source map; result['primary_subject'] names the document's central
entity.
What MD-LD is
MD-LD is RDF you can read, write, diff, and share without leaving Markdown.
You add semantics by attaching {...} annotations to value carriers (links,
emphasis, headings, list items, code blocks, blockquotes). Strip the
annotations and you are left with normal Markdown.
Three things may be in scope at any annotation:
| Symbol | Meaning |
|---|---|
| S | current subject (an IRI) |
| O | object IRI (from a link, image, or {+iri}) |
| L | literal text from the attached value carrier |
No subject, no triple. No {...}, no semantics. Nothing is inferred.
Core features
- Prefix folding: build hierarchical namespaces by composing prefixes
- Subject declarations: {=IRI} and {=#fragment} set context
- Soft objects: {+IRI}, {+#fragment} declare temporary objects
- Three predicate forms: p (S→L), ?p (S→O), !p (O→S)
- Type declarations: .Class emits rdf:type
- Datatypes & languages: ^^xsd:date, @en
- Polarity: +/- prefixes for diff authoring and retraction
- Origin tracking: every quad indexed back to its source span
- Elevated statements: automatic rdf:Statement pattern detection
- Round-trip safe: parse ↔ generate preserves data and primary subject
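The three predicate forms can be read as a small routing rule over the S, O, and L slots. The sketch below illustrates that rule only and is not the package's parser: `emit_triple` and its parameters are hypothetical names, and it assumes that a form whose required slots are empty yields nothing, in line with "No subject, no triple".

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]


def emit_triple(
    token: str,
    subject: Optional[str],
    obj: Optional[str],
    literal: Optional[str],
) -> Optional[Triple]:
    """Route one predicate token to a (subject, predicate, object) triple.

    Hypothetical helper: `p` links S to the literal L, `?p` links S to O,
    and `!p` links O back to S. Returns None when a needed slot is empty.
    """
    if token.startswith("?"):
        s, p, o = subject, token[1:], obj
    elif token.startswith("!"):
        s, p, o = obj, token[1:], subject
    else:
        s, p, o = subject, token, literal
    if s is None or o is None:
        return None  # a missing slot means no triple is emitted
    return (s, p, o)


print(emit_triple("ex:lastName", "ex:alice", None, "Smith"))
# ('ex:alice', 'ex:lastName', 'Smith')
print(emit_triple("?ex:author", "ex:doc", "ex:alice", None))
# ('ex:doc', 'ex:author', 'ex:alice')
print(emit_triple("!ex:author", "ex:doc", "ex:alice", None))
# ('ex:alice', 'ex:author', 'ex:doc')
```

The real format layers subject declarations, types, datatypes, and polarity on top of this; see docs/Syntax.md for the actual rules.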
Documentation
- docs/index.md: documentation hub
- docs/Guide.md: one-page mental model and patterns
- docs/Syntax.md: full syntax reference
- docs/API.md: Python API reference
- docs/Architecture.md: design and pipeline
- docs/Parser.md: parser internals
- docs/Subject.md: subject declaration walkthrough
- docs/origin.md: lean origin / locate() system
- docs/polarity.md: diff authoring and retraction
- docs/primary-subject.md: primary-subject rules
- docs/statements.md: elevated rdf:Statement extraction
- docs/Use-Cases.md: domain-specific patterns
- examples/index.md: runnable example documents
- spec/index.md: formal specification and grammars
- CHANGELOG.md: version history
A taste of the format
[my] <tag:alice@example.com,2026:>
# 2026-02-27 {=my:journal-2026-02-27 .my:Event my:date ^^xsd:date}
## A nice day in the park {label}
Mood: [Happy] {my:mood}, energy [8] {my:energyLevel ^^xsd:integer}.
Met [Sam] {+my:sam .my:Person ?my:attendee} at
[Central Park] {+my:central-park ?my:location .my:Place label @en}.
The weather was [Sunny] {my:weather}.
After parsing, the document is a queryable graph keyed by
tag:alice@example.com,2026:journal-2026-02-27 with the relevant relationships
to my:sam, my:central-park, and the typed/tagged literals.
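The subject IRI comes from expanding the `my:` prefix declared on the first line of the example. Here is a minimal sketch of that expansion, assuming plain prefix-plus-local-name concatenation. `expand_curie` is a hypothetical helper, not part of the package's API, and it ignores prefix folding.

```python
def expand_curie(curie: str, context: dict) -> str:
    """Expand 'prefix:local' using a prefix map; return the input unchanged otherwise."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return curie


# Prefix map matching the declaration in the example document.
context = {"my": "tag:alice@example.com,2026:"}

print(expand_curie("my:journal-2026-02-27", context))
# tag:alice@example.com,2026:journal-2026-02-27
print(expand_curie("http://example.org/x", context))
# http://example.org/x  ("http" is not a declared prefix, so nothing changes)
```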
Public API
All functions live at the package root and accept dict-style inputs that compose by spreading:
from mdld_parse import parse, generate, generate_node, merge, locate, render
| Function | Purpose | Returns |
|---|---|---|
| `parse(text \| {'text': ..., 'context': ...})` | text → quads + origin | `{'quads', 'remove', 'statements', 'origin', 'context', 'primary_subject'}` |
| `generate(quads \| {'quads': ..., 'context': ..., 'primary_subject': ...})` | quads → deterministic MD-LD | `{'text', 'context'}` |
| `generate_node(..., focus_iri=...)` | quads → MD-LD centered on one IRI | `{'text', 'context'}` |
| `merge([doc1, doc2, ...], options)` | merge documents with diff polarity | `{'quads', 'remove', 'statements', 'origin', 'context', 'primarySubjects'}` |
| `locate(quad, origin)` | quad → source-map entry | `{'blockId', 'range', 'carrierType', ...}` or `None` |
| `render(text, options)` | MD-LD → HTML + RDFa | `{'html', 'context', 'metadata'}` |
The dict-spread pattern composes naturally:
parsed = parse({'text': text, 'context': {'ex': 'http://example.org/'}})
canonical = generate(parsed) # parse → generate
node_view = generate_node(parsed, focus_iri='http://example.org/alice')
parse() also accepts the legacy positional form parse(text, options); both
calls return identical results.
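In Python, "spreading" is dict unpacking with `**`. The sketch below uses a hand-written dict that only mimics the shape of a parse result (the values are placeholders, not real parser output). It shows how a result can be copied with one key overridden before being handed to the next function.

```python
# Stand-in for a parse() result: the key names follow the Public API table,
# the values are placeholders written by hand for this example.
parsed = {
    "quads": [],
    "context": {"ex": "http://example.org/"},
    "primary_subject": "http://example.org/doc",
}

# Spread the result and override one key; the original dict is not mutated.
refocused = {**parsed, "primary_subject": "http://example.org/alice"}

print(refocused["primary_subject"])  # http://example.org/alice
print(parsed["primary_subject"])     # http://example.org/doc
```

The copy is shallow, so nested values such as `context` are shared between the two dicts.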
For the full reference, see docs/API.md.
Compatibility with RDF tooling
The Python Quad / NamedNode / Literal / BlankNode / Variable /
DefaultGraph types in mdld_parse.utils mirror the RDF/JS data model:
each term has a term_type and value, literals carry language and
datatype, and Quad exposes subject, predicate, object, graph. They
serialize cleanly into common Python RDF stacks (e.g. rdflib) — see
docs/API.md for an rdflib bridge example.
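To show why this shape is easy to bridge, here is a sketch that serializes RDF/JS-style terms to N-Triples lines using only the standard library. The `Term` and `Quad` dataclasses are simplified stand-ins written for this example, not the classes from mdld_parse.utils. Only the attribute names `term_type`, `value`, `language`, and `datatype` come from the description of those types. The `term_type` strings follow RDF/JS naming as an assumption, `datatype` is held as a plain IRI string for brevity, and the graph slot is omitted because N-Triples has none.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Term:
    term_type: str                  # "NamedNode", "BlankNode", or "Literal"
    value: str
    language: str = ""              # only meaningful for literals
    datatype: Optional[str] = None  # datatype IRI as a string (simplification)


@dataclass(frozen=True)
class Quad:
    subject: Term
    predicate: Term
    object: Term


def term_to_nt(term: Term) -> str:
    """Render one term in N-Triples syntax."""
    if term.term_type == "NamedNode":
        return f"<{term.value}>"
    if term.term_type == "BlankNode":
        return f"_:{term.value}"
    # Literal: escape backslashes first, then quotes and line breaks.
    escaped = (
        term.value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    if term.language:
        return f'"{escaped}"@{term.language}'
    if term.datatype and term.datatype != XSD_STRING:
        return f'"{escaped}"^^<{term.datatype}>'
    return f'"{escaped}"'


def quad_to_nt(quad: Quad) -> str:
    """Render subject, predicate, and object as one N-Triples line."""
    return f"{term_to_nt(quad.subject)} {term_to_nt(quad.predicate)} {term_to_nt(quad.object)} ."


example = Quad(
    Term("NamedNode", "http://example.org/alice"),
    Term("NamedNode", "http://example.org/firstName"),
    Term("Literal", "Alice", language="en"),
)
print(quad_to_nt(example))
# <http://example.org/alice> <http://example.org/firstName> "Alice"@en .
```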
Project layout
mdld_parse/ Python package (parse, generate, merge, locate, render)
tests/ pytest suite covering the MD-LD format
docs/ guides, syntax, architecture, parser internals
examples/ MD-LD example documents (language-agnostic)
spec/ formal specification + ABNF / EBNF grammars
Tests run with pytest from the project root.
Credits
The MD-LD format was designed by @davay42; the JavaScript reference implementation is the companion source for the syntax. This package is an independent Python implementation of the format. See CHANGELOG.md for the version history.