Parser, generator, merger, locator, and renderer for MD-LD (Markdown-Linked Data), a human-friendly RDF authoring format

Project description

mdld-parse (Python)

Markdown-Linked Data (MD-LD) — a deterministic, streaming-friendly RDF authoring format that extends CommonMark with explicit {...} annotations.

A Python implementation of the MD-LD format. The format itself was designed by @davay42, whose JavaScript davay42/mdld-parse is the companion reference for the syntax.


Quick start

python -m venv .venv
source .venv/bin/activate
pip install -e .

from mdld_parse import parse

text = """
[ex] <http://example.org/>

# Document {=ex:doc .ex:Article label}

[Alice] {?ex:author =ex:alice .prov:Person ex:firstName label}
[Smith] {ex:lastName}
"""

result = parse({'text': text})

for q in result['quads']:
    print(q.subject.value, q.predicate.value, q.object.value)

The quads list contains RDF/JS-shaped Quad objects (Python dataclasses exposing subject, predicate, object, graph); result['origin'] carries the lean source map; result['primary_subject'] names the document's central entity.


What MD-LD is

MD-LD is RDF you can read, write, diff, and share without leaving Markdown. You add semantics by attaching {...} annotations to value carriers (links, emphasis, headings, list items, code blocks, blockquotes). Strip the annotations and you are left with normal Markdown.

Three things may be in scope at any annotation:

| Symbol | Meaning |
|--------|---------|
| S | current subject (an IRI) |
| O | object IRI (from a link, image, or {+iri}) |
| L | literal text from the attached value carrier |

No subject, no triple. No {...}, no semantics. Nothing is inferred.

Core features

  • Prefix folding: build hierarchical namespaces by composing prefixes
  • Subject declarations: {=IRI} and {=#fragment} set context
  • Soft objects: {+IRI}, {+#fragment} declare temporary objects
  • Three predicate forms: p (S→L), ?p (S→O), !p (O→S)
  • Type declarations: .Class emits rdf:type
  • Datatypes & languages: ^^xsd:date, @en
  • Polarity: +/- prefixes for diff authoring and retraction
  • Origin tracking: every quad indexed back to its source span
  • Elevated statements: automatic rdf:Statement pattern detection
  • Round-trip safe: parse → generate preserves data and primary subject
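
The three predicate forms listed above can be read as a small routing table over S, O, and L. The sketch below is illustrative only: emit_triple and its parameter names are hypothetical and not part of mdld_parse, and it models just the direction of each form plus the "no subject, no triple" rule, using plain strings in place of real RDF terms.

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]


def emit_triple(form: str, predicate: str,
                subject: Optional[str] = None,
                obj: Optional[str] = None,
                literal: Optional[str] = None) -> Optional[Triple]:
    """Route one predicate annotation to a triple, or None if a slot is missing.

    form is '' for p (S -> L), '?' for ?p (S -> O), '!' for !p (O -> S).
    This is a hypothetical helper, not the package's API.
    """
    if subject is None:
        return None  # no subject, no triple
    if form == '':
        return (subject, predicate, literal) if literal is not None else None
    if form == '?':
        return (subject, predicate, obj) if obj is not None else None
    if form == '!':
        # Reverse direction: the object IRI becomes the triple's subject.
        return (obj, predicate, subject) if obj is not None else None
    raise ValueError(f"unknown predicate form: {form!r}")


print(emit_triple('', 'ex:lastName', subject='ex:alice', literal='Smith'))
print(emit_triple('?', 'ex:author', subject='ex:doc', obj='ex:alice'))
print(emit_triple('!', 'ex:author', subject='ex:doc', obj='ex:alice'))
print(emit_triple('', 'ex:lastName', literal='Smith'))  # None: no subject in scope
```

The real parser also handles datatypes, language tags, and polarity, which this sketch leaves out.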


A taste of the format

[my] <tag:alice@example.com,2026:>

# 2026-02-27 {=my:journal-2026-02-27 .my:Event my:date ^^xsd:date}

## A nice day in the park {label}

Mood: [Happy] {my:mood}, energy [8] {my:energyLevel ^^xsd:integer}.
Met [Sam] {+my:sam .my:Person ?my:attendee} at
[Central Park] {+my:central-park ?my:location .my:Place label @en}.
The weather was [Sunny] {my:weather}.

After parsing, the document is a queryable graph keyed by tag:alice@example.com,2026:journal-2026-02-27 with the relevant relationships to my:sam, my:central-park, and the typed/tagged literals.
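
To see how the my: prefix in the example resolves to the full subject IRI mentioned above, here is a minimal sketch of prefix expansion. The expand_curie helper is hypothetical and not part of mdld_parse; the real parser's rules (prefix folding, any built-in prefixes such as xsd) may differ.

```python
def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand 'prefix:local' using a prefix map.

    Returns the input unchanged when the prefix is not declared.
    Hypothetical helper for illustration only.
    """
    prefix, sep, local = curie.partition(':')
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return curie


# Prefix declaration taken from the example document: [my] <tag:alice@example.com,2026:>
prefixes = {'my': 'tag:alice@example.com,2026:'}

print(expand_curie('my:journal-2026-02-27', prefixes))
# tag:alice@example.com,2026:journal-2026-02-27
print(expand_curie('my:sam', prefixes))
# tag:alice@example.com,2026:sam
```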


Public API

All functions live at the package root and accept dict-style inputs that compose by spreading:

from mdld_parse import parse, generate, generate_node, merge, locate, render

| Function | Purpose | Returns |
|----------|---------|---------|
| parse(text \| {'text': ..., 'context': ...}) | text → quads + origin | {'quads', 'remove', 'statements', 'origin', 'context', 'primary_subject'} |
| generate(quads \| {'quads': ..., 'context': ..., 'primary_subject': ...}) | quads → deterministic MD-LD | {'text', 'context'} |
| generate_node(..., focus_iri=...) | quads → MD-LD centered on one IRI | {'text', 'context'} |
| merge([doc1, doc2, ...], options) | merge documents with diff polarity | {'quads', 'remove', 'statements', 'origin', 'context', 'primarySubjects'} |
| locate(quad, origin) | quad → source-map entry | {'blockId', 'range', 'carrierType', ...} or None |
| render(text, options) | MD-LD → HTML + RDFa | {'html', 'context', 'metadata'} |

The dict-spread pattern composes naturally:

parsed = parse({'text': text, 'context': {'ex': 'http://example.org/'}})
canonical = generate(parsed)                       # parse → generate
node_view = generate_node(parsed, focus_iri='http://example.org/alice')

parse() also accepts the legacy positional form parse(text, options); both calls return identical results.
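
Composing by spreading means a result dict can be unpacked into the next call's input with individual keys overridden. The sketch below shows only that dict mechanics with a hand-written stand-in for a parse() result: the key names come from the function list above, the values are made up, and no mdld_parse function is actually called.

```python
# Hand-written stand-in for a parse() result. Only the key names are taken
# from the documentation; the values are invented for illustration.
parsed = {
    'quads': [],
    'remove': [],
    'statements': [],
    'origin': {},
    'context': {'ex': 'http://example.org/'},
    'primary_subject': 'http://example.org/doc',
}

# Spread the result and override one key to build the next call's input,
# here adding a second prefix to the context.
generate_input = {
    **parsed,
    'context': {**parsed['context'], 'schema': 'http://schema.org/'},
}

print(sorted(generate_input['context']))  # ['ex', 'schema']
print(generate_input['primary_subject'])  # http://example.org/doc
print(parsed['context'])                  # {'ex': 'http://example.org/'}
```

Because the spread builds new dicts, the original parsed result is left untouched and can be reused for other calls.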

For the full reference, see docs/API.md.


Compatibility with RDF tooling

The Python Quad / NamedNode / Literal / BlankNode / Variable / DefaultGraph types in mdld_parse.utils mirror the RDF/JS data model: each term has a term_type and value, literals carry language and datatype, and Quad exposes subject, predicate, object, graph. They serialize cleanly into common Python RDF stacks (e.g. rdflib) — see docs/API.md for an rdflib bridge example.
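
As a rough picture of what RDF/JS-shaped terms look like in Python, the sketch below defines stand-in dataclasses with the attributes named above and serializes one quad to an N-Triples line. These classes are simplified illustrations, not the definitions in mdld_parse.utils, and to_ntriples is a hypothetical helper; the actual field defaults and the rdflib bridge are described in docs/API.md.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class NamedNode:
    value: str
    term_type: str = 'NamedNode'


@dataclass(frozen=True)
class Literal:
    value: str
    language: str = ''
    datatype: Optional[NamedNode] = None
    term_type: str = 'Literal'


@dataclass(frozen=True)
class Quad:
    subject: NamedNode
    predicate: NamedNode
    object: Union[NamedNode, Literal]
    graph: Optional[NamedNode] = None  # None stands in for the default graph


def to_ntriples(q: Quad) -> str:
    """Serialize the triple part of a quad as one N-Triples line."""
    def term(t) -> str:
        if t.term_type == 'NamedNode':
            return f'<{t.value}>'
        # Escape backslashes, quotes, and newlines inside the literal.
        escaped = (t.value.replace('\\', '\\\\')
                          .replace('"', '\\"')
                          .replace('\n', '\\n'))
        if t.language:
            return f'"{escaped}"@{t.language}'
        if t.datatype is not None:
            return f'"{escaped}"^^<{t.datatype.value}>'
        return f'"{escaped}"'
    return f'{term(q.subject)} {term(q.predicate)} {term(q.object)} .'


q = Quad(
    NamedNode('http://example.org/alice'),
    NamedNode('http://example.org/lastName'),
    Literal('Smith', language='en'),
)
print(to_ntriples(q))
# <http://example.org/alice> <http://example.org/lastName> "Smith"@en .
```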


Project layout

mdld_parse/        Python package (parse, generate, merge, locate, render)
tests/             pytest suite covering the MD-LD format
docs/              guides, syntax, architecture, parser internals
examples/          MD-LD example documents (language-agnostic)
spec/              formal specification + ABNF / EBNF grammars

Tests run with pytest from the project root.


Credits

The MD-LD format was designed by @davay42; the JavaScript reference implementation is the companion source for the syntax. This package is an independent Python implementation of the format. See CHANGELOG.md for the version history.
