A bounded-time, Pandoc-leaning Markdown parser with GFM, Extra/kramdown, math, fenced divs, and XHTML output.
xhtmlmd
A Rust Markdown parser and XHTML renderer.
The parser is tree-oriented. It preserves the structure and attributes needed for XHTML output, but it does not try to round-trip source text. The dialect follows CommonMark for core syntax and GFM for GFM features, and makes Pandoc-leaning choices where extension families disagree.
xhtmlmd is largely implemented using AI, except for the tests. The tests are largely adapted from cmark-gfm, PHP Markdown Extra, kramdown, and Mistlefoot. Credit for xhtmlmd really belongs to the authors of these tests, and of the CommonMark docs, which is where the hard work was done.
Implemented syntax
- Core block syntax: paragraphs, ATX/setext headings, thematic breaks, block quotes, ordered/unordered lists, indented code, raw HTML, link reference definitions.
- GFM: pipe tables with alignment, task lists, ~~x~~ strikethrough, angle and bare autolinks, plus opt-in tagfiltering.
- Code: backtick/tilde fenced code blocks, info strings, and Pandoc-style code attributes.
- HTML-in-Markdown: block containers opened with markdown="1"; the control attribute is stripped, indented code blocks are disabled inside the container, and fenced code is the code-block syntax there.
- Math: four modes: brackets for \(...\) and \[...\], dollars for those plus $...$ and $$...$$ using Pandoc's non-space/digit dollar rules, on to preserve \(...\) and \[...\] delimiters for client-side renderers such as KaTeX, and off. Brackets mode is the default.
- Attributes and inline spans: Pandoc/kramdown-style {#id .class key="value"}, block IALs {: ...}, span IALs, ALDs such as {:note: #id .class} with references, superscript ^x^, subscript ~x~, and highlight ==x==.
- Definition lists: PHP Markdown Extra/Pandoc-style Term followed by : definition or ~ definition.
- Footnotes: [^id] references to defined [^id]: definitions with indented continuation blocks.
- Abbreviations: *[HTML]: Hyper Text Markup Language definitions render matching text as <abbr>.
- Fenced divs: Pandoc/Quarto/Djot-style ::: containers with attributes or a single class word.
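The dollars math mode listed above depends on Pandoc's rule for when a dollar sign may open or close inline math: the opening $ must be followed by a non-space character, and the closing $ must be preceded by a non-space character and must not be followed by a digit. The following is a minimal sketch of that rule only. The regular expression and the function name find_inline_dollar_math are assumptions made for illustration, not xhtmlmd's implementation, and the sketch ignores $$...$$ display math.

```python
import re

# Illustrative approximation of Pandoc's $...$ rule (not xhtmlmd's code):
#   - the opening "$" is not escaped and is followed by a non-space character
#   - the closing "$" is preceded by a non-space character
#   - the closing "$" is not followed by a digit (or another "$")
INLINE_DOLLAR = re.compile(
    r"(?<![\\$])\$(?=\S)((?:[^$\\]|\\.)+?)(?<=\S)\$(?![0-9$])"
)


def find_inline_dollar_math(text: str) -> list[str]:
    """Return the TeX bodies of the $...$ spans that satisfy the rule."""
    return [match.group(1) for match in INLINE_DOLLAR.finditer(text)]


print(find_inline_dollar_math("Let $x + y$ be small and $z$ large."))
print(find_inline_dollar_math("It costs $20,000 and $30,000."))
```

The second call shows why the rule exists: ordinary prices are left alone because the would-be closing dollar sign is preceded by a space.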
Usage
Install via pip to get both the Python API and the native xhtmlmd CLI:
pip install xhtmlmd
The CLI reads Markdown from stdin or from an optional file path and writes an XHTML fragment to stdout:
echo '# Hello' | xhtmlmd
xhtmlmd input.md > out.xhtml
xhtmlmd --math=on input.md > out.xhtml
xhtmlmd --math=dollars input.md > out.xhtml
Python API:
from xhtmlmd import to_xhtml
html = to_xhtml(r"\(x^2\)")
html_for_katex = to_xhtml(r"\(x^2\)", math="on")
html_with_dollars = to_xhtml("$x$", math="dollars")
Callbacks
Python callers can override rendered nodes with callbacks. Each callback receives a node dict and the default XHTML for that node. Return None to keep the default, or return replacement XHTML.
Callback names:
- Blocks:
paragraph,heading,block_quote,list,definition_list,code_block,html_block,html_container,thematic_break,table,div,math_block - Inlines:
text,soft_break,hard_break,emph,strong,strike,superscript,subscript,highlight,code,link,image,autolink,abbr,html_inline,math_inline,footnote_ref,span
from fastpylight import highlight
from xhtmlmd import to_xhtml
def highlight_code(node, default_html):
    # Only highlight Python blocks; returning None keeps the default XHTML.
    if node["lang"] != "python":
        return None
    return highlight(node["text"], node["lang"]) + "\n"
html = to_xhtml(markdown, callbacks={"code_block": highlight_code})
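The callback contract described above (receive the node and its default XHTML, return None to keep the default) can be pictured with a small stand-alone sketch. This is a toy renderer written for illustration: render_node, shout, and the two-field node shape are assumptions, and the real node dicts and dispatch logic in xhtmlmd may differ.

```python
from html import escape
from typing import Callable, Optional

# A callback takes (node, default_html) and returns replacement XHTML or None.
Callback = Callable[[dict, str], Optional[str]]


def render_node(node: dict, callbacks: dict[str, Callback]) -> str:
    """Render one toy node, letting a callback override the default XHTML."""
    if node["type"] == "paragraph":
        default_html = f"<p>{escape(node['text'])}</p>\n"
    elif node["type"] == "code_block":
        default_html = f"<pre><code>{escape(node['text'])}</code></pre>\n"
    else:
        raise ValueError(f"unknown node type: {node['type']}")

    callback = callbacks.get(node["type"])
    if callback is None:
        return default_html
    replacement = callback(node, default_html)
    # None means "keep the default"; any other value replaces it.
    return default_html if replacement is None else replacement


def shout(node: dict, default_html: str) -> Optional[str]:
    """Hypothetical callback: only rewrite paragraphs that end with '!'."""
    if not node["text"].endswith("!"):
        return None
    return f'<p class="loud">{escape(node["text"].upper())}</p>\n'


print(render_node({"type": "paragraph", "text": "hi"}, {"paragraph": shout}))
print(render_node({"type": "paragraph", "text": "hi!"}, {"paragraph": shout}))
```

Passing the default XHTML to the callback lets an override wrap or lightly edit the default output instead of rebuilding it from the node.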
Callbacks can also render bracket math as MathML:
from math_core import LatexToMathML
from xhtmlmd import to_xhtml
mathml = LatexToMathML()
def render_math(node, default_html):
    # Display math (math_block) uses display style and ends with a newline.
    is_block = node["type"] == "math_block"
    html = mathml.convert_with_local_counter(node["tex"], displaystyle=is_block)
    return html + ("\n" if is_block else "")
html = to_xhtml(markdown, callbacks={"math_inline": render_math, "math_block": render_math})
Parsing strategy
The parser uses the two-phase strategy described in the CommonMark parsing-strategy appendix: first build the block tree and collect link reference definitions, then parse raw inline text with the completed reference table. It tracks visual columns and byte offsets for each line and builds blocks with an arena-backed open-container stack. The stack has typed nodes for block quotes, lists, paragraphs/setext candidates, fenced and indented code, raw HTML, GFM table candidates, math, footnote definitions, definition lists, fenced divs, and markdown-in-HTML containers. Inlines are scanned into atoms, bracket openers, and delimiter runs; links/images/spans resolve through the bracket stack, while emphasis/strong/strikethrough resolve through the delimiter stack. Inputs that can otherwise explode have explicit bounds: inline nesting, block/container nesting, link label length, and link parenthesis nesting.
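The reason for the two phases is easiest to see with reference links: a link can use a label whose definition appears later in the document, so inline parsing has to wait until every definition has been collected. The sketch below illustrates only that ordering, using regular expressions on single lines. The names two_phase and normalize_label are hypothetical, there is no HTML escaping, and none of it reflects the arena-backed block builder described above.

```python
import re

REF_DEF = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")
REF_LINK = re.compile(r"\[([^\]]+)\]\[([^\]]*)\]")


def normalize_label(label: str) -> str:
    # CommonMark matches labels case-insensitively with collapsed whitespace.
    return " ".join(label.split()).casefold()


def two_phase(source: str) -> list[str]:
    """Toy two-phase parse: collect definitions first, then resolve links."""
    refs: dict[str, str] = {}
    paragraphs: list[str] = []

    # Phase 1 (blocks): record definitions, keep other lines as raw inline text.
    for line in source.splitlines():
        match = REF_DEF.match(line)
        if match:
            # The first definition of a label wins, as in CommonMark.
            refs.setdefault(normalize_label(match.group(1)), match.group(2))
        elif line.strip():
            paragraphs.append(line)

    # Phase 2 (inlines): resolve references against the completed table.
    def resolve(match: re.Match) -> str:
        label = match.group(2) or match.group(1)  # [text][] reuses the text
        url = refs.get(normalize_label(label))
        if url is None:
            return match.group(0)  # unknown label stays literal text
        return f'<a href="{url}">{match.group(1)}</a>'

    return [f"<p>{REF_LINK.sub(resolve, p)}</p>" for p in paragraphs]


doc = "See [the spec][cm].\n\n[CM]: https://spec.commonmark.org/\n"
print(two_phase(doc))
```

A single-pass parser would reach the link before its definition and have to either backtrack or leave it unresolved; splitting the work avoids both.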
The link parser uses raw reference-label scanning, bounded parenthesis nesting, bounded link labels, URI escaping for rendered href/src attributes, and a plain-text fast path for inputs with no possible inline constructs. This keeps adversarial inputs such as deeply nested brackets, long blockquote runs, repeated ![[](), and unclosed comments in predictable time.
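One way to picture a nesting bound is a bracket stack that refuses to grow past a fixed depth and treats any further opener as literal text. The sketch below assumes that approach; match_brackets and the limit of 32 are illustrative choices, since the actual limits and data structures in xhtmlmd are not stated here.

```python
def match_brackets(text: str, max_depth: int = 32) -> list[tuple[int, int]]:
    """Pair "[" and "]" positions while keeping the opener stack bounded.

    Openers beyond max_depth are not pushed, so they behave like plain text.
    This keeps memory at O(max_depth) and the scan at one pass over the input.
    (Simplification: a real parser would also decide which closer belongs to
    a skipped opener; this sketch only demonstrates the bound.)
    """
    stack: list[int] = []
    pairs: list[tuple[int, int]] = []
    for index, char in enumerate(text):
        if char == "[":
            if len(stack) < max_depth:
                stack.append(index)
        elif char == "]" and stack:
            pairs.append((stack.pop(), index))
    return pairs


print(match_brackets("[a [b] c]"))
adversarial = "[" * 10_000 + "]" * 10_000
print(len(match_brackets(adversarial)))
```

However deep the adversarial input nests, the stack never holds more than max_depth entries, which is the property that keeps such inputs in predictable time.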
Raw HTML is preserved by default. Supported raw HTML container tags such as div, section, table, svg, math, and custom elements stay open across blank lines until their matching close tag, with same-tag nesting counted; void and self-closing tags do not open balanced containers. Markdown inside raw HTML remains raw unless the open tag that starts the Markdown block uses markdown="1"; this crate does not recursively look for markdown controls inside otherwise-raw HTML. Options::default().tagfilter is false; enabling it applies GFM-style filtering for tags such as script, style, xmp, and textarea. The filter is a compatibility feature and an extra layer of protection, not a replacement for sanitizing untrusted rendered HTML.
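For readers unfamiliar with GFM's tagfilter extension, it neutralizes a fixed set of tags by replacing their leading < with &lt; and leaves all other HTML untouched. A rough Python sketch of that behaviour follows. The tag list is taken from the GFM specification; the regular expression and the function name tagfilter are assumptions for illustration, not the crate's code.

```python
import re

# Tags named by the GFM "Disallowed Raw HTML" extension.
GFM_FILTERED_TAGS = (
    "title", "textarea", "style", "xmp", "iframe",
    "noembed", "noframes", "script", "plaintext",
)

# Match "<" only when it starts an opening or closing tag from the list.
# The tag name must end at whitespace, "/", ">" or the end of the input,
# so a longer name such as <scripted> is left alone.
_FILTERED = re.compile(
    r"<(?=/?(?:%s)(?:[\s/>]|$))" % "|".join(GFM_FILTERED_TAGS),
    re.IGNORECASE,
)


def tagfilter(html: str) -> str:
    """Escape the leading "<" of filtered tags; leave other HTML untouched."""
    return _FILTERED.sub("&lt;", html)


print(tagfilter("<script>alert(1)</script>"))
print(tagfilter("<strong>ok</strong>"))
```

The filter only rewrites tag openers, so attributes such as event handlers on other elements pass through unchanged, which is why it does not replace a real sanitizer.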
Tests
maturin develop && pytest -q
The spec-conformance suite is tests/test_conformance.py: it renders the fixtures under tests/source/ and compares normalized HTML trees. Run just that file with pytest tests/test_conformance.py -v to see per-example ids.
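Comparing normalized trees rather than raw strings means that differences in attribute order or insignificant whitespace do not cause failures. The following sketch shows one way such a comparison could work, using the standard library's XML parser on well-formed XHTML fragments. The helpers normalize_tree and same_tree are hypothetical and are not taken from the project's test suite. The sketch also collapses whitespace everywhere, which would be wrong inside pre elements, and it cannot parse named entities such as &nbsp;.

```python
import xml.etree.ElementTree as ET


def normalize_tree(fragment: str) -> tuple:
    """Parse an XHTML fragment and return a comparable nested tuple."""
    # Wrap the fragment so that several top-level elements parse as one tree.
    root = ET.fromstring(f"<root>{fragment}</root>")

    def walk(element: ET.Element) -> tuple:
        text = " ".join((element.text or "").split())
        tail = " ".join((element.tail or "").split())
        attributes = tuple(sorted(element.attrib.items()))
        children = tuple(walk(child) for child in element)
        return (element.tag, attributes, text, children, tail)

    return walk(root)


def same_tree(expected: str, actual: str) -> bool:
    """True when two fragments differ only in attribute order or whitespace."""
    return normalize_tree(expected) == normalize_tree(actual)


print(same_tree('<p class="a" id="b">hi</p>\n', '<p id="b"  class="a">hi</p>'))
print(same_tree("<p>hi</p>", "<p>bye</p>"))
```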