Modern Markdown parser for Python 3.14t — CommonMark compliant, free-threading ready, typed AST
ฅᨐฅ Patitas
The secure, typed Markdown parser for modern Python.
from patitas import Markdown
md = Markdown()
html = md("# Hello **World**")
What is Patitas?
Patitas is a pure-Python Markdown parser that parses to a typed AST and renders to HTML. It's CommonMark 0.31.2 compliant, has zero runtime dependencies, and is built for Python 3.14+.
What it does
| Function | Description |
|---|---|
| `parse(source)` | Parse Markdown to typed AST |
| `parse_frontmatter(content)` | Parse YAML frontmatter to (metadata, body) |
| `parse_notebook(content, source_path?)` | Parse Jupyter .ipynb to (markdown, metadata) |
| `parse_incremental(new, prev, ...)` | Re-parse only the changed region (O(change)) |
| `render(doc)` | Render AST to HTML |
| `render_llm(doc)` | Render AST to LLM-friendly plain text (no HTML) |
| `sanitize(doc, policy)` | Strip HTML, dangerous URLs, zero-width chars |
| `extract_text(node)` | Extract plain text from any AST node |
| `extract_excerpt(ast, source, ...)` | Structurally correct excerpt from AST (list previews, meta) |
| `extract_meta_description(ast, source)` | Meta description from first paragraph/heading |
| `extract_body(content)` | Strip --- delimited frontmatter block (no YAML parse) |
| `Markdown()` | All-in-one parser and renderer |
What's good about it
- ReDoS-proof: O(n) finite state machine lexer, no regex backtracking. Safe for untrusted input in web apps and APIs.
- Typed AST: Frozen dataclasses (`Heading`, `Paragraph`, `Strong`, etc.) with IDE autocomplete and type checking.
- CommonMark: Full 0.31.2 spec compliance (652 examples).
- Incremental parsing: Re-parse only changed blocks; ~200x faster for small edits than full re-parse.
- Free-threading native: Frozen AST, `ContextVar` config, no shared mutable state. 1,000 documents parse in parallel with near-linear thread scaling on 3.14t, with no locks and no special API.
- LLM-safe: `render_llm` + composable `sanitize` policies for RAG, retrieval, safe context.
- Directives: MyST-style blocks (admonition, dropdown, tabs) plus custom directives.
- Plugins: Tables, footnotes, math, strikethrough, task lists.
- Minimal dependencies: PyYAML for frontmatter; core parser is pure Python.
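The incremental parsing idea in the list above can be illustrated with a small standard-library sketch. This is not how Patitas implements `parse_incremental`; it is a simplified model under loud assumptions: blocks are split naively on blank lines, and `split_blocks`, `parse_block`, and `reparse_changed` are hypothetical names invented for this example. It reuses nodes for the unchanged leading and trailing blocks and re-parses only the blocks in between.

```python
def split_blocks(source: str) -> list[str]:
    """Naive block splitter: blank lines separate blocks (an assumption)."""
    return [block for block in source.split("\n\n") if block.strip()]


def parse_block(block: str) -> tuple[str, str]:
    """Hypothetical stand-in for a real block parser."""
    if block.startswith("#"):
        return ("heading", block.lstrip("#").strip())
    return ("paragraph", block)


def reparse_changed(prev_blocks, prev_nodes, new_source):
    """Reuse nodes for unchanged leading/trailing blocks; parse only the rest."""
    new_blocks = split_blocks(new_source)
    limit = min(len(prev_blocks), len(new_blocks))

    # Count identical blocks at the start of both versions.
    start = 0
    while start < limit and prev_blocks[start] == new_blocks[start]:
        start += 1

    # Count identical blocks at the end, without overlapping the prefix.
    end = 0
    while end < limit - start and prev_blocks[-1 - end] == new_blocks[-1 - end]:
        end += 1

    changed = new_blocks[start:len(new_blocks) - end]
    reused_tail = prev_nodes[len(prev_nodes) - end:]
    nodes = prev_nodes[:start] + [parse_block(b) for b in changed] + reused_tail
    return new_blocks, nodes, len(changed)


old_source = "# Title\n\nfirst\n\nsecond"
old_blocks = split_blocks(old_source)
old_nodes = [parse_block(block) for block in old_blocks]

new_blocks, new_nodes, reparsed = reparse_changed(
    old_blocks, old_nodes, "# Title\n\nfirst edited\n\nsecond"
)
print(reparsed, new_nodes)
```

The sketch still compares blocks linearly to find the changed window; a production implementation would more likely start from the edit offsets so that the work depends only on the size of the change.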
Installation
pip install patitas
Requires Python 3.14+
Optional extras:
pip install patitas[syntax] # Syntax highlighting via Rosettes
pip install patitas[all] # All optional features
Quick Start
Parse and render
from patitas import parse, render
doc = parse("# Hello **World**")
html = render(doc)
# <h1 id="hello-world">Hello <strong>World</strong></h1>
Frontmatter
Parse YAML frontmatter from Markdown or other content, returning a (metadata, body) tuple:
from patitas import parse_frontmatter, extract_body
content = """---
title: Hello
weight: 10
---
# Body content
"""
metadata, body = parse_frontmatter(content)
# metadata: {"title": "Hello", "weight": 10.0}
# body: "# Body content"
# When YAML is broken, extract_body strips the --- block without parsing
body_only = extract_body(content)
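To make the "strip without parsing" behavior concrete, here is a minimal standard-library sketch of the idea. It is an illustration only, not the source of `extract_body`: the function name `strip_frontmatter` is hypothetical, and the rule it applies (a leading `---` line closed by the next `---` line) is an assumption about what counts as a frontmatter block.

```python
def strip_frontmatter(content: str) -> str:
    """Drop a leading '---' ... '---' block without reading the YAML inside."""
    lines = content.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return content  # no frontmatter block at the top
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            return "".join(lines[index + 1:])
    return content  # opening delimiter without a closing one: leave untouched


broken = "---\ntitle: [unclosed\n---\n# Body content\n"
print(strip_frontmatter(broken))
```

Because the YAML is never parsed, malformed metadata such as the unclosed bracket above cannot raise an error; the body is recovered by delimiter position alone.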
Notebook support
Parse Jupyter notebooks (.ipynb) to Markdown content and metadata — stdlib JSON only:
from patitas import parse_notebook
with open("demo.ipynb") as f:
    content, metadata = parse_notebook(f.read(), "demo.ipynb")
# content: Markdown string (cells → fenced code, outputs → HTML)
# metadata: title, type, notebook{kernel_name, cell_count}, etc.
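For readers unfamiliar with the `.ipynb` format, the following sketch shows why the standard library's `json` module is enough to turn cells into Markdown. It is a simplified illustration, not Patitas's `parse_notebook`: `notebook_to_markdown` is a hypothetical name, cell outputs are ignored, and only the `cells`, `cell_type`, `source`, and `metadata.kernelspec` fields of the notebook JSON are read.

```python
import json

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block


def notebook_to_markdown(raw: str) -> tuple[str, dict]:
    """Convert notebook JSON to (markdown, metadata); outputs are skipped."""
    notebook = json.loads(raw)
    kernelspec = notebook.get("metadata", {}).get("kernelspec", {})
    language = kernelspec.get("language", "")
    cells = notebook.get("cells", [])

    parts = []
    for cell in cells:
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines or one string
            source = "".join(source)
        if cell.get("cell_type") == "markdown":
            parts.append(source)
        elif cell.get("cell_type") == "code":
            parts.append(f"{FENCE}{language}\n{source}\n{FENCE}")

    metadata = {"kernel_name": kernelspec.get("name"), "cell_count": len(cells)}
    return "\n\n".join(parts), metadata


demo = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Demo\n", "Intro"]},
        {"cell_type": "code", "source": "print(1)", "outputs": []},
    ],
    "metadata": {"kernelspec": {"name": "python3", "language": "python"}},
})
markdown, metadata = notebook_to_markdown(demo)
print(markdown)
print(metadata)
```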
Security
Patitas is immune to ReDoS attacks.
Traditional Markdown parsers use regex patterns vulnerable to catastrophic backtracking:
# Malicious input that can freeze regex-based parsers
evil = "a](" + "\\)" * 10000
# Patitas: completes in milliseconds (O(n) guaranteed)
Patitas uses a hand-written finite state machine lexer:
- Single character lookahead — No backtracking, ever
- Linear time guaranteed — Processing time scales with input length
- Safe for untrusted input — Use in web apps, APIs, user-facing tools
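The properties listed above can be seen in a toy scanner. The sketch below is not Patitas's lexer; it is a deliberately small state machine (the names `State` and `scan_strong` are invented for this example) that recognizes only `**strong**` spans. It reads each character once, looks at most one character ahead, and never rewinds, so its running time grows linearly with input length.

```python
from enum import Enum, auto


class State(Enum):
    TEXT = auto()
    STRONG = auto()


def scan_strong(source: str) -> list[tuple[str, str]]:
    """Split source into ("text", ...) and ("strong", ...) tokens in one pass."""
    tokens: list[tuple[str, str]] = []
    buf: list[str] = []
    state = State.TEXT
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""  # single character of lookahead
        if ch == "*" and nxt == "*":
            if state is State.TEXT:
                if buf:
                    tokens.append(("text", "".join(buf)))
                state = State.STRONG
            else:
                # Closing delimiter; an empty span is kept as literal text.
                tokens.append(("strong", "".join(buf)) if buf else ("text", "****"))
                state = State.TEXT
            buf = []
            i += 2
            continue
        buf.append(ch)
        i += 1
    if state is State.STRONG:
        # Unclosed span: fall back to literal text instead of re-scanning.
        tokens.append(("text", "**" + "".join(buf)))
    elif buf:
        tokens.append(("text", "".join(buf)))
    return tokens


print(scan_strong("Hello **World**"))
print(scan_strong("a **b"))
```

The index `i` only ever moves forward, which is the property that rules out catastrophic backtracking: an adversarial run of delimiters costs the same per character as ordinary text.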
Learn more about Patitas security →
Performance
- 652 CommonMark examples: ~26ms single-threaded
- Incremental parsing: For a 1-char edit in a ~100KB doc, `parse_incremental` is ~200x faster than full re-parse (~160µs vs ~32ms)
- Parallel scaling: Near-linear thread scaling under Python 3.14t free-threading. Run `python benchmarks/benchmark_parallel.py` to see results on your machine. Example on 8-core:

| Threads | Time | Speedup |
|---|---|---|
| 1 | 1.52s | 1.00x |
| 2 | 0.79s | 1.92x |
| 4 | 0.41s | 3.71x |
| 8 | 0.23s | 6.61x |
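The speedup figures follow directly from the timings: speedup is the single-thread time divided by the N-thread time. The short calculation below reproduces them from the 8-core example and adds parallel efficiency (speedup divided by thread count), which is a derived number not stated in the original. It also checks the ~200x incremental figure from the quoted ~160µs and ~32ms.

```python
# Timings (seconds) copied from the 8-core example above.
timings = {1: 1.52, 2: 0.79, 4: 0.41, 8: 0.23}
baseline = timings[1]

rows = []
for threads, seconds in timings.items():
    speedup = baseline / seconds      # how many times faster than 1 thread
    efficiency = speedup / threads    # fraction of ideal linear scaling
    rows.append((threads, round(speedup, 2), round(efficiency, 2)))
    print(f"{threads} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")

# Incremental vs. full re-parse, both expressed in microseconds.
incremental_ratio = 32_000 / 160
print(f"incremental re-parse: {incremental_ratio:.0f}x faster")
```

At 8 threads the efficiency works out to roughly 83%, which is what "near-linear" means for this particular run; results on other machines will differ.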
# From repo (after uv sync --group dev):
python benchmarks/benchmark_vs_mistune.py
python benchmarks/benchmark_parallel.py # Free-threading scaling
pytest benchmarks/benchmark_vs_mistune.py benchmarks/benchmark_incremental.py benchmarks/benchmark_directives.py benchmarks/benchmark_scaling.py benchmarks/benchmark_excerpt.py -v --benchmark-only --benchmark-group-by=group
See benchmarks/README.md for the full suite (pipelines, phase-breakdown, CI threshold checks).
Usage
Typed AST — IDE autocomplete, catch errors at dev time
from patitas import parse
from patitas.nodes import Heading, Paragraph, Strong
doc = parse("# Hello **World**")
heading = doc.children[0]
# Full type safety
assert isinstance(heading, Heading)
assert heading.level == 1
# IDE knows the types!
for child in heading.children:
    if isinstance(child, Strong):
        print(f"Bold text: {child.children}")
All nodes are @dataclass(frozen=True, slots=True) — immutable and memory-efficient.
Directives — MyST-style blocks
:::{note}
This is a note admonition.
:::
:::{warning}
This is a warning.
:::
:::{dropdown} Click to expand
Hidden content here.
:::
:::{tab-set}
:::{tab-item} Python
Python code here.
:::
:::{tab-item} JavaScript
JavaScript code here.
:::
:::
Custom Directives — Extend with your own
from patitas import Markdown, create_registry_with_defaults
from patitas.directives.decorator import directive
# Define a custom directive with the @directive decorator
@directive("alert")
def render_alert(node, children: str, sb) -> None:
    sb.append(f'<div class="alert">{children}</div>')
# Extend defaults with your directive
builder = create_registry_with_defaults() # Has admonition, dropdown, tabs
builder.register(render_alert())
# Use it
md = Markdown(directive_registry=builder.build())
html = md(":::{alert} This is important!\n:::")
Syntax Highlighting
With pip install patitas[syntax]:
from patitas import Markdown
md = Markdown(highlight=True)
html = md("""
```python
def hello():
    print("Highlighted!")
```
""")
Uses [Rosettes](https://github.com/lbliii/rosettes) for O(n) highlighting.
Free-Threading (Python 3.14t)
from concurrent.futures import ThreadPoolExecutor
from patitas import parse
documents = ["# Doc " + str(i) for i in range(1000)]
with ThreadPoolExecutor() as executor:
    # Safe to parse in parallel: no shared mutable state
    results = list(executor.map(parse, documents))
Patitas is designed for Python 3.14t's free-threading mode (PEP 703).
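Thread scaling only appears when the interpreter actually runs without the GIL, so it can help to check the runtime before benchmarking. The helper below is a small sketch using the standard library; `free_threading_status` is a name made up for this example, and `sys._is_gil_enabled()` is only available on Python 3.13 and later, so the code falls back to assuming the GIL is on when it is missing.

```python
import sys
import sysconfig


def free_threading_status() -> dict[str, bool]:
    """Report whether this build supports free-threading and whether the GIL is on."""
    # Py_GIL_DISABLED is 1 on free-threaded builds (e.g. python3.14t).
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Even on a free-threaded build the GIL can be re-enabled at runtime.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(gil_check()) if gil_check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


status = free_threading_status()
print(status)
```

On a standard (non-`t`) build this reports `free_threaded_build: False`, in which case a `ThreadPoolExecutor` will still work but CPU-bound parsing will not run in parallel.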
LLM Safety — Sanitize and render for RAG, retrieval
When sending Markdown to an LLM, sanitize untrusted content and render to plain text:
from patitas import parse, sanitize, render_llm
from patitas.sanitize import llm_safe
doc = parse(user_content)
clean = sanitize(doc, policy=llm_safe) # Strip HTML, dangerous URLs, zero-width chars
safe_text = render_llm(clean, source=user_content)
Pre-built policies: llm_safe, web_safe (alias), strict. Compose with |.
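To give a feel for what stripping "dangerous URLs" and "zero-width chars" involves, here is a standard-library sketch of the two checks. It does not reproduce Patitas's policies: the names `strip_zero_width` and `is_safe_url`, the set of zero-width code points, and the scheme allowlist are all assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

# Assumed set of invisible characters often used to hide text from reviewers.
ZERO_WIDTH = frozenset("\u200b\u200c\u200d\u2060\ufeff")

# Assumed allowlist; an empty scheme means a relative URL.
SAFE_SCHEMES = frozenset({"", "http", "https", "mailto"})


def strip_zero_width(text: str) -> str:
    """Remove zero-width characters that can smuggle hidden instructions."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


def is_safe_url(url: str) -> bool:
    """Allow only URLs whose scheme is on the allowlist."""
    scheme = urlsplit(url.strip()).scheme.lower()
    return scheme in SAFE_SCHEMES


print(strip_zero_width("ig\u200bnore previous"))
print(is_safe_url("https://example.com"), is_safe_url("javascript:alert(1)"))
```

An allowlist is used here instead of a blocklist so that unfamiliar schemes such as `data:` or `vbscript:` are rejected by default.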
Migrate from mistune
Same API — swap the import:
from patitas import Markdown
md = Markdown()
html = md(source)
The Bengal Ecosystem
A structured reactive stack — every layer written in pure Python for 3.14t free-threading.
| ᓚᘏᗢ | Bengal | Static site generator | Docs |
| ∿∿ | Purr | Content runtime | — |
| ⌁⌁ | Chirp | Web framework | Docs |
| =^..^= | Pounce | ASGI server | Docs |
| )彡 | Kida | Template engine | Docs |
| ฅᨐฅ | Patitas | Markdown parser ← You are here | Docs |
| ⌾⌾⌾ | Rosettes | Syntax highlighter | Docs |
Python-native. Free-threading ready. No npm required.
Development
git clone https://github.com/lbliii/patitas.git
cd patitas
uv sync --group dev
pytest
Run benchmarks (after uv sync --group dev):
python benchmarks/benchmark_vs_mistune.py
python benchmarks/benchmark_parallel.py # Free-threading scaling demo
pytest benchmarks/benchmark_*.py -v --benchmark-only --benchmark-group-by=group # Full suite
License
MIT License — see LICENSE for details.
File details
Details for the file patitas-0.3.5.tar.gz.
File metadata
- Download URL: patitas-0.3.5.tar.gz
- Upload date:
- Size: 221.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5944be9ef0fd2573852c785ee80039412cea02fdc1296abf9b5b1d27fe862eb1 |
| MD5 | 7ea91c8eee240076618376f1ff8d69e9 |
| BLAKE2b-256 | a9f25515ccb29656ba4285dca0325fa581229f945815975ed9735eb5a96fa428 |
Provenance
The following attestation bundles were made for patitas-0.3.5.tar.gz:
Publisher: python-publish.yml on lbliii/patitas
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: patitas-0.3.5.tar.gz
- Subject digest: 5944be9ef0fd2573852c785ee80039412cea02fdc1296abf9b5b1d27fe862eb1
- Sigstore transparency entry: 1068325310
- Sigstore integration time:
- Permalink: lbliii/patitas@16b16dc78377b7e22a636fcad887f6eb0c462ea0
- Branch / Tag: refs/tags/v0.3.5
- Owner: https://github.com/lbliii
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@16b16dc78377b7e22a636fcad887f6eb0c462ea0
- Trigger Event: release
File details
Details for the file patitas-0.3.5-py3-none-any.whl.
File metadata
- Download URL: patitas-0.3.5-py3-none-any.whl
- Upload date:
- Size: 214.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | da841c65f036ec159065bf5c302bf416ba95e1511b8243676e8ceb390205271b |
| MD5 | 6141d310b4f9faa25e356897bf51ac06 |
| BLAKE2b-256 | a500a73e99e95487add89d890ee3703c8143ee8e7caeccfab8cb90792d29709f |
Provenance
The following attestation bundles were made for patitas-0.3.5-py3-none-any.whl:
Publisher: python-publish.yml on lbliii/patitas
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: patitas-0.3.5-py3-none-any.whl
- Subject digest: da841c65f036ec159065bf5c302bf416ba95e1511b8243676e8ceb390205271b
- Sigstore transparency entry: 1068325383
- Sigstore integration time:
- Permalink: lbliii/patitas@16b16dc78377b7e22a636fcad887f6eb0c462ea0
- Branch / Tag: refs/tags/v0.3.5
- Owner: https://github.com/lbliii
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@16b16dc78377b7e22a636fcad887f6eb0c462ea0
- Trigger Event: release