Modern Markdown parser for Python 3.14t — CommonMark compliant, free-threading ready, typed AST

ฅᨐฅ Patitas

The secure, typed Markdown parser for modern Python.

from patitas import Markdown

md = Markdown()
html = md("# Hello **World**")

What is Patitas?

Patitas is a pure-Python Markdown parser that parses to a typed AST and renders to HTML. It's CommonMark 0.31.2 compliant, has zero runtime dependencies, and is built for Python 3.14+.


What it does

| Function | Description |
|---|---|
| parse(source) | Parse Markdown to typed AST |
| parse_frontmatter(content) | Parse YAML frontmatter to (metadata, body) |
| parse_notebook(content, source_path?) | Parse Jupyter .ipynb to (markdown, metadata) |
| parse_incremental(new, prev, ...) | Re-parse only the changed region (O(change)) |
| render(doc) | Render AST to HTML |
| render_llm(doc) | Render AST to LLM-friendly plain text (no HTML) |
| sanitize(doc, policy) | Strip HTML, dangerous URLs, zero-width chars |
| extract_text(node) | Extract plain text from any AST node |
| extract_excerpt(ast, source, ...) | Structurally correct excerpt from AST (list previews, meta) |
| extract_meta_description(ast, source) | Meta description from first paragraph/heading |
| extract_body(content) | Strip --- delimited frontmatter block (no YAML parse) |
| Markdown() | All-in-one parser and renderer |

What's good about it

  • ReDoS-proof — O(n) finite state machine lexer, no regex backtracking. Safe for untrusted input in web apps and APIs.
  • Typed AST — Frozen dataclasses (Heading, Paragraph, Strong, etc.) with IDE autocomplete and type checking.
  • CommonMark — Full 0.31.2 spec compliance (652 examples).
  • Incremental parsing — Re-parse only changed blocks; ~200x faster for small edits than full re-parse.
  • Free-threading native — Frozen AST, ContextVar config, no shared mutable state. 1,000 documents parse in parallel with near-linear thread scaling on 3.14t — no locks, no special API.
  • LLM-safe — render_llm + composable sanitize policies for RAG, retrieval, safe context.
  • Directives — MyST-style blocks (admonition, dropdown, tabs) plus custom directives.
  • Plugins — Tables, footnotes, math, strikethrough, task lists.
  • Minimal dependencies — PyYAML for frontmatter; core parser is pure Python.

Installation

pip install patitas

Requires Python 3.14+

Optional extras:

pip install patitas[syntax]      # Syntax highlighting via Rosettes
pip install patitas[all]         # All optional features

Quick Start

Parse and render

from patitas import parse, render

doc = parse("# Hello **World**")
html = render(doc)
# <h1 id="hello-world">Hello <strong>World</strong></h1>

Frontmatter

Parse YAML frontmatter from Markdown or other content, returning a (metadata, body) tuple:

from patitas import parse_frontmatter, extract_body

content = """---
title: Hello
weight: 10
---
# Body content
"""
metadata, body = parse_frontmatter(content)
# metadata: {"title": "Hello", "weight": 10.0}
# body: "# Body content"

# When YAML is broken, extract_body strips the --- block without parsing
body_only = extract_body(content)
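
The idea behind stripping a frontmatter block without parsing it can be sketched with the standard library alone. This is a minimal illustration, not Patitas's implementation: `strip_frontmatter` is a hypothetical helper, and it assumes the block must open on the first line and close on a line containing only `---`.

```python
def strip_frontmatter(content: str) -> str:
    """Return content without a leading '---' delimited block.

    Hypothetical helper for illustration; not part of the patitas API.
    """
    lines = content.splitlines(keepends=True)
    # Frontmatter only counts if it opens on the very first line.
    if not lines or lines[0].strip() != "---":
        return content
    # Look for the closing delimiter; everything after it is the body.
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            return "".join(lines[index + 1:])
    # No closing delimiter found: treat the whole input as body.
    return content


# The YAML here is deliberately malformed; no YAML parser is involved.
sample = "---\ntitle: Hello\nweight: [unclosed\n---\n# Body content\n"
body = strip_frontmatter(sample)
```

Because the delimiters are matched line by line, malformed YAML between them never raises.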

Notebook support

Parse Jupyter notebooks (.ipynb) to Markdown content and metadata — stdlib JSON only:

from patitas import parse_notebook

with open("demo.ipynb") as f:
    content, metadata = parse_notebook(f.read(), "demo.ipynb")

# content: Markdown string (cells → fenced code, outputs → HTML)
# metadata: title, type, notebook{kernel_name, cell_count}, etc.
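
To make the "stdlib JSON only" point concrete, here is a rough sketch of turning nbformat 4 JSON into Markdown. It is an assumption-laden illustration rather than what `parse_notebook` does: `notebook_to_markdown` is a hypothetical name, cell outputs are ignored, and code cells are always fenced as `python`.

```python
import json

FENCE = "`" * 3  # built at runtime so this example can itself be fenced


def notebook_to_markdown(raw: str) -> tuple[str, dict]:
    """Convert nbformat-4 JSON text to (markdown, metadata).

    Hypothetical helper; outputs are skipped and the fence language is assumed.
    """
    notebook = json.loads(raw)
    cells = notebook.get("cells", [])
    parts = []
    for cell in cells:
        source = cell.get("source", "")
        # nbformat allows source to be one string or a list of line strings.
        text = "".join(source) if isinstance(source, list) else source
        if cell.get("cell_type") == "code":
            parts.append(f"{FENCE}python\n{text}\n{FENCE}")
        elif cell.get("cell_type") == "markdown":
            parts.append(text)
    kernel = notebook.get("metadata", {}).get("kernelspec", {}).get("name")
    metadata = {"kernel_name": kernel, "cell_count": len(cells)}
    return "\n\n".join(parts), metadata


demo = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["print(1)"]},
    ],
    "metadata": {"kernelspec": {"name": "python3"}},
})
markdown, meta = notebook_to_markdown(demo)
```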

Security

Patitas is immune to ReDoS attacks.

Traditional Markdown parsers use regex patterns vulnerable to catastrophic backtracking:

# Malicious input that can freeze regex-based parsers
evil = "a](" + "\\)" * 10000

# Patitas: completes in milliseconds (O(n) guaranteed)

Patitas uses a hand-written finite state machine lexer:

  • Single character lookahead — No backtracking, ever
  • Linear time guaranteed — Processing time scales with input length
  • Safe for untrusted input — Use in web apps, APIs, user-facing tools
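
The general technique can be illustrated with a tiny hand-written scanner. This is a sketch of the idea only, not Patitas's lexer: `scan_link_destination` is a hypothetical function that looks for the first unescaped `)` while visiting each character at most once.

```python
def scan_link_destination(text: str, start: int) -> int:
    """Return the index of the first unescaped ')' at or after start, else -1.

    Hypothetical illustration of a forward-only scan; not the patitas lexer.
    """
    i = start
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes.
            # The index only moves forward, so nothing is rescanned.
            i += 2
            continue
        if ch == ")":
            return i
        i += 1
    return -1


# Same shape as the malicious input above: every ')' is escaped.
evil = "a](" + "\\)" * 10000
result = scan_link_destination(evil, 3)
```

Since the index never moves backward, the work is bounded by the length of the input, whatever the input contains.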

Learn more about Patitas security →


Performance

  • 652 CommonMark examples — ~26ms single-threaded

  • Incremental parsing — For a 1-char edit in a ~100KB doc, parse_incremental is ~200x faster than full re-parse (~160µs vs ~32ms)

  • Parallel scaling — Near-linear thread scaling under Python 3.14t free-threading. Run python benchmarks/benchmark_parallel.py to see results on your machine. Example on 8-core:

      Threads    Time      Speedup
      1          1.52s     1.00x
      2          0.79s     1.92x
      4          0.41s     3.71x
      8          0.23s     6.61x
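
The speedup column and the ~200x incremental figure follow directly from the timings quoted above. A short worked check, where parallel efficiency (speedup divided by thread count) is an extra derived number that is not in the original table:

```python
# Timings copied from the 8-core example table (seconds per run).
timings = {1: 1.52, 2: 0.79, 4: 0.41, 8: 0.23}
baseline = timings[1]

rows = []
for threads, seconds in timings.items():
    speedup = baseline / seconds      # how many times faster than 1 thread
    efficiency = speedup / threads    # 1.0 would be perfectly linear scaling
    rows.append((threads, round(speedup, 2), round(efficiency, 2)))

# Incremental vs full re-parse, both expressed in microseconds.
incremental_speedup = 32_000 / 160

for threads, speedup, efficiency in rows:
    print(f"{threads} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
print(f"incremental: {incremental_speedup:.0f}x")
```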
    
# From repo (after uv sync --group dev):
python benchmarks/benchmark_vs_mistune.py
python benchmarks/benchmark_parallel.py   # Free-threading scaling
pytest benchmarks/benchmark_vs_mistune.py benchmarks/benchmark_incremental.py benchmarks/benchmark_directives.py benchmarks/benchmark_scaling.py benchmarks/benchmark_excerpt.py -v --benchmark-only --benchmark-group-by=group

See benchmarks/README.md for the full suite (pipelines, phase-breakdown, CI threshold checks).
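
For intuition about why incremental re-parsing of a small edit can be so much cheaper, here is a deliberately simplified sketch. It is not how `parse_incremental` works: it assumes blocks are separated by blank lines and are independent of each other (not true of real Markdown, for example fenced code containing blank lines), and `split_blocks`, `parse_block` and `reparse` are hypothetical names.

```python
def split_blocks(source: str) -> list[str]:
    """Split on blank lines; a stand-in for real block detection."""
    return [block for block in source.split("\n\n") if block.strip()]


def parse_block(block: str) -> tuple[str, str]:
    """Stand-in for a real block parser: classify heading vs paragraph."""
    kind = "heading" if block.startswith("#") else "paragraph"
    return (kind, block.lstrip("# ").strip())


def reparse(source: str, cache: dict[str, tuple[str, str]]):
    """Parse source, reusing cached nodes for blocks whose text is unchanged."""
    nodes = []
    new_cache = {}
    parsed_count = 0
    for block in split_blocks(source):
        node = cache.get(block)
        if node is None:
            node = parse_block(block)  # only changed blocks reach the parser
            parsed_count += 1
        new_cache[block] = node
        nodes.append(node)
    return nodes, new_cache, parsed_count


old = "# Title\n\nFirst paragraph.\n\nSecond paragraph."
_, cache, first_count = reparse(old, {})

new = old.replace("Second", "2nd")
nodes, cache, second_count = reparse(new, cache)
```

Here the first call parses three blocks and the second parses only the edited one, so the cost tracks the size of the change rather than the size of the document.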


Usage

Typed AST — IDE autocomplete, catch errors at dev time

from patitas import parse
from patitas.nodes import Heading, Paragraph, Strong

doc = parse("# Hello **World**")
heading = doc.children[0]

# Full type safety
assert isinstance(heading, Heading)
assert heading.level == 1

# IDE knows the types!
for child in heading.children:
    if isinstance(child, Strong):
        print(f"Bold text: {child.children}")

All nodes are @dataclass(frozen=True, slots=True) — immutable and memory-efficient.

Directives — MyST-style blocks

:::{note}
This is a note admonition.
:::

:::{warning}
This is a warning.
:::

:::{dropdown} Click to expand
Hidden content here.
:::

:::{tab-set}

:::{tab-item} Python
Python code here.
:::

:::{tab-item} JavaScript
JavaScript code here.
:::

:::

Custom Directives — Extend with your own

from patitas import Markdown, create_registry_with_defaults
from patitas.directives.decorator import directive

# Define a custom directive with the @directive decorator
@directive("alert")
def render_alert(node, children: str, sb) -> None:
    sb.append(f'<div class="alert">{children}</div>')

# Extend defaults with your directive
builder = create_registry_with_defaults()  # Has admonition, dropdown, tabs
builder.register(render_alert())

# Use it
md = Markdown(directive_registry=builder.build())
html = md(":::{alert} This is important!\n:::")

Syntax Highlighting

With pip install patitas[syntax]:

from patitas import Markdown

md = Markdown(highlight=True)

html = md("""
```python
def hello():
    print("Highlighted!")
```
""")

Uses [Rosettes](https://github.com/lbliii/rosettes) for O(n) highlighting.

Free-Threading — Python 3.14t

from concurrent.futures import ThreadPoolExecutor
from patitas import parse

documents = ["# Doc " + str(i) for i in range(1000)]

with ThreadPoolExecutor() as executor:
    # Safe to parse in parallel — no shared mutable state
    results = list(executor.map(parse, documents))

Patitas is designed for Python 3.14t's free-threading mode (PEP 703).

LLM Safety — Sanitize and render for RAG, retrieval

When sending Markdown to an LLM, sanitize untrusted content and render to plain text:

from patitas import parse, sanitize, render_llm
from patitas.sanitize import llm_safe

doc = parse(user_content)
clean = sanitize(doc, policy=llm_safe)  # Strip HTML, dangerous URLs, zero-width chars
safe_text = render_llm(clean, source=user_content)

Pre-built policies: llm_safe, web_safe (alias), strict. Compose with |.
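
One way composition with `|` could be implemented is by overloading `__or__`. The sketch below is an assumption for illustration, not the `patitas.sanitize` implementation: it works on plain strings rather than an AST, and `Policy`, `strip_zero_width` and `strip_controls` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable

ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy wrapper: a named text transformation."""

    apply: Callable[[str], str]

    def __or__(self, other: "Policy") -> "Policy":
        # a | b runs a first, then b, and is itself a Policy.
        return Policy(lambda text: other.apply(self.apply(text)))


def _drop_zero_width(text: str) -> str:
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


def _drop_controls(text: str) -> str:
    # Remove ASCII control characters but keep newlines and tabs.
    return "".join(ch for ch in text if ord(ch) >= 32 or ch in "\n\t")


strip_zero_width = Policy(_drop_zero_width)
strip_controls = Policy(_drop_controls)

combined = strip_zero_width | strip_controls
cleaned = combined.apply("a\u200bb\x07c\n")
```

Because the result of `|` is another `Policy`, any number of policies can be chained the same way.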


Migrate from mistune

Same API — swap the import:

from patitas import Markdown
md = Markdown()
html = md(source)

Full migration guide →


The Bengal Ecosystem

A structured reactive stack — every layer written in pure Python for 3.14t free-threading.

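| | Project | Description | |
|---|---|---|---|
| ᓚᘏᗢ | Bengal | Static site generator | Docs |
| ∿∿ | Purr | Content runtime | |
| ⌁⌁ | Chirp | Web framework | Docs |
| =^..^= | Pounce | ASGI server | Docs |
| )彡 | Kida | Template engine | Docs |
| ฅᨐฅ | Patitas | Markdown parser ← You are here | Docs |
| ⌾⌾⌾ | Rosettes | Syntax highlighter | Docs |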

Python-native. Free-threading ready. No npm required.


Development

git clone https://github.com/lbliii/patitas.git
cd patitas
uv sync --group dev
pytest

Run benchmarks (after uv sync --group dev):

python benchmarks/benchmark_vs_mistune.py
python benchmarks/benchmark_parallel.py   # Free-threading scaling demo
pytest benchmarks/benchmark_*.py -v --benchmark-only --benchmark-group-by=group   # Full suite

License

MIT License — see LICENSE for details.

Download files

Source Distribution

patitas-0.3.5.tar.gz (221.4 kB)

Uploaded Source

Built Distribution

patitas-0.3.5-py3-none-any.whl (214.8 kB)

Uploaded Python 3

File details

Details for the file patitas-0.3.5.tar.gz.

File metadata

  • Download URL: patitas-0.3.5.tar.gz
  • Upload date:
  • Size: 221.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for patitas-0.3.5.tar.gz
Algorithm Hash digest
SHA256 5944be9ef0fd2573852c785ee80039412cea02fdc1296abf9b5b1d27fe862eb1
MD5 7ea91c8eee240076618376f1ff8d69e9
BLAKE2b-256 a9f25515ccb29656ba4285dca0325fa581229f945815975ed9735eb5a96fa428

Provenance

The following attestation bundles were made for patitas-0.3.5.tar.gz:

Publisher: python-publish.yml on lbliii/patitas

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file patitas-0.3.5-py3-none-any.whl.

File metadata

  • Download URL: patitas-0.3.5-py3-none-any.whl
  • Upload date:
  • Size: 214.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for patitas-0.3.5-py3-none-any.whl
Algorithm Hash digest
SHA256 da841c65f036ec159065bf5c302bf416ba95e1511b8243676e8ceb390205271b
MD5 6141d310b4f9faa25e356897bf51ac06
BLAKE2b-256 a500a73e99e95487add89d890ee3703c8143ee8e7caeccfab8cb90792d29709f

Provenance

The following attestation bundles were made for patitas-0.3.5-py3-none-any.whl:

Publisher: python-publish.yml on lbliii/patitas

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
