A Python package for converting Markdown to AST and back to Markdown

marktripy

TL;DR: A Python package for parsing Markdown to AST, manipulating the tree structure, and serializing back to Markdown while preserving formatting. Built on markdown-it-py and mistletoe for maximum flexibility.

from marktripy import parse_markdown, render_markdown

# Parse Markdown to AST
ast = parse_markdown("# Hello\n\nThis is **bold** text.")

# Manipulate AST (e.g., downgrade headings)
for node in ast.walk():
    if node.type == "heading":
        node.level += 1

# Render back to Markdown
markdown = render_markdown(ast)
# Output: "## Hello\n\nThis is **bold** text."

Installation

# Using pip
pip install marktripy

# Using uv (recommended)
uv add marktripy

# Development installation
git clone https://github.com/yourusername/marktripy
cd marktripy
uv sync --dev

Quick Usage

Basic Markdown to HTML

from marktripy import markdown_to_html

html = markdown_to_html("# Hello World\n\nThis is **bold** and *italic*.")
# <h1>Hello World</h1><p>This is <strong>bold</strong> and <em>italic</em>.</p>

AST Manipulation

from marktripy import parse_markdown, render_markdown

# Parse Markdown to AST
ast = parse_markdown("""
# Main Title
## Section 1
Some content here.
## Section 2
More content.
""")

# Add IDs to all headings
for node in ast.walk():
    if node.type == "heading":
        # Generate ID from heading text
        text = node.get_text().lower().replace(" ", "-")
        node.attrs["id"] = text
        
# Downgrade all headings by one level
for node in ast.walk():
    if node.type == "heading" and node.level < 6:
        node.level += 1

# Render back to Markdown
result = render_markdown(ast)
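
The ID step above turns heading text into an anchor with a plain lowercase-and-replace. As a rough sketch of how that could be hardened, the standalone helpers below strip punctuation and keep repeated headings unique. `slugify` and `unique_slugs` are hypothetical names for illustration and are not part of marktripy.

```python
import re
from collections import Counter


def slugify(text: str) -> str:
    """Lowercase, drop punctuation, and join words with hyphens."""
    text = text.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)   # remove punctuation
    return re.sub(r"[\s_]+", "-", text)    # whitespace/underscores -> "-"


def unique_slugs(titles):
    """Append -1, -2, ... to repeated slugs so every ID stays unique."""
    seen = Counter()
    slugs = []
    for title in titles:
        base = slugify(title)
        slugs.append(base if seen[base] == 0 else f"{base}-{seen[base]}")
        seen[base] += 1
    return slugs


print(unique_slugs(["Main Title", "What's New?", "Main Title"]))
# ['main-title', 'whats-new', 'main-title-1']
```

The suffix scheme is an assumption chosen for brevity; a real implementation would also need to guard against a heading whose own text already ends in `-1`.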

Custom Syntax Extensions

from marktripy import create_extension, Parser

# Create a custom extension for ++text++ → <kbd>text</kbd>
kbd_extension = create_extension(
    pattern=r'\+\+(.+?)\+\+',  # non-greedy, so the "+" inside "Ctrl+C" still matches
    node_type='kbd',
    html_tag='kbd'
)

# Use parser with extension
parser = Parser(extensions=[kbd_extension])
ast = parser.parse("Press ++Ctrl+C++ to copy")
html = parser.render_html(ast)
# Output: Press <kbd>Ctrl+C</kbd> to copy
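
To see the `++text++` substitution in isolation, here is a standard-library sketch that does not use marktripy at all. It assumes a non-greedy pattern so that a single `+` inside the key combination is kept; `render_kbd` is a hypothetical helper, not a library function.

```python
import html
import re

# Non-greedy match, so a single "+" inside the key combination is kept.
KBD_PATTERN = re.compile(r"\+\+(.+?)\+\+")


def render_kbd(text: str) -> str:
    """Replace ++keys++ with <kbd>keys</kbd>, escaping the captured text."""
    return KBD_PATTERN.sub(
        lambda m: f"<kbd>{html.escape(m.group(1))}</kbd>", text
    )


print(render_kbd("Press ++Ctrl+C++ to copy"))
# Press <kbd>Ctrl+C</kbd> to copy
```

A character class such as `[^+]+` would stop at the first `+` and fail on `Ctrl+C`, which is why the sketch uses `.+?` instead.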

CLI Usage

# Convert Markdown to HTML
marktripy convert input.md -o output.html

# Parse and manipulate Markdown
marktripy transform input.md --downgrade-headings --add-ids -o output.md

# Validate Markdown structure
marktripy validate document.md --check-links --check-headings

The Backstory

Why Another Markdown Parser?

The Python ecosystem has numerous Markdown parsers, each with different strengths:

markdown: The original, extensible but with a complex API
markdown2: Faster alternative but less extensible
mistune: Fast and supports AST, but limited round-trip capability
marko: Good AST support, but newer and with a smaller ecosystem
markdown-it-py: Port of markdown-it with excellent plugin system

After extensive research (see /ref directory), I found that no single library perfectly addressed the need for:

Clean AST manipulation - Easy traversal and modification of document structure
Round-trip conversion - Parse Markdown → AST → Markdown without losing formatting
Extensibility - Simple API for adding custom syntax
Performance - Fast enough for real-world documents
Standards compliance - CommonMark compliant with GFM extensions

The Research Journey

The /ref directory contains comprehensive research comparing 8+ Python Markdown libraries across multiple dimensions:

ref1.md: Practical guide to advanced Markdown processing in Python
ref2.md: Detailed comparison of parser architectures and extension mechanisms
ref3.md: Performance benchmarks and feature matrix

Key findings:

markdown-it-py offers the best plugin architecture
mistletoe has the cleanest AST representation
marko provides good round-trip capabilities
Performance varies by 10-100x between libraries

Design Philosophy

marktripy combines the best ideas from existing libraries:

Dual-parser architecture: Use markdown-it-py for extensibility and mistletoe for AST manipulation
Unified AST format: Convert between parser representations transparently
Preserving formatting: Track source positions and whitespace for faithful round-trips
Plugin-first design: Everything beyond core CommonMark is a plugin
Type safety: Full type hints with mypy --strict compatibility
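
The "preserving formatting" idea can be illustrated with a toy model: if every block keeps the exact source slice it came from, unmodified blocks can be written back verbatim. The sketch below is a simplified line-based stand-in, not marktripy's parser; `Block`, `parse_blocks`, and `render_blocks` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical block node that remembers its exact source text."""
    type: str
    raw: str          # original source slice, including its line ending
    start_line: int   # 0-based line where the block begins


def parse_blocks(source: str) -> List[Block]:
    """Classify each line; a toy stand-in for a real Markdown parser."""
    blocks = []
    for i, line in enumerate(source.splitlines(keepends=True)):
        if line.startswith("#"):
            kind = "heading"
        elif line.strip() == "":
            kind = "blank"
        else:
            kind = "paragraph"
        blocks.append(Block(kind, line, i))
    return blocks


def render_blocks(blocks: List[Block]) -> str:
    # Blocks are emitted from their stored source, so untouched ones
    # come back byte-for-byte.
    return "".join(b.raw for b in blocks)


source = "#  Spaced   heading \n\n\nText with  two spaces.\n"
blocks = parse_blocks(source)
assert render_blocks(blocks) == source   # lossless round-trip

# Edit only the headings; everything else keeps its original whitespace.
for block in blocks:
    if block.type == "heading":
        block.raw = "#" + block.raw
edited = render_blocks(blocks)
print(repr(edited))
# '##  Spaced   heading \n\n\nText with  two spaces.\n'
```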

Technical Architecture

Core Components

marktripy/
├── ast.py          # Unified AST node definitions
├── parser.py       # Parser abstraction layer
├── renderer.py     # Markdown/HTML renderers
├── extensions/     # Built-in extensions
│   ├── gfm.py     # GitHub Flavored Markdown
│   ├── toc.py     # Table of contents generator
│   └── ...
├── transformers/   # AST transformation utilities
│   ├── headings.py # Heading manipulation
│   ├── links.py    # Link processing
│   └── ...
└── cli.py         # Command-line interface

AST Structure

The AST uses a unified node structure compatible with both parsers:

from typing import Any, Dict, List

class ASTNode:
    type: str                  # Node type (heading, paragraph, etc.)
    children: List["ASTNode"]  # Child nodes, in document order
    attrs: Dict[str, Any]      # Attributes (id, class, etc.)
    content: str               # Text content for leaf nodes
    meta: Dict[str, Any]       # Source mapping, parser-specific data
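
For readers who want to experiment with this shape, here is a minimal runnable sketch using a dataclass with the same five fields. The `walk` and `get_text` methods mirror the calls used in the usage examples, but this `Node` class is an illustrative stand-in and its behaviour is an assumption, not marktripy's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


@dataclass
class Node:
    """Illustrative node with the five fields described above."""
    type: str
    children: List["Node"] = field(default_factory=list)
    attrs: Dict[str, Any] = field(default_factory=dict)
    content: str = ""
    meta: Dict[str, Any] = field(default_factory=dict)

    def walk(self) -> Iterator["Node"]:
        """Yield this node, then every descendant, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def get_text(self) -> str:
        """Concatenate the text of this node and all of its descendants."""
        return self.content + "".join(c.get_text() for c in self.children)


heading = Node("heading", attrs={"level": 1}, children=[
    Node("text", content="Hello "),
    Node("strong", children=[Node("text", content="world")]),
])

print([n.type for n in heading.walk()])  # ['heading', 'text', 'strong', 'text']
print(heading.get_text())                # Hello world
```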

Parser Architecture

# Abstraction layer over multiple parsers
class Parser:
    def __init__(self, parser_backend="markdown-it-py", extensions=None):
        self.backend = self._create_backend(parser_backend)
        self.extensions = extensions or []
        
    def parse(self, markdown: str) -> ASTNode:
        # Parse with backend
        backend_ast = self.backend.parse(markdown)
        # Convert to unified AST
        return self._normalize_ast(backend_ast)
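
The normalization step is the part that lets different backends share one tree format. As a rough illustration only, the sketch below uses two toy "backends" that emit different native shapes and a registry that maps each to a normalizer. All function names and the dictionary layout are assumptions made for this example.

```python
from typing import Any, Dict, List

Unified = Dict[str, Any]


def parse_tuples(text: str) -> List[tuple]:
    """Toy backend A: emits (kind, level, text) tuples."""
    out = []
    for line in text.splitlines():
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            out.append(("heading", level, line[level:].strip()))
        elif line.strip():
            out.append(("paragraph", 0, line.strip()))
    return out


def parse_tags(text: str) -> List[Dict[str, str]]:
    """Toy backend B: emits HTML-like tag names instead."""
    out = []
    for kind, level, body in parse_tuples(text):
        tag = f"h{level}" if kind == "heading" else "p"
        out.append({"tag": tag, "body": body})
    return out


def from_tuples(native: List[tuple]) -> List[Unified]:
    return [{"type": k, "level": lv, "content": t} for k, lv, t in native]


def from_tags(native: List[Dict[str, str]]) -> List[Unified]:
    result = []
    for item in native:
        is_heading = item["tag"].startswith("h")
        result.append({
            "type": "heading" if is_heading else "paragraph",
            "level": int(item["tag"][1:]) if is_heading else 0,
            "content": item["body"],
        })
    return result


# Registry: backend name -> (native parse function, normalizer)
BACKENDS: Dict[str, tuple] = {
    "tuples": (parse_tuples, from_tuples),
    "tags": (parse_tags, from_tags),
}


def parse(text: str, backend: str = "tuples") -> List[Unified]:
    native_parse, normalize = BACKENDS[backend]
    return normalize(native_parse(text))


sample = "# Title\n\nBody text."
# Both backends produce the same unified result.
assert parse(sample, "tuples") == parse(sample, "tags")
```

Code written against the unified shape does not need to know which backend produced it, which is the point of the abstraction layer.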

Extension System

Extensions can hook into multiple stages:

class Extension:
    def extend_parser(self, parser): ...      # Modify parser rules
    def transform_ast(self, ast): ...         # Post-process AST
    def extend_renderer(self, renderer): ...  # Custom rendering
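
A small sketch of how the `transform_ast` hook might be chained, assuming extensions run in registration order and each returns the tree it was given. The list-of-dicts tree, the two example extensions, and `run_transforms` are hypothetical and exist only for this illustration.

```python
class Extension:
    """No-op base class; subclasses override only the hooks they need."""

    def extend_parser(self, parser):
        pass

    def transform_ast(self, ast):
        return ast

    def extend_renderer(self, renderer):
        pass


class AddHeadingIds(Extension):
    def transform_ast(self, ast):
        for node in ast:
            if node["type"] == "heading":
                node["id"] = node["content"].lower().replace(" ", "-")
        return ast


class UppercaseHeadings(Extension):
    def transform_ast(self, ast):
        for node in ast:
            if node["type"] == "heading":
                node["content"] = node["content"].upper()
        return ast


def run_transforms(ast, extensions):
    """Apply each extension's transform_ast hook in registration order."""
    for ext in extensions:
        ast = ext.transform_ast(ast)
    return ast


ast = [
    {"type": "heading", "content": "Quick Start"},
    {"type": "paragraph", "content": "Install it."},
]
result = run_transforms(ast, [AddHeadingIds(), UppercaseHeadings()])
print(result[0])
# {'type': 'heading', 'content': 'QUICK START', 'id': 'quick-start'}
```

Order matters here: running `UppercaseHeadings` first would still give a lowercase ID, but a transform that rewrote the heading text more heavily would change the generated ID.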

Rendering Pipeline

AST → Markdown: Preserves formatting, handles custom nodes
AST → HTML: Configurable sanitization, custom handlers
AST → JSON: Serialization for processing pipelines
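
The JSON path is the simplest of the three to sketch. Assuming a dataclass-based node (a hypothetical stand-in, not marktripy's `ASTNode`), the standard library can serialize and rebuild a tree as follows; `to_json` and `from_json` are illustrative names.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    type: str
    content: str = ""
    attrs: Dict[str, Any] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def to_json(node: Node) -> str:
    """Serialize a tree; asdict() recurses into child dataclasses."""
    return json.dumps(asdict(node), sort_keys=True)


def from_json(payload: str) -> Node:
    """Rebuild the tree from the JSON produced by to_json()."""
    def build(data: Dict[str, Any]) -> Node:
        children = [build(child) for child in data.pop("children")]
        return Node(children=children, **data)
    return build(json.loads(payload))


doc = Node("document", children=[
    Node("heading", attrs={"level": 1},
         children=[Node("text", content="Hello")]),
])

# The JSON form round-trips back to an equal tree.
assert from_json(to_json(doc)) == doc
```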

Performance Optimizations

Lazy parsing for large documents
Streaming renderers for memory efficiency
Optional C extensions via umarkdown backend
Caching for repeated transformations
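
As one possible reading of "caching for repeated transformations", a pure text-to-text transform can be memoized with `functools.lru_cache`. The sketch below is an assumption about the general technique, not a description of marktripy's internals, and the toy transform ignores fenced code blocks for brevity.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def downgrade_headings(source: str) -> str:
    """Add one '#' to every ATX heading below level 6.

    Toy line-based version: it does not skip fenced code blocks.
    """
    lines = []
    for line in source.split("\n"):
        hashes = len(line) - len(line.lstrip("#"))
        if 1 <= hashes < 6 and line[hashes:hashes + 1] == " ":
            line = "#" + line
        lines.append(line)
    return "\n".join(lines)


text = "# Title\n\nBody\n\n###### Deep"
first = downgrade_headings(text)
second = downgrade_headings(text)   # same input, served from the cache
print(first)
print(downgrade_headings.cache_info().hits)  # 1
```

This only works because the input is an immutable string and the function has no side effects; caching a transform that mutates a shared tree in place would need a different key, such as a content hash.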

Advanced Usage

Custom Transformers

from marktripy import Transformer

class HeaderAnchorTransformer(Transformer):
    """Add GitHub-style anchor links to headers"""
    
    def transform(self, ast):
        for node in ast.walk():
            if node.type == "heading":
                anchor = self.create_anchor(node)
                node.children.insert(0, anchor)
        return ast

Parser Backends

# Use different backends for different needs
from marktripy import Parser

# Maximum compatibility
parser = Parser(backend="markdown")

# Best performance  
parser = Parser(backend="mistletoe")

# Most extensions
parser = Parser(backend="markdown-it-py")

Integration Examples

# Pelican static site generator
from marktripy import PelicanReader

# MkDocs documentation
from marktripy import MkDocsPlugin

# Jupyter notebook processing
from marktripy import MarkdownCell

Contributing

We welcome contributions! Key areas:

Additional extensions (math, diagrams, etc.)
Performance improvements
Better round-trip fidelity
More transformer utilities

See CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE for details.

Acknowledgments

Built on the shoulders of giants:

markdown-it-py developers for the excellent plugin system
mistletoe for the clean AST design
The CommonMark specification authors
Everyone who has researched and documented the Python Markdown ecosystem

Project details

This version

1.0.3

Jul 29, 2025

Download files

Source Distribution

marktripy-1.0.3.tar.gz (46.1 kB view details)

Uploaded Jul 29, 2025 Source

Built Distribution

marktripy-1.0.3-py3-none-any.whl (47.6 kB view details)

Uploaded Jul 29, 2025 Python 3

File details

Details for the file marktripy-1.0.3.tar.gz.

File metadata

Download URL: marktripy-1.0.3.tar.gz
Upload date: Jul 29, 2025
Size: 46.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: python-httpx/0.28.1

File hashes

Hashes for marktripy-1.0.3.tar.gz
Algorithm	Hash digest
SHA256	`7d760b44aac7a528d6e045f22592670c013e3a29f19b93a902c5c72991477c3a`
MD5	`91c36d339ca1af77177b14f07717bd1e`
BLAKE2b-256	`bf96779a97a53739a4596e0a56a08de040799d15e79d93e994d1ce544025b065`

File details

Details for the file marktripy-1.0.3-py3-none-any.whl.

File metadata

Download URL: marktripy-1.0.3-py3-none-any.whl
Upload date: Jul 29, 2025
Size: 47.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: python-httpx/0.28.1

File hashes

Hashes for marktripy-1.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ffadcec11db94031d9673761d43c151fe65004692f823811640ec87fe54cb965`
MD5	`1dabd6f4b7609e6d81d5de2a58e77a76`
BLAKE2b-256	`9a07a241d76967401f8044f56347839b0c3770b3b18c06dd168167fd46800f49`
