Parse, Manipulate, and Merge Markdown Heading Trees Programmatically.

markdown-parser-py

Turn raw Markdown into a manipulable heading tree, edit it programmatically, then emit valid Markdown again.

✨ Features

  • Parse Markdown into a hierarchical tree of headings (levels 1–6)
  • Preserve and round‑trip section body content
  • Query sections via simple dot paths (e.g. Introduction.Installation.Windows)
  • Add / remove sections dynamically
  • Attach (merge) whole subtrees across different Markdown documents with automatic heading level adjustment
  • Dump back to Markdown or visualize structure in a tree-like ASCII output

📦 Installation

pip install markdown-parser-py

Or, for an editable install from source:

git clone https://github.com/VarunGumma/markdown-parser-py
cd markdown-parser-py
pip install -e ./

🧠 Core Concepts

The model is minimal:

MarkdownTree
└── root (MarkdownNode level=0, title="ROOT")
	├── Child heading (level=1 => '#')
	│   └── Grandchild (level=2 => '##')
	└── ...

Each MarkdownNode stores:

  • level: 0 for synthetic root; 1–6 for real headings
  • title: heading text
  • content: list of raw paragraph / code / list text blocks under that heading (excluding child headings)
  • children: nested headings
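
To make the four fields concrete, here is a minimal sketch of a node with the same shape. SketchNode and outline are hypothetical names used only for illustration; this is not the library's actual class definition, just one way the model described above could be expressed.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Illustrative stand-in for a heading node; NOT the library's class."""
    level: int   # 0 for the synthetic root, 1-6 for real headings
    title: str   # heading text without the leading '#' marks
    content: list[str] = field(default_factory=list)          # body blocks under this heading
    children: list[SketchNode] = field(default_factory=list)  # nested headings


def outline(node: SketchNode, depth: int = 0) -> list[str]:
    """Return one indented line per real heading, depth-first."""
    lines = []
    if node.level > 0:  # the synthetic root has no heading line of its own
        lines.append("  " * depth + "#" * node.level + " " + node.title)
        depth += 1
    for child in node.children:
        lines.extend(outline(child, depth))
    return lines


root = SketchNode(0, "ROOT", children=[
    SketchNode(1, "Intro", ["Some intro text."], [
        SketchNode(2, "Install", ["Run `pip install x`."]),
    ]),
])
print("\n".join(outline(root)))
```

Body text lives in content and never in children, which is why a heading's own paragraphs can be preserved separately from its subsections.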

🚀 Quick Start

from markdown_parser import MarkdownTree

doc = """
# Intro
Some intro text.

## Install
Run `pip install x`.

## Usage
Basic usage here.

### CLI
Run `tool`.
"""

tree = MarkdownTree()
tree.parse(doc)

print('\n=== Visualize ===')
tree.visualize()

print('\n=== Dump Round Trip ===')
print(tree.dump())

Output (visualize):

└── # Intro
	├── ## Install
	└── ## Usage
		└── ### CLI

🔍 Finding Sections

node = tree.find_node_by_path('Intro.Install')  # '# Intro' > '## Install'
if node is not None:
    print('Found:', node.title, 'level', node.level)

Dot paths walk downward by title, one component per heading level. A single-component path refers to a top-level heading (level 1). find_node_by_path returns None if no matching section exists.
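
The lookup described above can be pictured as a simple walk over children. The following is a rough sketch under stated assumptions, not the library's implementation: SketchNode and find_by_path are hypothetical names, and matching is assumed to be exact and case-sensitive, taking the first child with a matching title.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchNode:
    """Illustrative stand-in for a heading node; NOT the library's class."""
    title: str
    children: list[SketchNode] = field(default_factory=list)


def find_by_path(root: SketchNode, path: str) -> Optional[SketchNode]:
    """Follow dot-separated titles downward from root; None if any step misses."""
    node = root
    for part in path.split("."):
        # Assumption: the first child with an exactly matching title wins.
        node = next((c for c in node.children if c.title == part), None)
        if node is None:
            return None
    return node


root = SketchNode("ROOT", [
    SketchNode("Intro", [
        SketchNode("Install"),
        SketchNode("Usage", [SketchNode("CLI")]),
    ]),
])

print(find_by_path(root, "Intro.Usage.CLI").title)  # CLI
print(find_by_path(root, "Intro.Missing"))          # None
```

One consequence of splitting on dots in this sketch is that a title which itself contains a dot cannot be addressed. How the library treats such titles is not stated here.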

➕ Adding Sections

new = tree.add_section('Intro', 'Advanced', content='Deep dive coming soon.')
print('Added at level', new.level)

The first argument is the parent's dot path (parent_path). If parent_path is "" or "ROOT", the new section becomes a top‑level heading.

➖ Removing Sections

tree.remove_section('Intro.Advanced')  # removes that subtree

🔗 Attaching / Merging Subtrees

You can merge content from another parsed Markdown document. Levels auto-adjust so the attached subtree root sits exactly one level below the chosen parent.

from markdown_parser import MarkdownTree

base = MarkdownTree()
base.parse('# A\nIntro text.')

other = MarkdownTree()
other.parse('# Extra\nStuff here.\n\n## Deep\nDetails.')

# Attach ALL top-level sections from other under 'A'
base.attach_subtree('A', other)  # Equivalent to source_path=None

# Or attach only a specific subsection
# base.attach_subtree('A', other, source_path='Extra.Deep')

base.visualize()
print(base.dump())

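If you attach the full tree (source_path=None or 'ROOT'), each top-level section in the source is cloned with its level adjusted: new_level = parent.level + original_level. In the example above, '# Extra' attached under '# A' (level 1) becomes '## Extra', and its child '## Deep' becomes '### Deep'.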
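
The level arithmetic can be sketched independently of the library. In the snippet below, SketchNode, clone_shifted, and attach_all are hypothetical names. Rejecting levels above 6 is an assumption of this sketch; how the library handles that overflow is not stated.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Illustrative stand-in for a heading node; NOT the library's class."""
    level: int
    title: str
    children: list[SketchNode] = field(default_factory=list)


def clone_shifted(node: SketchNode, offset: int) -> SketchNode:
    """Deep-copy a subtree, adding offset to every heading level."""
    new_level = node.level + offset
    if new_level > 6:
        # Assumption of this sketch: Markdown has no heading deeper than level 6.
        raise ValueError(f"'{node.title}' would land at level {new_level}")
    return SketchNode(new_level, node.title,
                      [clone_shifted(c, offset) for c in node.children])


def attach_all(parent: SketchNode, source_root: SketchNode) -> None:
    """Attach clones of every top-level source section under parent."""
    for top in source_root.children:
        # new_level = parent.level + original_level
        parent.children.append(clone_shifted(top, parent.level))


base_a = SketchNode(1, "A")
source_root = SketchNode(0, "ROOT", [
    SketchNode(1, "Extra", [SketchNode(2, "Deep")]),
])
attach_all(base_a, source_root)

extra = base_a.children[0]
print(extra.level, extra.children[0].level)  # 2 3
print(source_root.children[0].level)         # 1 (the source is left untouched)
```

Cloning rather than moving nodes keeps the source tree usable afterwards, which matches the description of sections being cloned.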

🧪 Advanced Example: Composing Documents

from markdown_parser import MarkdownTree


def compose(product_readme: str, appendix_md: str) -> str:
    """Return product_readme with every top-level section of appendix_md nested under 'Appendix'."""
    main_tree = MarkdownTree()
    main_tree.parse(product_readme)

    appendix_tree = MarkdownTree()
    appendix_tree.parse(appendix_md)

    # Ensure a top-level Appendix section exists ('' targets the root)
    if main_tree.find_node_by_path('Appendix') is None:
        main_tree.add_section('', 'Appendix')

    # Attach all appendix top-level sections under Appendix
    main_tree.attach_subtree('Appendix', appendix_tree)
    return main_tree.dump()

📝 Disclaimer

This is an early/experimental utility. Edge cases (nested fenced code blocks, Setext headings, ATX heading oddities, HTML blocks) are not fully supported yet.
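
Because of these edge cases, it can help to check an input document before relying on the parsed tree. The sketch below is a standalone helper (headings_outside_fences is a hypothetical name, not part of this package) that lists ATX headings while skipping lines inside fenced code blocks. It is deliberately simplified: it ignores Setext headings, HTML blocks, indented code, and closing '#' sequences.

```python
import re

# ATX heading: 1-6 '#' characters, whitespace, then the title text.
ATX = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
# Fence line: up to three spaces of indent, then 3+ backticks or tildes.
FENCE = re.compile(r"^\s{0,3}(`{3,}|~{3,})")


def headings_outside_fences(text: str) -> list[tuple[int, str]]:
    """Return (level, title) for ATX headings that are not inside a code fence."""
    found = []
    open_fence = None  # marker that opened the current fenced block, if any
    for line in text.splitlines():
        fence = FENCE.match(line)
        if fence:
            marker = fence.group(1)
            if open_fence is None:
                open_fence = marker
            elif marker[0] == open_fence[0] and len(marker) >= len(open_fence):
                open_fence = None  # same character, at least as long: closes the block
            continue
        if open_fence is None:
            heading = ATX.match(line)
            if heading:
                found.append((len(heading.group(1)), heading.group(2)))
    return found


sample = "# Title\n~~~\n# not a heading\n~~~\n## Real\n"
print(headings_outside_fences(sample))  # [(1, 'Title'), (2, 'Real')]
```

If the headings reported by a helper like this differ from what tree.visualize() shows, the document probably contains one of the unsupported constructs.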
