Parse, Manipulate, and Merge Markdown Heading Trees Programmatically.
markdown-parser-py
Turn raw Markdown into a manipulable heading tree, edit it programmatically, then emit valid Markdown again.
✨ Features
- Parse Markdown into a hierarchical tree of headings (levels 1–6)
- Preserve and round‑trip section body content
- Query sections via simple dot paths (e.g. Introduction.Installation.Windows)
- Add / remove sections dynamically
- Attach (merge) whole subtrees across different Markdown documents with automatic heading level adjustment
- Dump back to Markdown or visualize structure in a tree-like ASCII output
📦 Installation
pip install markdown-parser-py
or, for an editable install
git clone https://github.com/VarunGumma/markdown-parser-py
cd markdown-parser-py
pip install -e ./
🧠 Core Concepts
The model is minimal:
MarkdownTree
└── root (MarkdownNode level=0, title="ROOT")
    ├── Child heading (level=1 => '#')
    │   └── Grandchild (level=2 => '##')
    └── ...
Each MarkdownNode stores:
- level: 0 for synthetic root; 1–6 for real headings
- title: heading text
- content: list of raw paragraph / code / list text blocks under that heading (excluding child headings)
- children: nested headings
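The node model described above can be sketched as a small dataclass. This is an illustrative stand-in and not the library's actual source: `SketchNode` and `dump_lines` are hypothetical names, and only the four fields (level, title, content, children) come from this README.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical stand-in for a heading node (not the library's class)."""
    level: int                      # 0 for the synthetic root, 1-6 for headings
    title: str                      # heading text without the leading '#'
    content: list[str] = field(default_factory=list)          # body blocks
    children: list[SketchNode] = field(default_factory=list)  # nested headings


def dump_lines(node: SketchNode) -> list[str]:
    """Emit a heading line, then its body, then each child subtree in order."""
    lines: list[str] = []
    if node.level > 0:  # the synthetic root has no heading line of its own
        lines.append("#" * node.level + " " + node.title)
    lines.extend(node.content)
    for child in node.children:
        lines.extend(dump_lines(child))
    return lines


root = SketchNode(0, "ROOT")
intro = SketchNode(1, "Intro", content=["Some intro text."])
intro.children.append(SketchNode(2, "Install", content=["Run `pip install x`."]))
root.children.append(intro)
print("\n".join(dump_lines(root)))
```

Keeping body text in `content` and nested headings in `children` is what lets a section move to another depth without touching its body: only `level` has to change.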
🚀 Quick Start
from markdown_parser import MarkdownTree
doc = """
# Intro
Some intro text.
## Install
Run `pip install x`.
## Usage
Basic usage here.
### CLI
Run `tool`.
"""
tree = MarkdownTree()
tree.parse(doc)
print('\n=== Visualize ===')
tree.visualize()
print('\n=== Dump Round Trip ===')
print(tree.dump())
Output (visualize):
└── # Intro
    ├── ## Install
    └── ## Usage
        └── ### CLI
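One plausible way to build such a tree is a single pass with a stack of open headings. The sketch below is an assumption about the general technique, not the package's implementation: `SketchNode`, `parse_headings`, and `outline` are hypothetical names. It recognizes only ATX headings and does not handle fenced code blocks or Setext headings.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# ATX heading: 1-6 '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


@dataclass
class SketchNode:
    """Hypothetical stand-in for a heading node (not the library's class)."""
    level: int
    title: str
    content: list[str] = field(default_factory=list)
    children: list[SketchNode] = field(default_factory=list)


def parse_headings(text: str) -> SketchNode:
    """Build a heading tree; non-heading lines go to the nearest open heading."""
    root = SketchNode(0, "ROOT")
    stack = [root]  # path from the root to the most recently opened heading
    for line in text.splitlines():
        match = HEADING.match(line)
        if match is None:
            stack[-1].content.append(line)
            continue
        node = SketchNode(len(match.group(1)), match.group(2))
        # Close every open heading at the same or a deeper level.
        while stack[-1].level >= node.level:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root


def outline(node: SketchNode, depth: int = 0) -> None:
    """Print one indented line per heading."""
    for child in node.children:
        print("  " * depth + "#" * child.level + " " + child.title)
        outline(child, depth + 1)


doc = (
    "# Intro\nSome intro text.\n"
    "## Install\nRun `pip install x`.\n"
    "## Usage\nBasic usage here.\n"
    "### CLI\nRun `tool`.\n"
)
tree_root = parse_headings(doc)
outline(tree_root)
```

Because the root sits at level 0 and real headings start at level 1, the `while` loop never pops the root, so the stack always has a parent to attach to.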
🔍 Finding Sections
node = tree.find_node_by_path('Intro.Install') # '# Intro' > '## Install'
if node:
    print('Found:', node.title, 'level', node.level)
Dot paths walk downward by heading title. A single-component path refers to a top‑level heading (level 1). find_node_by_path returns None if no section matches the path.
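The downward walk can be sketched in a few lines. This is a hypothetical illustration of the lookup idea, not the library's code: `SketchNode` and `find_by_path` are made-up names, and the sketch assumes exact, case-sensitive title matches and titles that contain no dots.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in for a heading node (not the library's class)."""
    title: str
    children: list[SketchNode] = field(default_factory=list)


def find_by_path(root: SketchNode, path: str) -> Optional[SketchNode]:
    """Follow one title per path component, starting below the root."""
    node = root
    for part in path.split("."):
        # Take the first child whose title matches this component.
        node = next((c for c in node.children if c.title == part), None)
        if node is None:
            return None  # a component did not match: the path does not exist
    return node


root = SketchNode("ROOT", [
    SketchNode("Intro", [SketchNode("Install"), SketchNode("Usage")]),
])
found = find_by_path(root, "Intro.Install")
missing = find_by_path(root, "Intro.Missing")
print(found.title if found else None, missing)
```

In this sketch a title that itself contains a dot cannot be addressed, because the dot is always treated as a separator.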
➕ Adding Sections
new = tree.add_section('Intro', 'Advanced', content='Deep dive coming soon.')
print('Added at level', new.level)
If parent_path is "" or "ROOT", the new section becomes a top‑level heading.
➖ Removing Sections
tree.remove_section('Intro.Advanced') # removes that subtree
🔗 Attaching / Merging Subtrees
You can merge content from another parsed Markdown document. Levels auto-adjust so the attached subtree root sits exactly one level below the chosen parent.
from markdown_parser import MarkdownTree
base = MarkdownTree()
base.parse('# A\nIntro text.')
other = MarkdownTree()
other.parse('# Extra\nStuff here.\n\n## Deep\nDetails.')
# Attach ALL top-level sections from other under 'A'
base.attach_subtree('A', other) # Equivalent to source_path=None
# Or attach only a specific subsection
# base.attach_subtree('A', other, source_path='Extra.Deep')
base.visualize()
print(base.dump())
If you attach the full tree (source_path=None / 'ROOT'), each top-level section in the source is cloned with level adjusted: new_level = parent.level + original_level.
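The level arithmetic can be made concrete with a small sketch. It is an assumption-laden illustration rather than the library's implementation: `SketchNode`, `clone_shifted`, and `attach` are hypothetical names. The offset `parent.level + 1 - subtree_root.level` places the attached root one level below the parent; for a top-level source section (level 1) it reduces to `parent.level`, which matches the formula above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical stand-in for a heading node (not the library's class)."""
    level: int
    title: str
    children: list[SketchNode] = field(default_factory=list)


def clone_shifted(node: SketchNode, offset: int) -> SketchNode:
    """Deep-copy a subtree, adding the same offset to every level."""
    return SketchNode(
        node.level + offset,
        node.title,
        [clone_shifted(child, offset) for child in node.children],
    )


def attach(parent: SketchNode, subtree_root: SketchNode) -> SketchNode:
    """Attach a shifted copy so its root sits one level below the parent."""
    offset = parent.level + 1 - subtree_root.level
    clone = clone_shifted(subtree_root, offset)
    parent.children.append(clone)
    return clone


parent = SketchNode(1, "A")
extra = SketchNode(1, "Extra", [SketchNode(2, "Deep")])

attached = attach(parent, extra)               # offset = 1 + 1 - 1 = 1
deep_only = attach(parent, extra.children[0])  # offset = 1 + 1 - 2 = 0
print(attached.level, attached.children[0].level, deep_only.level)
```

Cloning before shifting leaves the source tree untouched, so the same document can be attached in several places.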
🧪 Advanced Example: Composing Documents
from markdown_parser import MarkdownTree

def compose(product_readme: str, appendix_md: str) -> str:
    main_tree = MarkdownTree()
    main_tree.parse(product_readme)
    appendix_tree = MarkdownTree()
    appendix_tree.parse(appendix_md)
    # Ensure an Appendix section exists
    if not main_tree.find_node_by_path('Appendix'):
        main_tree.add_section('', 'Appendix')
    # Attach all appendix top-level sections under Appendix
    main_tree.attach_subtree('Appendix', appendix_tree)
    return main_tree.dump()
📝 Disclaimer
This is an early/experimental utility. Edge cases (nested fenced code blocks, Setext headings, ATX heading oddities, HTML blocks) are not fully supported yet.