Transform large markdown files into hierarchical folder structures for better navigation and AI-assisted editing
md-hierarchy
A CLI tool that splits markdown files into hierarchical folder structures based on heading levels, and can reconstruct the original markdown from the split pieces.
Features
- Split markdown files into navigable folder hierarchies
- Merge folder structures back into single markdown files
- Preserves all markdown elements (code blocks, lists, tables, links, etc.)
- Handles edge cases (duplicate headings, empty headings, skipped levels)
- Round-trip compatible (split → merge produces equivalent content)
- Dry-run mode to preview operations
Installation
# From PyPI
pip install md-hierarchy
# From source
pip install -e .
# With development dependencies
pip install -e ".[dev]"
Usage
Split Command
Split a markdown file into a hierarchical folder structure:
md-hierarchy split input.md output_dir --level 3
Options:
- --level, -l: Heading level to extract as files (1-4, default: 3)
- --overwrite: Overwrite output directory if it exists
- --verbose, -v: Print detailed operation log
- --dry-run: Show what would be done without writing files
Example:
# Split at level 3 (H3 headings become files)
md-hierarchy split proposal.md ./output --level 3
# Split with overwrite
md-hierarchy split proposal.md ./output --level 2 --overwrite
# Preview without creating files
md-hierarchy split proposal.md ./output --dry-run
Merge Command
Merge a folder structure back into a single markdown file:
md-hierarchy merge input_dir output.md
Options:
- --verbose, -v: Print detailed operation log
Example:
# Merge folder structure
md-hierarchy merge ./output merged.md
# Merge with verbose output
md-hierarchy merge ./split-docs final.md --verbose
Output Structure
When splitting at level 3, the tool creates this structure:
output-dir/
├── 00-__frontmatter__.md               # Content before first heading (if exists)
├── 01-Introduction/
│   ├── 00-__intro__.md                 # H1 heading + intro content (always created)
│   ├── 01-Background/
│   │   ├── 00-__intro__.md             # H2 heading + intro content (always created)
│   │   ├── 01-Problem-Statement.md     # H3 section
│   │   └── 02-Research-Gap.md          # H3 section
│   └── 02-Objectives/
│       ├── 00-__intro__.md             # H2 heading (even if no intro content)
│       └── 01-Primary-Goals.md
└── 02-Methodology/
    └── 00-__intro__.md                 # H1 heading + content
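The mapping from headings to paths shown above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual implementation: `plan_paths` is a hypothetical name, headings are given as pre-parsed `(level, title)` pairs, titles are sanitized only by replacing spaces, and frontmatter and skipped levels are not handled.

```python
def plan_paths(headings, split_level=3):
    """Hypothetical sketch: map (level, title) pairs to output paths."""
    paths = []
    stack = []      # folder names of the currently open ancestor headings
    counters = [0]  # counters[d] = children numbered so far at depth d
    for level, title in headings:
        if level > split_level:
            continue  # deeper headings stay inside their parent's file
        # Close folders that sit at the same level or deeper.
        del stack[level - 1:]
        del counters[level:]
        if level - 1 > len(stack):
            raise ValueError("skipped heading levels are not handled in this sketch")
        counters[level - 1] += 1
        name = f"{counters[level - 1]:02d}-{title.replace(' ', '-')}"
        if level < split_level:
            # Heading above the split level: a folder with its own intro file.
            stack.append(name)
            counters.append(0)
            paths.append("/".join(stack + ["00-__intro__.md"]))
        else:
            # Heading at the split level: a single section file.
            paths.append("/".join(stack + [name + ".md"]))
    return paths


example = [
    (1, "Introduction"),
    (2, "Background"),
    (3, "Problem Statement"),
    (3, "Research Gap"),
    (2, "Objectives"),
    (3, "Primary Goals"),
    (1, "Methodology"),
]
for path in plan_paths(example):
    print(path)
```

Numbering restarts inside each folder, which is why both 01-Background/ and 02-Objectives/ contain an entry numbered 01.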
File Naming Convention
- Folders: NN-Sanitized-Title/ (e.g., 01-Introduction/)
- Intro files: 00-__intro__.md (always created for every heading folder)
- Frontmatter: 00-__frontmatter__.md (at root, only if content exists before first heading)
- Section files: NN-Sanitized-Title.md (e.g., 01-Problem-Statement.md)
- Numbers are zero-padded (01, 02, ..., 99)
- Special characters (/ \ : * ? " < > |) are removed
- Spaces are replaced with hyphens
- Maximum length: 50 characters
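As a rough illustration of these naming rules, the sketch below builds a name from a heading title. The helper names `sanitize_title` and `numbered_name` are hypothetical, and two details are assumptions rather than documented behavior: runs of whitespace collapse to a single hyphen, and the 50-character limit applies to the sanitized title before the numeric prefix is added.

```python
import re

# Characters the naming rules list as removed: / \ : * ? " < > |
_SPECIAL = re.compile(r'[/\\:*?"<>|]')
MAX_LEN = 50  # documented maximum length (assumed to apply to the title part)


def sanitize_title(title: str) -> str:
    """Hypothetical helper: turn a heading title into a file-system-safe name."""
    cleaned = _SPECIAL.sub("", title).strip()
    # Assumption: any run of whitespace becomes one hyphen.
    cleaned = re.sub(r"\s+", "-", cleaned)
    return cleaned[:MAX_LEN]


def numbered_name(index: int, title: str, is_file: bool = True) -> str:
    """Add the zero-padded prefix; files get .md, folders a trailing slash."""
    base = f"{index:02d}-{sanitize_title(title)}"
    return base + (".md" if is_file else "/")


print(numbered_name(1, "Problem Statement"))
print(numbered_name(1, "Introduction", is_file=False))
print(numbered_name(2, 'What: is "this"?'))
```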
Key Design Decisions
- 00-__intro__.md is always created for every heading folder, even if empty
  - This provides a consistent structure and an easy place to add intro text later
  - Contains the heading declaration and any content before child sections
- The 00- prefix ensures intro files sort first in directory listings
- The __intro__ naming (double underscore) clearly marks these as special/meta files
- Frontmatter files are created at the root only when pre-heading content exists
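The effect of the 00- prefix can be seen with a plain name-sorted directory walk. The sketch below is an assumption about how a merge could order files, not a description of the tool's internals; `collect_in_order` and `demo_order` are hypothetical names, and the demo builds a small throwaway tree in a temporary directory.

```python
import tempfile
from pathlib import Path


def collect_in_order(root: Path) -> list:
    """Hypothetical helper: list .md files depth-first, sorted by name per folder."""
    ordered = []
    for entry in sorted(root.iterdir(), key=lambda p: p.name):
        if entry.is_dir():
            ordered.extend(collect_in_order(entry))
        elif entry.suffix == ".md":
            ordered.append(entry)
    return ordered


def demo_order() -> list:
    """Create a tiny split tree (deliberately out of order) and return the walk order."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "01-Introduction" / "01-Background").mkdir(parents=True)
        for rel in [
            "01-Introduction/01-Background/01-Problem-Statement.md",
            "01-Introduction/01-Background/00-__intro__.md",
            "01-Introduction/00-__intro__.md",
            "00-__frontmatter__.md",
        ]:
            (root / rel).write_text("", encoding="utf-8")
        return [p.relative_to(root).as_posix() for p in collect_in_order(root)]


print("\n".join(demo_order()))
```

Because every intro file starts with 00-, it sorts ahead of its sibling sections and subfolders, so each heading is visited before its children.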
Edge Cases Handled
- Empty headings → Untitled-Section-N
- Duplicate titles → Append -2, -3, etc.
- Skipped levels (H1 → H3) → Insert 00-Content/ folder
- Content before first heading → 00-__frontmatter__.md at root
- Heading attributes (e.g., {#id .class}) → Preserved in content
- Headings with no intro content → 00-__intro__.md still created (with just the heading)
Round-Trip Compatibility
The tool is designed for round-trip operations:
# Split
md-hierarchy split original.md ./split --level 3
# Merge
md-hierarchy merge ./split reconstructed.md
# Content should be equivalent
diff original.md reconstructed.md
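If a plain diff reports only whitespace differences, a looser comparison may be more useful. The sketch below is one possible reading of "equivalent", chosen here as an assumption: trailing whitespace is ignored and runs of blank lines count as one. The helpers `normalize` and `content_diff` are hypothetical and not part of md-hierarchy; note that this normalization would also hide blank-line changes inside code blocks.

```python
import difflib


def normalize(text: str) -> list:
    """Strip trailing whitespace and collapse runs of blank lines (assumed equivalence)."""
    out = []
    for line in (raw.rstrip() for raw in text.splitlines()):
        if line == "" and out and out[-1] == "":
            continue  # skip repeated blank lines
        out.append(line)
    # Drop blank lines at the very start and end.
    while out and out[0] == "":
        out.pop(0)
    while out and out[-1] == "":
        out.pop()
    return out


def content_diff(original: str, reconstructed: str) -> list:
    """Return unified-diff lines; an empty list means equivalent under normalize()."""
    return list(
        difflib.unified_diff(
            normalize(original),
            normalize(reconstructed),
            fromfile="original.md",
            tofile="reconstructed.md",
            lineterm="",
        )
    )


print(content_diff("# A\n\n\ntext  \n", "# A\n\ntext\n"))
```

To compare real files, read both with `Path(...).read_text(encoding="utf-8")` and pass the strings to `content_diff`.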
Development
Setup
# Create virtual environment
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
# Install in development mode
pip install -e ".[dev]"
Run Tests
pytest
Run Tests with Coverage
pytest --cov=md_hierarchy --cov-report=html
Requirements
- Python 3.8+
- Dependencies:
  - markdown-it-py: Markdown parsing
  - click: CLI framework
License
MIT
File details
Details for the file md_hierarchy-0.2.0.tar.gz.
File metadata
- Download URL: md_hierarchy-0.2.0.tar.gz
- Upload date:
- Size: 17.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 20acb67832d8f23691001c9b6b7d1751cc4b9f856b68688640407a0f089135c4 |
| MD5 | 9cc573c6e0642d5980e99b4de7e69ee2 |
| BLAKE2b-256 | b9a33b14fd875f6727968a07baf881850e205356fc560ca84ef91e5aa5332b47 |
File details
Details for the file md_hierarchy-0.2.0-py3-none-any.whl.
File metadata
- Download URL: md_hierarchy-0.2.0-py3-none-any.whl
- Upload date:
- Size: 15.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5b4686c8420f61a315e7f23cd59c5eaabbf402151d1a055b678dc099df8b0d6d |
| MD5 | 2338ec0f5a0817c66fa8cbaf95720f34 |
| BLAKE2b-256 | 4af99433b67dbd2a6195384ea03ca1668391d708e3b0a44241aa727e8d45ce15 |