Skip to main content

Transform large markdown files into hierarchical folder structures for better navigation and AI-assisted editing

Project description

md-hierarchy logo

md-hierarchy

A CLI tool that splits markdown files into hierarchical folder structures based on heading levels, and can reconstruct the original markdown from the split pieces.

Features

  • Split markdown files into navigable folder hierarchies
  • Merge folder structures back into single markdown files
  • Preserves all markdown elements (code blocks, lists, tables, links, etc.)
  • Handles edge cases (duplicate headings, empty headings, skipped levels)
  • Round-trip compatible (split → merge produces equivalent content)
  • Dry-run mode to preview operations

Installation

# From PyPI
pip install md-hierarchy

# From source
pip install -e .

# With development dependencies
pip install -e ".[dev]"

Usage

Split Command

Split a markdown file into a hierarchical folder structure:

md-hierarchy split input.md output_dir --level 3

Options:

  • --level, -l: Heading level to extract as files (1-4, default: 3)
  • --overwrite: Overwrite output directory if it exists
  • --verbose, -v: Print detailed operation log
  • --dry-run: Show what would be done without writing files

Example:

# Split at level 3 (H3 headings become files)
md-hierarchy split proposal.md ./output --level 3

# Split with overwrite
md-hierarchy split proposal.md ./output --level 2 --overwrite

# Preview without creating files
md-hierarchy split proposal.md ./output --dry-run

Merge Command

Merge a folder structure back into a single markdown file:

md-hierarchy merge input_dir output.md

Options:

  • --verbose, -v: Print detailed operation log

Example:

# Merge folder structure
md-hierarchy merge ./output merged.md

# Merge with verbose output
md-hierarchy merge ./split-docs final.md --verbose

Output Structure

When splitting at level 3, the tool creates this structure:

output-dir/
├── 00-__frontmatter__.md            # Content before first heading (if exists)
├── 01-Introduction/
│   ├── 00-__intro__.md              # H1 heading + intro content (always created)
│   ├── 01-Background/
│   │   ├── 00-__intro__.md          # H2 heading + intro content (always created)
│   │   ├── 01-Problem-Statement.md  # H3 section
│   │   └── 02-Research-Gap.md       # H3 section
│   └── 02-Objectives/
│       ├── 00-__intro__.md          # H2 heading (even if no intro content)
│       └── 01-Primary-Goals.md
└── 02-Methodology/
    └── 00-__intro__.md              # H1 heading + content

File Naming Convention

  • Folders: NN-Sanitized-Title/ (e.g., 01-Introduction/)
  • Intro files: 00-__intro__.md (always created for every heading folder)
  • Frontmatter: 00-__frontmatter__.md (at root, only if content exists before first heading)
  • Section files: NN-Sanitized-Title.md (e.g., 01-Problem-Statement.md)
  • Numbers are zero-padded (01, 02, ..., 99)
  • Special characters (/ \ : * ? " < > |) are removed
  • Spaces are replaced with hyphens
  • Maximum length: 50 characters

Key Design Decisions

  • 00-__intro__.md is always created for every heading folder, even if empty
    • This provides a consistent structure and an easy place to add intro text later
    • Contains the heading declaration and any content before child sections
  • The 00- prefix ensures intro files sort first in directory listings
  • The __intro__ naming (double underscore) clearly marks these as special/meta files
  • Frontmatter files are created at the root only when pre-heading content exists

Edge Cases Handled

  1. Empty headingsUntitled-Section-N
  2. Duplicate titles → Append -2, -3, etc.
  3. Skipped levels (H1 → H3) → Insert 00-Content/ folder
  4. Content before first heading00-__frontmatter__.md at root
  5. Heading attributes (e.g., {#id .class}) → Preserved in content
  6. Headings with no intro content00-__intro__.md still created (with just the heading)

Round-Trip Compatibility

The tool is designed for round-trip operations:

# Split
md-hierarchy split original.md ./split --level 3

# Merge
md-hierarchy merge ./split reconstructed.md

# Content should be equivalent
diff original.md reconstructed.md

Development

Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install in development mode
pip install -e ".[dev]"

Run Tests

pytest

Run Tests with Coverage

pytest --cov=md_hierarchy --cov-report=html

Requirements

  • Python 3.8+
  • Dependencies:
    • markdown-it-py - Markdown parsing
    • click - CLI framework

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

md_hierarchy-0.2.0.tar.gz (17.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

md_hierarchy-0.2.0-py3-none-any.whl (15.4 kB view details)

Uploaded Python 3

File details

Details for the file md_hierarchy-0.2.0.tar.gz.

File metadata

  • Download URL: md_hierarchy-0.2.0.tar.gz
  • Upload date:
  • Size: 17.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for md_hierarchy-0.2.0.tar.gz
Algorithm Hash digest
SHA256 20acb67832d8f23691001c9b6b7d1751cc4b9f856b68688640407a0f089135c4
MD5 9cc573c6e0642d5980e99b4de7e69ee2
BLAKE2b-256 b9a33b14fd875f6727968a07baf881850e205356fc560ca84ef91e5aa5332b47

See more details on using hashes here.

File details

Details for the file md_hierarchy-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: md_hierarchy-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 15.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for md_hierarchy-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5b4686c8420f61a315e7f23cd59c5eaabbf402151d1a055b678dc099df8b0d6d
MD5 2338ec0f5a0817c66fa8cbaf95720f34
BLAKE2b-256 4af99433b67dbd2a6195384ea03ca1668391d708e3b0a44241aa727e8d45ce15

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page