A library to extract sections from Markdown files based on headers.

Markdown Extract

A simple Python library that parses Markdown files into sections based on their headers.

Installation

Install via pip:

pip install markdown-extract

Usage

from markdown_extract import MarkdownExtractor

markdown_content = """
# Section 1
Some content here.

## Subsection 1.1
More details.
"""

extractor = MarkdownExtractor(markdown_content)

# 1. Access sections using dictionary-style brackets
print(extractor["Section 1"])
# Output:
# # Section 1
# Some content here.
# ...

# 2. Access nested sections
print(extractor["Section 1"]["Subsection 1.1"])
# Output:
# ## Subsection 1.1
# More details.

# 3. List child headers
print(extractor.list())
# Output: ['Section 1']

print(extractor["Section 1"].list())
# Output: ['Subsection 1.1']

# 4. Access the full document (root)
print(extractor[""])
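Because `.list()` and bracket access work at every level, the two can be combined to print an outline of a whole document. The helper below is a minimal sketch, not part of the library: `outline` is a hypothetical name, and it assumes that `.list()` returns the child header titles as strings and returns an empty list for a section with no subsections.

```python
def outline(node, depth=0):
    """Return an indented list of header titles found beneath `node`.

    `node` is assumed to behave like a MarkdownExtractor section:
    `node.list()` yields child header titles and `node[title]`
    returns the child section (both shown in the usage above).
    """
    lines = []
    for title in node.list():
        # Indent each title by two spaces per nesting level.
        lines.append("  " * depth + title)
        # Recurse into the child section to collect its own headers.
        lines.extend(outline(node[title], depth + 1))
    return lines
```

If those assumptions hold, `print("\n".join(outline(extractor)))` on the example document should list `Section 1` followed by an indented `Subsection 1.1`.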

Features

  • Nested Parsing: Parses Markdown headers into a nested structure, so each subsection is reachable through its parent section.
  • Robust Extraction: Ignores "headers" that are actually inside:
    • Code blocks (```)
    • Tables
    • Math blocks ($$)
    • YAML front matter (---)
  • Indentation Support: Handles indented headers correctly.
  • Easy Access: Use bracket notation (extractor["Header"]) or the .get_section() method.
  • Discovery: Use .list() to see available child headers at any level.
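To illustrate why the code-block rule matters, here is a small standalone sketch of header detection that skips fenced code blocks. It is not the library's implementation: `find_headers` and both regular expressions are hypothetical, it only handles `#`-style headers indented by up to three spaces, and it does not cover tables, math blocks, or YAML front matter.

```python
import re

# Assumed patterns for this sketch only (not taken from markdown-extract).
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")
HEADER = re.compile(r"^ {0,3}(#{1,6})\s+(.*?)\s*$")


def find_headers(text):
    """Return (level, title) pairs for headers outside fenced code blocks."""
    headers = []
    fence = None  # marker that opened the current code block, if any
    for line in text.splitlines():
        m = FENCE.match(line)
        if fence is not None:
            # Inside a code block: only a matching closing fence ends it.
            if (m and m.group(1)[0] == fence[0]
                    and len(m.group(1)) >= len(fence)
                    and line.strip() == m.group(1)):
                fence = None
            continue
        if m:
            fence = m.group(1)  # opening fence, e.g. ``` or ~~~
            continue
        h = HEADER.match(line)
        if h:
            headers.append((len(h.group(1)), h.group(2)))
    return headers
```

A line such as `# install dependencies` inside a fenced shell snippet is a comment, not a header; tracking the open fence is what keeps it out of the result.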

Development

To run the tests:

python tests/run_tests.py

Download files

Source Distribution

markdown_extract-0.1.1.tar.gz (6.3 kB)

Uploaded Source

Built Distribution

markdown_extract-0.1.1-py3-none-any.whl (6.0 kB)

Uploaded Python 3

File details

Details for the file markdown_extract-0.1.1.tar.gz.

File metadata

  • Download URL: markdown_extract-0.1.1.tar.gz
  • Size: 6.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for markdown_extract-0.1.1.tar.gz
Algorithm    Hash digest
SHA256       1031ea128988f24c5c832559292c88f275cbbd37ad2457cdd7f13b0b400f4e23
MD5          988acb302407f3c5fe51fd2bc12ae627
BLAKE2b-256  f2316b5e28abddcadcea3a0d32d54c36e67f6a7f9210c5350f365ded6a9129fc

File details

Details for the file markdown_extract-0.1.1-py3-none-any.whl.

File hashes

Hashes for markdown_extract-0.1.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       8a57497895e0ed85d3db39303a04aedf60babd5b06804619b277c17fb3e5fcf3
MD5          f85c68a90bfd4fa2932173368e1a1334
BLAKE2b-256  5064542c15f6ab20d2319b86125978250e4c53d3fcad0989ae4c96acc21b47fe
