A Flask extension that serves HTML pages as Markdown via the ?format=md query parameter.

canonicalwebteam.markdown-response

A Flask extension that adds ?format=md support to all HTML responses, converting each page to clean Markdown with YAML frontmatter. It is intended for consumption by LLMs and crawlers.

Installation

pip install canonicalwebteam.markdown-response

Usage

from flask import Flask
from canonicalwebteam.markdown_response import MarkdownResponse

app = Flask(__name__)
MarkdownResponse(app)  # enables ?format=md on this app

Or with the application factory pattern:

md = MarkdownResponse()  # create the extension without an app
md.init_app(app)  # bind it later, e.g. inside your create_app() factory

Any page can now be accessed as Markdown by appending ?format=md to the URL.
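
As a small illustration of what "appending ?format=md" means when a URL already has a query string or a fragment, here is a minimal client-side sketch. It uses only the standard library; the helper name with_markdown_format is hypothetical and is not part of this package.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_markdown_format(url: str, param: str = "format", value: str = "md") -> str:
    """Return url with param=value set, keeping any other query parameters."""
    parts = urlsplit(url)
    # Drop an existing value for the parameter so it is not duplicated.
    query = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_markdown_format("https://example.com/docs?page=2"))
# prints https://example.com/docs?page=2&format=md
```

The defaults mirror the default query parameter name and value; if you configure different ones, pass them as param and value.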

Configuration

MarkdownResponse(app,
    content_selector="#main-content",  # CSS selector for content extraction
    strip_elements=["script", "style", "nav", "noscript"],  # Tags to remove
    strip_classes=["u-hide", "u-off-screen"],  # Classes to remove
    query_param="format",  # Query parameter name
    query_value="md",  # Query parameter value
)
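
To make the role of query_param and query_value concrete, the sketch below shows the kind of check these two options could drive. It is an assumption about the general shape of the logic, not the extension's actual code, and the function name wants_markdown is hypothetical.

```python
def wants_markdown(args, status_code, content_type,
                   query_param="format", query_value="md"):
    """Decide whether a response would qualify for Markdown conversion.

    args is any mapping of query parameters (for example Flask's request.args).
    """
    # The trigger parameter must be present with the configured value.
    if args.get(query_param) != query_value:
        return False
    # Only successful responses qualify; error pages pass through.
    if status_code != 200:
        return False
    # Content-Type may carry parameters such as "; charset=utf-8".
    mimetype = content_type.split(";", 1)[0].strip().lower()
    return mimetype == "text/html"


print(wants_markdown({"format": "md"}, 200, "text/html; charset=utf-8"))  # True
print(wants_markdown({"format": "md"}, 200, "application/json"))          # False
```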

Template-level exclusion

Add data-md-strip to any HTML element to exclude it from the Markdown output:

<section data-md-strip>
    <form>This form won't appear in markdown output</form>
</section>
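
The effect of data-md-strip can be sketched with the standard library alone. The extension itself uses BeautifulSoup; the html.parser version below is only a stand-in to show the idea, it assumes well-formed HTML with matching closing tags, and the names StripMarked and strip_marked are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StripMarked(HTMLParser):
    """Re-emit HTML, skipping every element that carries data-md-strip.

    Comments and doctype declarations are dropped in this sketch.
    """

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a stripped element

    def handle_starttag(self, tag, attrs):
        marked = any(name == "data-md-strip" for name, _ in attrs)
        if self.skip_depth or marked:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a nested scope.
        marked = any(name == "data-md-strip" for name, _ in attrs)
        if not self.skip_depth and not marked:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_marked(html: str) -> str:
    parser = StripMarked()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


sample = ('<p>Keep</p><section data-md-strip><form>Gone'
          '<input name="q"></form></section><p>Also keep</p>')
print(strip_marked(sample))  # prints <p>Keep</p><p>Also keep</p>
```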

How it works

  1. An after_request handler intercepts responses when ?format=md is present
  2. Only HTML responses with status 200 are processed (JSON, XML, and error responses pass through unchanged)
  3. The content area is extracted using BeautifulSoup (#main-content by default)
  4. Unwanted elements are stripped (scripts, styles, nav, hidden elements, data-md-strip)
  5. The remaining HTML is converted to Markdown via markdownify
  6. YAML frontmatter extracted from <head> meta tags is prepended
  7. The response is returned with Content-Type: text/markdown; charset=utf-8
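
Step 6 can be illustrated with a short standard-library sketch. The description above does not say which meta tags end up in the frontmatter, so the choice of title and description here is an assumption, and the names HeadMeta and build_frontmatter are hypothetical rather than the extension's API.

```python
import json
from html.parser import HTMLParser


class HeadMeta(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"]] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            # Text may arrive in several chunks, so accumulate it.
            self.meta["title"] = self.meta.get("title", "") + data


def build_frontmatter(html: str, keys=("title", "description")) -> str:
    """Return a YAML frontmatter block for the selected keys (assumed set)."""
    parser = HeadMeta()
    parser.feed(html)
    parser.close()
    lines = ["---"]
    for key in keys:
        if key in parser.meta:
            # A JSON string is also a valid double-quoted YAML scalar.
            value = json.dumps(parser.meta[key].strip(), ensure_ascii=False)
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n\n"


page = ('<html><head><title>Install</title>'
        '<meta name="description" content="How to install">'
        '</head><body><div id="main-content"><h1>Install</h1></div></body></html>')
print(build_frontmatter(page) + "# Install")
```

Quoting values through json.dumps avoids hand-escaping colons or quotes in titles, at the cost of always emitting quoted strings.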
