
Lightweight HTML-to-Markdown tooling for agent workflows.


markmaton


markmaton is a lightweight HTML-to-Markdown parser for agent workflows. It takes already-fetched page HTML, cleans the structure, and returns Markdown plus page metadata.

[!NOTE] markmaton is a general parser, not a crawler. Feed it HTML from Playwright, fetch, Firecrawl, or another upstream page-visit tool.

Why it exists

  • Keep the parser core narrow and deterministic.
  • Accept both fetched HTML and rendered HTML.
  • Make HTML-to-Markdown robust enough for real agent workflows.
  • Ship a simple Python CLI around a Go engine.

Install

pip

pip install markmaton

uv tool

uv tool install markmaton

[!TIP] markmaton is now developed as a uv-managed Python 3.12 project. The installed package still works with plain pip, but local development assumes uv.

Quickstart

CLI

markmaton convert \
  --html-file page.html \
  --url https://example.com/article \
  --output-format markdown

To get the full structured response:

markmaton convert \
  --html-file page.html \
  --url https://example.com/article \
  --output-format json
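For scripted use, the same invocation can be assembled from Python. The sketch below uses only the subcommand and flags shown above (`convert`, `--html-file`, `--url`, `--output-format`). The helper names `build_convert_command` and `run_convert` are hypothetical, and it assumes that the `markmaton` executable is on `PATH`, that `--url` may be omitted, and that the result is written to standard output.

```python
import subprocess
from typing import List, Optional


def build_convert_command(
    html_file: str,
    url: Optional[str] = None,
    output_format: str = "markdown",
) -> List[str]:
    """Assemble the argv list for `markmaton convert` (hypothetical helper)."""
    command = ["markmaton", "convert", "--html-file", html_file]
    if url is not None:
        # Assumption: --url is optional; pass it whenever it is known.
        command += ["--url", url]
    command += ["--output-format", output_format]
    return command


def run_convert(
    html_file: str,
    url: Optional[str] = None,
    output_format: str = "markdown",
) -> str:
    """Run the CLI and return its standard output.

    Assumption: the converted result is printed to stdout.
    A non-zero exit status raises subprocess.CalledProcessError.
    """
    completed = subprocess.run(
        build_convert_command(html_file, url, output_format),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting issues with URLs that contain `&` or `?`.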

Python API

from markmaton import ConvertOptions, ConvertRequest, convert_html

html = "<article><h1>Hello</h1><p>World</p></article>"

response = convert_html(
    ConvertRequest(
        html=html,
        url="https://example.com/article",
        options=ConvertOptions(only_main_content=True),
    )
)

print(response.markdown)
print(response.metadata.title)

[!TIP] Pass url whenever you can. markmaton uses it as parsing context for canonical metadata and absolute link resolution.
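To see why the page URL matters for links, the sketch below resolves relative `href` values against a base URL with Python's standard `urllib.parse.urljoin`. It illustrates the general technique only and is not markmaton's implementation; `resolve_links` is a hypothetical helper.

```python
from typing import List
from urllib.parse import urljoin


def resolve_links(base_url: str, hrefs: List[str]) -> List[str]:
    """Resolve each href against base_url; absolute hrefs pass through unchanged."""
    return [urljoin(base_url, href) for href in hrefs]


links = resolve_links(
    "https://example.com/article",
    ["/about", "images/cover.png", "https://other.example/page"],
)
for link in links:
    print(link)
# https://example.com/about
# https://example.com/images/cover.png
# https://other.example/page
```

Without a base URL, the first two links could only be emitted as relative paths, which are of little use to a downstream agent.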

What you get back

The JSON response includes:

  • markdown
  • html_clean
  • metadata
  • links
  • images
  • quality

This keeps the parser useful both as a Markdown generator and as a page-normalization step in a larger workflow.
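A downstream step might check a JSON response for these fields before using it. The sketch below relies only on the top-level key names listed above. The sample payload and the shapes of its values are invented for illustration, and `missing_fields` is a hypothetical helper.

```python
import json
from typing import List

# Top-level keys listed in this README.
EXPECTED_FIELDS = ("markdown", "html_clean", "metadata", "links", "images", "quality")


def missing_fields(raw_json: str) -> List[str]:
    """Return the expected top-level keys that are absent from the payload."""
    payload = json.loads(raw_json)
    return [name for name in EXPECTED_FIELDS if name not in payload]


# Invented sample; a real payload would come from `--output-format json`.
sample = json.dumps({"markdown": "# Hello\n\nWorld", "metadata": {"title": "Hello"}})
print(missing_fields(sample))
# ['html_clean', 'links', 'images', 'quality']
```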

Project shape

  • Go engine: cmd/markmaton-engine
  • Python wrapper and CLI: markmaton/
  • Parser fixtures and golden files: testdata/
  • Research, benchmark, and release docs: docs/

Documentation

Start here:

Development

Set up the local development environment:

uv sync --group dev

Run the core test suites:

uv run python -m unittest discover -s tests -p 'test_*.py'
go test ./...

For a manual end-to-end smoke test:

The repo is pinned to:

[!IMPORTANT] Automated coverage stays unit-test-first. Live page visits and benchmark sampling are intentionally kept out of the default automated test path.

Release notes



