Lightweight HTML-to-Markdown tooling for agent workflows.
markmaton
markmaton is a lightweight HTML-to-Markdown parser for agent workflows.
It takes already-fetched page HTML, cleans the structure, and returns Markdown plus page metadata.
[!NOTE]
`markmaton` is a general parser, not a crawler. Feed it HTML from Playwright, `fetch`, Firecrawl, or another upstream page-visit tool.
Why it exists
- Keep the parser core narrow and deterministic.
- Accept both fetched HTML and rendered HTML.
- Make HTML-to-Markdown robust enough for real agent workflows.
- Ship a simple Python CLI around a Go engine.
Install
pip
pip install markmaton
uv tool
uv tool install markmaton
[!TIP]
`markmaton` itself is now developed as a `uv`-managed Python 3.12 project. The installed package still works through plain `pip`, but local development assumes `uv`.
Quickstart
CLI
markmaton convert \
--html-file page.html \
--url https://example.com/article \
--output-format markdown
To get the full structured response:
markmaton convert \
--html-file page.html \
--url https://example.com/article \
--output-format json
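The same invocation can be driven from a script. The following is a minimal sketch, not part of the package: `build_convert_command` and `run_convert` are hypothetical helper names, and the sketch assumes the CLI writes its JSON response to stdout. Only the flags shown above are used.

```python
import json
import subprocess


def build_convert_command(html_file, url, output_format="json"):
    """Build the argv list for `markmaton convert` from the flags shown above."""
    return [
        "markmaton", "convert",
        "--html-file", html_file,
        "--url", url,
        "--output-format", output_format,
    ]


def run_convert(html_file, url):
    """Run the CLI and decode the structured response.

    Assumption: the JSON response is printed to stdout.
    """
    cmd = build_convert_command(html_file, url, "json")
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the URL contains characters such as `&` or `?`.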
Python API
from markmaton import ConvertOptions, ConvertRequest, convert_html
html = "<article><h1>Hello</h1><p>World</p></article>"
response = convert_html(
ConvertRequest(
html=html,
url="https://example.com/article",
options=ConvertOptions(only_main_content=True),
)
)
print(response.markdown)
print(response.metadata.title)
[!TIP]
Pass `url` whenever you can. `markmaton` uses it as parsing context for canonical metadata and absolute link resolution.
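To illustrate why the page URL matters for link resolution, here is a small sketch using only the Python standard library. It shows the general idea of resolving relative links against a base URL; it is not markmaton's internal implementation, and `resolve_links` is a hypothetical helper.

```python
from urllib.parse import urljoin


def resolve_links(page_url, hrefs):
    """Resolve each href against the page URL, as a browser would."""
    return [urljoin(page_url, href) for href in hrefs]


page_url = "https://example.com/blog/article"
hrefs = ["/about", "images/cover.png", "https://other.example/x"]

for resolved in resolve_links(page_url, hrefs):
    print(resolved)
```

Without a base URL, the first two links cannot be turned into absolute URLs at all, which is why a parser needs the page URL as context.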
What you get back
The JSON response includes:
- `markdown`
- `html_clean`
- `metadata`
- `links`
- `images`
- `quality`
This keeps the parser useful both as a Markdown generator and as a page-normalization step in a larger workflow.
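As a rough sketch of using the response as a normalization step, the snippet below separates the Markdown from the other fields of a decoded JSON response. Only the top-level key names come from the list above. The helper name `split_response` and the sample payload values are hypothetical placeholders, and the real shape of each field may differ.

```python
import json

# Top-level keys listed in this README.
EXPECTED_KEYS = ("markdown", "html_clean", "metadata", "links", "images", "quality")


def split_response(raw_json):
    """Return (markdown, other_fields) from a JSON response string."""
    payload = json.loads(raw_json)
    missing = [key for key in EXPECTED_KEYS if key not in payload]
    if missing:
        raise KeyError(f"response is missing keys: {missing}")
    other_fields = {key: payload[key] for key in EXPECTED_KEYS if key != "markdown"}
    return payload["markdown"], other_fields


# Hypothetical payload: the values are placeholders, not real markmaton output.
sample = json.dumps({
    "markdown": "# Hello\n\nWorld",
    "html_clean": "<h1>Hello</h1><p>World</p>",
    "metadata": {},
    "links": [],
    "images": [],
    "quality": {},
})

markdown, other_fields = split_response(sample)
print(markdown)
print(sorted(other_fields))
```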
Project shape
- Go engine: `cmd/markmaton-engine`
- Python wrapper and CLI: `markmaton/`
- Parser fixtures and golden files: `testdata/`
- Research, benchmark, and release docs: `docs/`
Documentation
Start here:
- Documentation index
- Usage guide
- Packaging layout
- PyPI release path
- Benchmark workflow
- Benchmark matrix
Development
Set up the local development environment:
uv sync --group dev
Run the core test suites:
uv run python -m unittest discover -s tests -p 'test_*.py'
go test ./...
For a manual end-to-end smoke:
The repo is pinned to:
- Python `3.12` via `.python-version`
- a committed `uv.lock`
[!IMPORTANT]
Automated coverage stays unit-test-first. Live page visits and benchmark sampling are intentionally kept out of the default automated test path.
Release notes
File details
Details for the file markmaton-0.1.5.tar.gz.
File metadata
- Download URL: markmaton-0.1.5.tar.gz
- Upload date:
- Size: 356.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 98a7b1006978927363821fd4baad2c51f4a38f94a0b9813334ca84d1c93bb777 |
| MD5 | c7de07eeee1e73e12d1a2c47bfdac3e8 |
| BLAKE2b-256 | 6c1939e71e595d05a2da0350dcd9064e2b4a0b83820d966c4fc5dcdf02fc342a |
Provenance
The following attestation bundles were made for markmaton-0.1.5.tar.gz:
Publisher: workflow.yml on appautomaton/markmaton
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: markmaton-0.1.5.tar.gz
- Subject digest: 98a7b1006978927363821fd4baad2c51f4a38f94a0b9813334ca84d1c93bb777
- Sigstore transparency entry: 1262775522
- Sigstore integration time:
- Permalink: appautomaton/markmaton@53a895737c971a071e9f35676d0b16af1faef329
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/appautomaton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@53a895737c971a071e9f35676d0b16af1faef329
- Trigger Event: push
File details
Details for the file markmaton-0.1.5-py3-none-win_amd64.whl.
File metadata
- Download URL: markmaton-0.1.5-py3-none-win_amd64.whl
- Upload date:
- Size: 3.9 MB
- Tags: Python 3, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aeb4fb28cb16f6a226dbc6db53b0949c78998a06c83563705f863b5460526957 |
| MD5 | 65b9dbc786fa3a3d376c0d6cbd741782 |
| BLAKE2b-256 | 6f43b96ce162570e4faee3f822b4309872f9820a2d5e2b641a4d6f0cc3d40cfa |
Provenance
The following attestation bundles were made for markmaton-0.1.5-py3-none-win_amd64.whl:
Publisher: workflow.yml on appautomaton/markmaton
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: markmaton-0.1.5-py3-none-win_amd64.whl
- Subject digest: aeb4fb28cb16f6a226dbc6db53b0949c78998a06c83563705f863b5460526957
- Sigstore transparency entry: 1262775533
- Sigstore integration time:
- Permalink: appautomaton/markmaton@53a895737c971a071e9f35676d0b16af1faef329
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/appautomaton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@53a895737c971a071e9f35676d0b16af1faef329
- Trigger Event: push
File details
Details for the file markmaton-0.1.5-py3-none-manylinux2014_x86_64.whl.
File metadata
- Download URL: markmaton-0.1.5-py3-none-manylinux2014_x86_64.whl
- Upload date:
- Size: 3.9 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8ad5423775450dfe4859193a22679d7adaaa87945b15c4b91478d036b2247ca3 |
| MD5 | a02709752c3ff5b90da3706991147021 |
| BLAKE2b-256 | ffdf613e6a9b0df15db834a708cf42edd4744594f8b4b8c7b9c4acff485c122c |
Provenance
The following attestation bundles were made for markmaton-0.1.5-py3-none-manylinux2014_x86_64.whl:
Publisher: workflow.yml on appautomaton/markmaton
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: markmaton-0.1.5-py3-none-manylinux2014_x86_64.whl
- Subject digest: 8ad5423775450dfe4859193a22679d7adaaa87945b15c4b91478d036b2247ca3
- Sigstore transparency entry: 1262775564
- Sigstore integration time:
- Permalink: appautomaton/markmaton@53a895737c971a071e9f35676d0b16af1faef329
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/appautomaton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@53a895737c971a071e9f35676d0b16af1faef329
- Trigger Event: push
File details
Details for the file markmaton-0.1.5-py3-none-macosx_12_0_x86_64.whl.
File metadata
- Download URL: markmaton-0.1.5-py3-none-macosx_12_0_x86_64.whl
- Upload date:
- Size: 4.0 MB
- Tags: Python 3, macOS 12.0+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa60d89254d7315791c5cc3629e8d9401be5f4a91d3430cb1d73834be7e5f21e |
| MD5 | f3307fc82ac9d724cca11b7e61ad2499 |
| BLAKE2b-256 | e3991f16d58f0c204c26a6afe3e4f7f5030f76bf544b97efd90b1535da93dae6 |
Provenance
The following attestation bundles were made for markmaton-0.1.5-py3-none-macosx_12_0_x86_64.whl:
Publisher: workflow.yml on appautomaton/markmaton
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: markmaton-0.1.5-py3-none-macosx_12_0_x86_64.whl
- Subject digest: aa60d89254d7315791c5cc3629e8d9401be5f4a91d3430cb1d73834be7e5f21e
- Sigstore transparency entry: 1262775555
- Sigstore integration time:
- Permalink: appautomaton/markmaton@53a895737c971a071e9f35676d0b16af1faef329
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/appautomaton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@53a895737c971a071e9f35676d0b16af1faef329
- Trigger Event: push
File details
Details for the file markmaton-0.1.5-py3-none-macosx_12_0_arm64.whl.
File metadata
- Download URL: markmaton-0.1.5-py3-none-macosx_12_0_arm64.whl
- Upload date:
- Size: 3.8 MB
- Tags: Python 3, macOS 12.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5f389b3285a1dc13589774d3c5187a056326bbfe874b6e870d52fe0226c31b59 |
| MD5 | 096d90c6b2d7838f1c3771b535e0dbfd |
| BLAKE2b-256 | 34127140bc3703ec7e4a772ae80116d0ff9b75f8675fa6125bacb9ad8621a511 |
Provenance
The following attestation bundles were made for markmaton-0.1.5-py3-none-macosx_12_0_arm64.whl:
Publisher: workflow.yml on appautomaton/markmaton
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: markmaton-0.1.5-py3-none-macosx_12_0_arm64.whl
- Subject digest: 5f389b3285a1dc13589774d3c5187a056326bbfe874b6e870d52fe0226c31b59
- Sigstore transparency entry: 1262775582
- Sigstore integration time:
- Permalink: appautomaton/markmaton@53a895737c971a071e9f35676d0b16af1faef329
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/appautomaton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@53a895737c971a071e9f35676d0b16af1faef329
- Trigger Event: push