Fetch a web page and convert it into cleaned Markdown.
fetch-markdown
fetch-markdown fetches a webpage and returns cleaned Markdown, either via a
library call or a CLI command.
Much of the extraction logic is adapted from the Fetch MCP Server.
Installation
pip install fetch-markdown
Library usage
from pathlib import Path

from fetch_markdown import fetch_markdown

# Fetch the page and print the first 200 characters of the cleaned Markdown.
markdown = fetch_markdown("https://huggingface.co/unsloth/GLM-4.6-GGUF")
print(markdown[:200])

# Save the Markdown to a file by passing output_path.
output_path = Path("/tmp/model-card.md")
fetch_markdown(
    "https://huggingface.co/unsloth/GLM-4.6-GGUF",
    output_path=output_path,
)
CLI usage
python -m fetch_markdown https://huggingface.co/unsloth/GLM-4.6-GGUF
# or
fetch-markdown --output output.md https://huggingface.co/unsloth/GLM-4.6-GGUF
Parameters
The library function and CLI share the same core arguments/options:
- url (positional for CLI / first argument for library): target page.
- output_path / -o/--output PATH: optional destination file; stdout is used when omitted.
- force_raw / --raw: skip simplification and emit the response body verbatim.
- user_agent / --user-agent STRING: override the default identifier.
- ignore_robots_txt / --ignore-robots: skip robots.txt checks (use sparingly).
- proxy_url / --proxy URL: HTTP(S) proxy forwarded to httpx.
- timeout / --timeout SECONDS: request timeout (default 30 seconds).
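The pairing of library keywords and CLI flags above can be sketched as a small translation helper. This is only an illustration, not part of fetch-markdown: `build_cli_args`, `VALUE_FLAGS`, and `SWITCH_FLAGS` are hypothetical names, and the sketch assumes that `--raw` and `--ignore-robots` are bare switches, as the parameter list suggests.

```python
# Hypothetical helper: maps the library keyword names listed above to
# their CLI flags. Not part of the fetch-markdown package itself.
VALUE_FLAGS = {
    "output_path": "--output",
    "user_agent": "--user-agent",
    "proxy_url": "--proxy",
    "timeout": "--timeout",
}
# Assumption: these two options are bare switches that take no value.
SWITCH_FLAGS = {
    "force_raw": "--raw",
    "ignore_robots_txt": "--ignore-robots",
}


def build_cli_args(url, **options):
    """Translate library-style keyword options into a CLI argument list."""
    args = ["fetch-markdown"]
    for name, value in options.items():
        if name in SWITCH_FLAGS:
            if value:  # a switch is emitted only when enabled
                args.append(SWITCH_FLAGS[name])
        elif name in VALUE_FLAGS:
            if value is not None:  # omitted options keep the CLI default
                args.extend([VALUE_FLAGS[name], str(value)])
        else:
            raise TypeError(f"unknown option: {name}")
    args.append(url)  # the URL is the positional argument
    return args
```

For example, `build_cli_args("https://example.com", force_raw=True, timeout=10)` yields an argument list that could be passed to `subprocess.run`.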
Notes
- The CLI and library both fetch live webpages; network availability and site rate limits apply.
- Content extraction follows the upstream MCP fetch server, so results mirror that behavior when converting pages to Markdown.
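Because fetches depend on network availability and site rate limits, callers may want to retry transient failures. The following is a minimal sketch, not something the package provides: `fetch_with_retries` is a hypothetical wrapper, and since the exceptions raised by `fetch_markdown` are not documented here, it catches `Exception` broadly as a stated assumption.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on failure.

    `fetch` is any callable taking a URL, for example fetch_markdown.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as error:  # assumption: specific error types are unknown
            last_error = error
            if attempt < attempts - 1:
                # Wait 1x, 2x, 4x, ... base_delay between attempts.
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing the fetch function as an argument keeps the wrapper independent of the package and lets a stub stand in for it offline.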
File details
Details for the file fetch_markdown-0.0.2.tar.gz.
File metadata
- Download URL: fetch_markdown-0.0.2.tar.gz
- Upload date:
- Size: 9.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 50f7c07c2bf8f02f5dd0656aab29a1f2d74c83c0050646c9f8fc4e8ade6c1ebc |
| MD5 | 8a732b06aaa69e002915e2b4f77c2bc2 |
| BLAKE2b-256 | 8f9e095aabf29875ad38b782d24066f4057c80705e13b7bf76cc309d7a15a9b2 |
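A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below is illustrative only; `sha256_of` and `matches_published_digest` are hypothetical helper names, and the local file path is whatever the reader downloaded.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call `matches_published_digest` with the path to the downloaded archive and the SHA256 value listed for that file; a `False` result means the file differs from the one that was published.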
Provenance
The following attestation bundles were made for fetch_markdown-0.0.2.tar.gz:
Publisher: ci.yml on Wuodan/fetch-markdown
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: fetch_markdown-0.0.2.tar.gz
  - Subject digest: 50f7c07c2bf8f02f5dd0656aab29a1f2d74c83c0050646c9f8fc4e8ade6c1ebc
  - Sigstore transparency entry: 707553869
  - Sigstore integration time:
- Permalink: Wuodan/fetch-markdown@d3163f2a8ec95f8e8c19c5133170a859969ac96b
- Branch / Tag: refs/tags/0.0.2
- Owner: https://github.com/Wuodan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@d3163f2a8ec95f8e8c19c5133170a859969ac96b
- Trigger Event: push
File details
Details for the file fetch_markdown-0.0.2-py3-none-any.whl.
File metadata
- Download URL: fetch_markdown-0.0.2-py3-none-any.whl
- Upload date:
- Size: 6.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a0dda936c3debe3cd6ea4ce7ca19cb360c0c1f349757518bb525f1ded4d6edc4 |
| MD5 | 9e2c9b9613c12c6e7a19f163f37de845 |
| BLAKE2b-256 | ee90436afbda08cc746b02df5e16cb49a092b2ea8997d42d4d6d3b69470a3970 |
Provenance
The following attestation bundles were made for fetch_markdown-0.0.2-py3-none-any.whl:
Publisher: ci.yml on Wuodan/fetch-markdown
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: fetch_markdown-0.0.2-py3-none-any.whl
  - Subject digest: a0dda936c3debe3cd6ea4ce7ca19cb360c0c1f349757518bb525f1ded4d6edc4
  - Sigstore transparency entry: 707553876
  - Sigstore integration time:
- Permalink: Wuodan/fetch-markdown@d3163f2a8ec95f8e8c19c5133170a859969ac96b
- Branch / Tag: refs/tags/0.0.2
- Owner: https://github.com/Wuodan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@d3163f2a8ec95f8e8c19c5133170a859969ac96b
- Trigger Event: push