mdfetch

A Python library that extracts article content from web platforms and returns it as clean Markdown.

Install

pip

pip install mdfetch

Homebrew (macOS / Linux)

brew install stn1slv/tap/md-fetch

With Homebrew, no separate Python environment setup is required.

CLI Usage

You can use the built-in md-fetch command directly from your terminal:

# Fetch and print Markdown to standard output
md-fetch https://medium.com/example/article

# Fetch and save Markdown to a file
md-fetch https://dev.to/example/article --output article.md
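For scripts that prefer the command-line tool over the library import, the two invocations above can be wrapped with `subprocess`. This is a minimal sketch, not part of mdfetch: `build_command` and `run_md_fetch` are hypothetical helper names, only the `--output` flag shown above is used, and the code assumes that `md-fetch` is on `PATH` and exits with a non-zero status on failure, which this page does not confirm.

```python
import subprocess


def build_command(url, output=None):
    """Build the argument list for the md-fetch CLI shown above."""
    cmd = ["md-fetch", url]
    if output is not None:
        # --output is the only flag documented on this page.
        cmd += ["--output", str(output)]
    return cmd


def run_md_fetch(url, output=None):
    """Run md-fetch and return whatever it wrote to standard output.

    Assumption: a failed fetch produces a non-zero exit status, in which
    case check=True raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        build_command(url, output),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `?` or `&`.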

Python Usage

from mdfetch import extract

# Works with any supported platform — just pass the URL
markdown = extract("https://medium.com/some-publication/article-slug-abc123")
markdown = extract("https://dev.to/username/article-slug")
markdown = extract("https://example.substack.com/p/article-slug")
markdown = extract("https://thenewstack.io/article-slug")
markdown = extract("https://dzone.com/articles/article-slug")
print(markdown)
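A common next step is writing each article to its own file. The sketch below is one way to do that, assuming `extract` returns the Markdown as a `str`. The helper names `slug_for` and `save_article` and the `extract_fn` parameter are illustrative and not part of mdfetch; `extract_fn` exists only so the helper can be exercised without network access.

```python
from pathlib import Path
from urllib.parse import urlparse


def slug_for(url):
    """Use the last non-empty path segment of the URL as a file name stem."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[-1] if segments else "article"


def save_article(url, out_dir, extract_fn=None):
    """Fetch one article and write it to <out_dir>/<slug>.md."""
    if extract_fn is None:
        # Documented entry point; imported lazily so the helper can be
        # tested with a stand-in function.
        from mdfetch import extract as extract_fn
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{slug_for(url)}.md"
    # Assumption: extract() returns the Markdown as a str.
    target.write_text(extract_fn(url), encoding="utf-8")
    return target
```

For example, `save_article("https://dev.to/username/article-slug", "articles")` would write `articles/article-slug.md`.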

Error handling

from mdfetch import (
    extract,
    InvalidURLError,
    UnsupportedPlatformError,
    UnsupportedContentTypeError,
    FetchError,
    HTTPStatusError,
    EmptyContentError,
)

url = "https://medium.com/some-publication/article-slug-abc123"

try:
    markdown = extract(url)
except InvalidURLError as e:
    print(f"Bad URL: {e.message}")
except UnsupportedPlatformError as e:
    print(f"Platform not supported: {e.message}")
except UnsupportedContentTypeError as e:
    print(f"Not an article page: {e.message}")
except HTTPStatusError as e:
    print(f"HTTP {e.status_code}: {e.message}")
except FetchError as e:
    print(f"Network error: {e.message}")
except EmptyContentError as e:
    print(f"No content: {e.message}")
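Transient failures are often worth retrying, while errors such as a bad URL would presumably fail the same way on every attempt. The following is a hedged sketch of a retry wrapper with exponential backoff. `extract_with_retry` and all of its parameters are hypothetical and not part of mdfetch; the caller decides which exception types are retryable.

```python
import time


def extract_with_retry(url, extract_fn, retry_on, attempts=3,
                       base_delay=1.0, sleep=time.sleep):
    """Call extract_fn(url), retrying on the exception types in retry_on.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    The last failure is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return extract_fn(url)
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

With the names imported above, a call might look like `extract_with_retry(url, extract, retry_on=(FetchError, HTTPStatusError))`. Which errors are actually transient is an assumption to check against the library's behavior.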

Supported platforms

Platform        Domains
Medium          medium.com, *.medium.com
dev.to          dev.to
Substack        substack.com, *.substack.com
The New Stack   thenewstack.io
DZone           dzone.com
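To filter a list of URLs before calling the library, the domain patterns in the table above can be checked locally. This sketch only mirrors the table; it is not mdfetch's own matching logic, and the names `SUPPORTED` and `platform_for` are made up for illustration.

```python
from typing import Optional
from urllib.parse import urlparse

# Domain patterns copied from the table above.
SUPPORTED = {
    "Medium": ["medium.com", "*.medium.com"],
    "dev.to": ["dev.to"],
    "Substack": ["substack.com", "*.substack.com"],
    "The New Stack": ["thenewstack.io"],
    "DZone": ["dzone.com"],
}


def platform_for(url: str) -> Optional[str]:
    """Return the platform whose domain pattern matches the URL's host, or None."""
    host = (urlparse(url).hostname or "").lower()
    for platform, patterns in SUPPORTED.items():
        for pattern in patterns:
            if pattern.startswith("*."):
                # "*.medium.com" matches any subdomain, but not "medium.com" itself.
                if host.endswith(pattern[1:]):
                    return platform
            elif host == pattern:
                return platform
    return None
```

A host such as `notmedium.com` does not match, because the wildcard comparison keeps the leading dot.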

Development

Requires uv.

make setup        # install dependencies
make test         # run unit tests
make integration  # run integration tests (requires network access)
make lint         # ruff check
make format       # ruff format
make build        # build wheel + sdist
make upgrade-deps # upgrade all dependencies
make clean        # remove build artifacts

Requirements

  • Python 3.12+

Download files

Source Distribution

mdfetch-0.5.1.tar.gz (375.5 kB)

Uploaded Source

Built Distribution

mdfetch-0.5.1-py3-none-any.whl (16.0 kB)

Uploaded Python 3

File details

Details for the file mdfetch-0.5.1.tar.gz.

File metadata

  • Download URL: mdfetch-0.5.1.tar.gz
  • Upload date:
  • Size: 375.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mdfetch-0.5.1.tar.gz

Algorithm     Hash digest
SHA256        34b9c682e6b6f71f441269a4d42480b42a89208ddd494533c38b8532c133b1f9
MD5           229c0b5b3c7c381afddae465073b54a2
BLAKE2b-256   e0915b56894f85f6548eb77ed9fe9ab9bdec7edb3918b5a6a0b8d1052ee25a57
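A downloaded file can be checked against the SHA256 digest listed above using only the standard library. The sketch below is one way to do it; `sha256_of` and `matches_digest` are illustrative names, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches_digest("mdfetch-0.5.1.tar.gz", "34b9c682e6b6f71f441269a4d42480b42a89208ddd494533c38b8532c133b1f9")`.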

Provenance

The following attestation bundles were made for mdfetch-0.5.1.tar.gz:

Publisher: publish.yml on stn1slv/md-fetch

File details

Details for the file mdfetch-0.5.1-py3-none-any.whl.

File metadata

  • Download URL: mdfetch-0.5.1-py3-none-any.whl
  • Upload date:
  • Size: 16.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mdfetch-0.5.1-py3-none-any.whl

Algorithm     Hash digest
SHA256        2c36e317c3aff55192d2461df09e4e2cf4b98da66cbd0fc4ca605988346b058f
MD5           8aea935bb97641057587bad52b1f5689
BLAKE2b-256   7b487969f528f1e4d8d814f5871053d4b37accdd6dd4f9bfcaa8c70b1ff0cafa

Provenance

The following attestation bundles were made for mdfetch-0.5.1-py3-none-any.whl:

Publisher: publish.yml on stn1slv/md-fetch
