save-page-as-md

Download a webpage and save its content as Markdown, organizing the output on disk to mirror the webpage's domain and path.

Installation

pip install save-page-as-md

After installation, the tool is available as the save_page_as_md command (note the underscores; the package name uses hyphens).

Usage

save_page_as_md <URL> [-o OUTPUT]
  • -o OUTPUT, --output OUTPUT: Optional. Path of the file to write the Markdown output to.
    • Use - to print to stdout.
    • If not provided, output is saved in a folder structure based on the URL (see below).

Examples

Save the Python Wikipedia page in a folder structure mirroring its URL (quote the URL, because most shells treat parentheses specially):

save_page_as_md "https://en.wikipedia.org/wiki/Python_(programming_language)"

The output is written to:

./en.wikipedia.org/wiki/Python_(programming_language)/index.md
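
The mapping from URL to output path can be sketched roughly as follows. This is a minimal illustration inferred from the example above, not the tool's actual code: the helper name url_to_output_path is hypothetical, and the handling of a URL with no path (writing to <host>/index.md) is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def url_to_output_path(url: str) -> str:
    """Map a URL to a relative path of the form <host>/<path>/index.md.

    Hypothetical helper inferred from the documented example; the real
    tool may treat query strings, ports, or empty paths differently.
    """
    parts = urlsplit(url)
    # Drop empty segments so leading and trailing slashes do not matter.
    segments = [s for s in parts.path.split("/") if s]
    return str(PurePosixPath(parts.netloc, *segments, "index.md"))


print(url_to_output_path("https://en.wikipedia.org/wiki/Python_(programming_language)"))
# prints: en.wikipedia.org/wiki/Python_(programming_language)/index.md
```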

Save to a custom file:

save_page_as_md https://example.com -o example.md

Print Markdown to the terminal:

save_page_as_md https://example.com -o -
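
Because -o - prints to stdout, the tool can also be driven from a script. The sketch below uses only the documented command line; the helper names build_command and fetch_markdown are hypothetical, and it assumes the tool exits with a non-zero status on failure and writes nothing but Markdown to stdout.

```python
import subprocess
from typing import List, Optional


def build_command(url: str, output: Optional[str] = None) -> List[str]:
    """Build the argument list for: save_page_as_md <URL> [-o OUTPUT]."""
    cmd = ["save_page_as_md", url]
    if output is not None:
        cmd += ["-o", output]
    return cmd


def fetch_markdown(url: str) -> str:
    """Run the tool with `-o -` and return what it prints to stdout."""
    # Passing a list (no shell) means URLs with parentheses need no quoting.
    # check=True raises CalledProcessError on a non-zero exit status
    # (assumption: the tool signals failure that way).
    result = subprocess.run(
        build_command(url, "-"), capture_output=True, text=True, check=True
    )
    return result.stdout


# Example (requires save-page-as-md to be installed and network access):
#     markdown = fetch_markdown("https://example.com")
```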

Caveats

  • For sites that require login or have JavaScript-loaded content, results may be incomplete.
  • Only HTTP and HTTPS are supported.
  • On Windows, output paths that contain reserved device names (e.g., CON, AUX) may fail or behave unexpectedly; review the output when running on Windows.
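
The last two caveats can be checked before running the tool. The following is a minimal pre-flight sketch, not part of save-page-as-md: the helper names are hypothetical, and the reserved-name list covers the common Windows device names (CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9).

```python
from typing import List
from urllib.parse import urlsplit

# Reserved device names on Windows (case-insensitive, with or without an extension).
WINDOWS_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_supported_scheme(url: str) -> bool:
    """Return True for http:// and https:// URLs only."""
    return urlsplit(url).scheme in {"http", "https"}


def reserved_segments(url: str) -> List[str]:
    """Return URL path segments that collide with Windows reserved names."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    # "aux.html" is reserved too, so compare only the part before the first dot.
    return [s for s in segments if s.split(".")[0].upper() in WINDOWS_RESERVED]


print(is_supported_scheme("ftp://example.com/file"))               # prints: False
print(reserved_segments("https://example.com/docs/con/aux.html"))  # prints: ['con', 'aux.html']
```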

Contributing

Contributions are welcome! Please submit pull requests or open issues on the GitHub repository.

License

This project is licensed under the MIT License.

Download files

Source Distribution

save_page_as_md-0.1.0a1.tar.gz (3.5 kB)

Uploaded Source

Built Distribution

save_page_as_md-0.1.0a1-py2.py3-none-any.whl (3.9 kB)

Uploaded Python 2, Python 3

File details

Details for the file save_page_as_md-0.1.0a1.tar.gz.

File metadata

  • Download URL: save_page_as_md-0.1.0a1.tar.gz
  • Upload date:
  • Size: 3.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for save_page_as_md-0.1.0a1.tar.gz
Algorithm Hash digest
SHA256 00cac479277f903b371766a48dcdd79d4bf553b6395852960e56279daf436769
MD5 07604bf099d1e772053c39390cd53d70
BLAKE2b-256 ef2c01d527186bdb291d92378c59632b01b035798e1fdb5fea2dc00884271bed

File details

Details for the file save_page_as_md-0.1.0a1-py2.py3-none-any.whl.

File hashes

Hashes for save_page_as_md-0.1.0a1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 bfec7cc68a43122b6730a5b135be641d66f7f65c428ef258aff42049d29d363e
MD5 9bb1ee8e82db41c2a665674d8a996233
BLAKE2b-256 7b4830303c4a93163fc721baefe9cf4f9e47e071a6171374ee9eaf98ea8476bd
