Markdown articles downloader and converter

Markdown articles tool 0.1.3

A free command-line utility, written in Python, that helps you manage online and downloaded Markdown documents (e.g., articles). The Markdown Articles Tool is available for macOS, Windows, and Linux.

The tool can be used to:

  • Download Markdown documents with images and:
    • Find all image links, download the images, and fix the links in the document.
    • Skip broken links.
    • Deduplicate similar images by content hash or by using a hash as the file name.
  • Process images linked with the HTML <img> tag.
  • Process local image files.
  • Convert Markdown documents to:
    • HTML.
    • PDF.
    • Or save them as plain Markdown.
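
To illustrate the first step (finding image links in both Markdown syntax and HTML <img> tags), here is a minimal sketch. It is not the tool's own parser: the regular expressions and the function name `find_image_links` are assumptions made for this example, and they ignore edge cases such as reference-style links or URLs containing parentheses.

```python
import re

# Markdown image syntax: ![alt](url "optional title")
MD_IMAGE_RE = re.compile(r'!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)')
# HTML <img> tag with a quoted src attribute.
HTML_IMAGE_RE = re.compile(
    r'<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE
)


def find_image_links(text: str) -> list[str]:
    """Return image URLs in order of appearance, without duplicates."""
    matches = []
    for pattern in (MD_IMAGE_RE, HTML_IMAGE_RE):
        matches.extend((m.start(), m.group(1)) for m in pattern.finditer(text))

    seen = set()
    links = []
    # Sort by position so the result follows document order.
    for _, url in sorted(matches):
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links
```

Once the images are downloaded, fixing the links amounts to replacing each found URL with its local path.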

If you only need individual functions, you can import the package and use it as a library.

Installation

From the repository

You need Python 3.9+.

Run:

git clone "https://github.com/artiomn/markdown_articles_tool"
pip3 install -r markdown_articles_tool/requirements.txt

From PyPI

pip3 install markdown-tool

Usage

Syntax:

markdown_tool [options] <article_file_path_or_url>

options:
  -h, --help            show this help message and exit
  -D {disabled,names_hashing,content_hash}, --deduplication-type {disabled,names_hashing,content_hash}
                        Deduplicate images, using content hash or SHA1(image_name) (default: disabled)
  -d IMAGES_DIRNAME, --images-dirname IMAGES_DIRNAME
                        Folder in which to download images (possible variables: $article_name, $time, $date, $dt, $base_url) (default: images)
  -a, --skip-all-incorrect
                        skip all incorrect images (default: False)
  -E, --download-incorrect-mime
                        download "images" with unrecognized MIME type (default: False)
  -s SKIP_LIST, --skip-list SKIP_LIST
                        skip URL's from the comma-separated list (or file with a leading '@') (default: None)
  -i {md,html,md+html,html+md}, --input-format {md,html,md+html,html+md}
                        input format (default: md)
  -l, --process-local-images
                        [DEPRECATED] Process local images (default: False)
  -n, --replace-image-names
                        Replace image names, using content hash (default: False)
  -o {md,html}, --output-format {md,html}
                        output format (default: md)
  -p IMAGES_PUBLIC_PATH, --images-public-path IMAGES_PUBLIC_PATH
                        Public path to the folder of downloaded images (possible variables: $article_name, $time, $date, $dt, $base_url)
  -P, --prepend-images-with-path
                        Save relative images paths (default: False)
  -R, --remove-source   Remove or replace source file (default: False)
  -t DOWNLOADING_TIMEOUT, --downloading-timeout DOWNLOADING_TIMEOUT
                        how many seconds to wait before downloading will be failed (default: -1)
  -O OUTPUT_PATH, --output-path OUTPUT_PATH
                        article output file name or path
  --verbose, -v         More verbose logging (default: False)
  --version             return version number
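
The -d and -p options accept variables such as $article_name, $date, and $base_url. The sketch below shows one plausible way such placeholders could be expanded, using Python's `string.Template`. The function name `expand_dirname`, the date and time formats, and the exact meaning of each variable are assumptions for illustration; the tool's real behaviour may differ.

```python
from datetime import datetime
from string import Template
from urllib.parse import urlparse


def expand_dirname(template: str, article_location: str, now: datetime) -> str:
    """Expand $-variables in a directory template (illustrative only)."""
    parsed = urlparse(article_location)
    path = parsed.path or article_location
    file_name = path.rstrip('/').rsplit('/', 1)[-1]

    # Assumed meanings and formats; not taken from the tool's source.
    variables = {
        'article_name': file_name.rsplit('.', 1)[0],
        'date': now.strftime('%Y-%m-%d'),
        'time': now.strftime('%H-%M-%S'),
        'dt': now.strftime('%Y-%m-%d_%H-%M-%S'),
        'base_url': parsed.netloc,
    }
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template).safe_substitute(variables)
```

With these assumptions, a template like `images/$article_name/$date` produces one image folder per article and day.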

Example 1:

./markdown_tool.py nc-1-zfs/article.md

Example 2:

./markdown_tool.py not-nas/sov/article.md -o html -s "http://www.ossec.net/_images/ossec-arch.jpg" -a

Example 3 (run on a folder):

find content/ -name "*.md" -print0 | xargs -0 -n1 ./markdown_tool.py
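
The same folder run can be driven from Python. This is a small sketch, not part of the tool: `build_commands` is a hypothetical helper that only builds one command line per Markdown file, leaving the actual execution to the caller.

```python
from pathlib import Path


def build_commands(root: str, tool: str = './markdown_tool.py') -> list[list[str]]:
    """Build one argv list per Markdown file found under root (recursively)."""
    return [[tool, str(path)] for path in sorted(Path(root).rglob('*.md'))]
```

Each returned list can be passed to `subprocess.run(cmd)`; checking the return code per file makes it easy to continue after a failed article.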

Changes

0.0.8

The -D (deduplication) option changed in version 0.0.8. It is no longer boolean and accepts one of several values: "disabled", "names_hashing", "content_hash". The long option name changed too: it is now --deduplication-type.
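
The difference between the three values can be sketched as a naming function. This is an illustration only: the help text states that names hashing uses SHA1 of the image name, but the hash used for content (SHA-256 here), the preserved file extension, and the function name `deduplicated_name` are assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def deduplicated_name(image_name: str, content: bytes, mode: str) -> str:
    """Return the file name under which an image would be stored."""
    suffix = PurePosixPath(image_name).suffix
    if mode == 'disabled':
        # Keep the original name; identical images may be stored twice.
        return image_name
    if mode == 'names_hashing':
        # Same source name always maps to the same file.
        return hashlib.sha1(image_name.encode('utf-8')).hexdigest() + suffix
    if mode == 'content_hash':
        # Same bytes always map to the same file, whatever the source name.
        return hashlib.sha256(content).hexdigest() + suffix
    raise ValueError(f'unknown deduplication type: {mode}')
```

With "content_hash", two images that have different names but identical bytes (and the same extension) end up as a single file.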

0.1.2

  • -l, --process-local-images is deprecated as of version 0.1.2 and has no effect: local images are always processed.
  • Images with an unrecognized MIME type are not downloaded by default (use -E to download them anyway).
  • The new option -P, --prepend-images-with-path changes the image output path structure. When it is enabled, the "remote" image path is reproduced in the local directory structure.
  • The code was significantly refactored.
  • Some automated tests were added.
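
The effect of -P on output paths can be pictured with the sketch below. It assumes that only the URL path (not the host name) is reproduced under the images folder; that detail, and the function name `local_image_path`, are assumptions rather than documented behaviour.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_image_path(images_dir: str, image_url: str, prepend_path: bool) -> str:
    """Compute where a downloaded image would be saved (illustrative only)."""
    remote_path = PurePosixPath(urlparse(image_url).path.lstrip('/'))
    if prepend_path:
        # Keep the remote directory structure, dropping '..' segments
        # so the result cannot escape the images folder.
        parts = [p for p in remote_path.parts if p != '..']
        relative = PurePosixPath(*parts)
    else:
        # Flat layout: only the file name is kept.
        relative = PurePosixPath(remote_path.name)
    return str(PurePosixPath(images_dir) / relative)
```

Keeping the remote structure avoids name clashes between images that share a file name but live in different remote folders.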

0.1.3

  • Mostly technical fixes needed for the GUI tool to work.
  • The tool now has a Qt-based GUI.

Internals

The tool is a pipeline that gets Markdown from the source and processes it using blocks:

  • Source downloads the article.
  • ImageDownloader downloads every image. Image deduplicator blocks may be applied to each image inside it.
  • Transform the article file, i.e. fix image URLs.
  • Format the article to a specific format (Markdown, HTML, PDF, etc.) using the selected formatters.

The ArticleProcessor class is a strategy that applies blocks based on the parameters (from the CLI, for example).
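
The pipeline idea can be sketched in a few lines. None of the names below come from the package: `process_article`, `fix_image_urls`, and the formatter functions are hypothetical stand-ins, and the HTML formatter is a placeholder rather than a real Markdown renderer.

```python
import html
from typing import Callable

# A block takes article text and returns transformed text.
Block = Callable[[str], str]


def fix_image_urls(mapping: dict[str, str]) -> Block:
    """Transform block: replace remote image URLs with local paths."""
    def transform(text: str) -> str:
        for remote, local in mapping.items():
            text = text.replace(remote, local)
        return text
    return transform


def format_markdown(text: str) -> str:
    return text


def format_html(text: str) -> str:
    # Placeholder formatter: wraps escaped text, does not render Markdown.
    return '<pre>' + html.escape(text) + '</pre>'


FORMATTERS: dict[str, Block] = {'md': format_markdown, 'html': format_html}


def process_article(source: Callable[[], str], transforms: list[Block],
                    output_format: str) -> str:
    """Run the pipeline: source, then transforms in order, then one formatter."""
    text = source()
    for block in transforms:
        text = block(text)
    return FORMATTERS[output_format](text)
```

Choosing which blocks to pass in, based on user options, is the role the text assigns to the strategy class.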

