Webcomic downloader

webcomix

Description

webcomix is a webcomic downloader that can also package a comic into a .cbz (Comic Book ZIP) file once it has been downloaded.
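
A .cbz file is an ordinary ZIP archive of image files, so the packaging step can be pictured with the Python standard library alone. The following is a minimal sketch, not webcomix's actual implementation; the helper name make_cbz and the flat folder of images are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def make_cbz(image_dir, cbz_path):
    """Pack every file in image_dir into a .cbz (ZIP) archive.

    Hypothetical helper for illustration; not part of webcomix.
    """
    # Sort by name so comic readers show the pages in a predictable order.
    pages = sorted(p for p in Path(image_dir).iterdir() if p.is_file())
    # Images are already compressed, so storing them as-is is enough.
    with zipfile.ZipFile(str(cbz_path), "w", zipfile.ZIP_STORED) as archive:
        for page in pages:
            archive.write(str(page), arcname=page.name)
    return [page.name for page in pages]
```

Calling make_cbz("xkcd", "xkcd.cbz") on a folder of downloaded pages would produce an archive that comic readers can open; zero-padded file names keep the sorted order equal to the reading order.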

Notice

This program is for personal use only. Please be aware that by making the downloaded comics publicly available without the permission of the author, you may be infringing upon various copyrights.

Installation

Dependencies

  • Python (3.6 or newer)
  • click
  • scrapy (some additional steps may be required to install this package; see the Scrapy installation guide)
  • scrapy-splash
  • scrapy-fake-useragent
  • tqdm

Process

End user

  1. Install Python 3
  2. Install the command line interface tool with pip install webcomix

Developer

  1. Install Python 3
  2. Clone this repository and open a terminal in its directory
  3. Install poetry with pip install poetry
  4. Download the dependencies by running poetry install
  5. Install pre-commit hooks with pre-commit install

Usage

webcomix [OPTIONS] COMMAND [ARGS]

Global Flags

help

Show the help message and exit.

version

Show the version number and exit.

Commands

comics

Shows all predefined comics which can be used with the download command.

download

Downloads a predefined comic. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic.

search

Searches for an XPath that can download the whole comic. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic, -s, which verifies only the provided page of the comic, and -y, which skips the verification prompt.

custom

Downloads a user-defined comic. To download a specific comic, you'll need a link to its first page, an XPath expression that yields the link to the next page, and an XPath expression that yields the link to the image. See the "Making an XPath selector" section for more information. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic, -s, which verifies only the provided page of the comic, and -y, which skips the verification prompt.
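
The three inputs to the custom command describe a simple loop: load a page, extract the image link, then follow the next-page link until there is none. The sketch below illustrates that loop on a hypothetical in-memory site (PAGES) using only the standard library; it is not how webcomix is implemented (webcomix relies on scrapy). Because xml.etree.ElementTree supports only a subset of XPath, the attributes are read in Python instead of with a trailing /@src or /@href.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "website": URL -> well-formed page markup.
PAGES = {
    "/1": "<html><body><div id='comic'><img src='/img/1.png'/></div>"
          "<a rel='next' href='/2'>Next</a></body></html>",
    "/2": "<html><body><div id='comic'><img src='/img/2.png'/></div>"
          "<a rel='next' href='/3'>Next</a></body></html>",
    "/3": "<html><body><div id='comic'><img src='/img/3.png'/></div>"
          "</body></html>",
}


def crawl(start_url, next_page_path, image_path, pages=PAGES):
    """Collect image URLs by following the next-page link until it runs out."""
    images = []
    seen = set()
    url = start_url
    while url is not None and url not in seen:
        seen.add(url)  # guard against a "next" link that loops back
        root = ET.fromstring(pages[url])
        images.extend(img.get("src") for img in root.findall(image_path))
        next_link = root.find(next_page_path)
        url = next_link.get("href") if next_link is not None else None
    return images


comic = crawl("/1", ".//a[@rel='next']", ".//div[@id='comic']//img")
print(comic)  # ['/img/1.png', '/img/2.png', '/img/3.png']
```

The last page has no element matching the next-page expression, which is what ends the loop.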

Examples

  • webcomix download xkcd
  • webcomix search xkcd --start-url=http://xkcd.com/1/
  • webcomix custom --cbz (You will be prompted about other needed arguments)
  • webcomix custom xkcd --start-url=http://xkcd.com/1/ --next-page-xpath="//a[@rel='next']/@href" --image-xpath="//div[@id='comic']//img/@src" --cbz (Same as before, but with all arguments declared beforehand)
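
XPath expressions contain quotes and brackets, so when the command is launched from a script it can be easier to pass the arguments as a list than to get the shell quoting right. The following sketch only builds and prints the command from the last example; the helper name build_custom_command is hypothetical, and the flags are the ones shown above.

```python
import shlex


def build_custom_command(name, start_url, next_page_xpath, image_xpath, cbz=False):
    """Return the argument list for `webcomix custom` (hypothetical helper)."""
    args = [
        "webcomix", "custom", name,
        "--start-url=" + start_url,
        "--next-page-xpath=" + next_page_xpath,
        "--image-xpath=" + image_xpath,
    ]
    if cbz:
        args.append("--cbz")
    return args


args = build_custom_command(
    "xkcd",
    "http://xkcd.com/1/",
    "//a[@rel='next']/@href",
    "//div[@id='comic']//img/@src",
    cbz=True,
)
# Print a shell-safe rendering; shlex.quote escapes the quotes in the XPaths.
print(" ".join(shlex.quote(arg) for arg in args))
# To actually run it (requires webcomix to be installed):
# import subprocess; subprocess.run(args, check=True)
```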

Making an XPath selector

Using an HTML inspector, find an XPath to the next link's href attribute and another to the comic image's src attribute.

e.g.: //div[@class='foo']/img/@src selects the src attribute of every img element that is a direct child of a div whose class attribute is exactly foo.
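
The difference between a single slash (direct child) and a double slash (any descendant) is easy to miss. Here is a small sketch on a made-up page, using the standard library's ElementTree as a stand-in for scrapy's selectors; since ElementTree does not support the trailing /@src step, the attribute is read in Python.

```python
import xml.etree.ElementTree as ET

# Made-up page for illustration only.
PAGE = """
<html><body>
  <div class="foo"><img src="/comics/page1.png"/></div>
  <div class="bar"><img src="/ads/banner.png"/></div>
  <div class="foo"><p><img src="/comics/nested.png"/></p></div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# Counterpart of //div[@class='foo']/img/@src: direct children only.
direct = [img.get("src") for img in root.findall(".//div[@class='foo']/img")]

# Counterpart of //div[@class='foo']//img/@src: images at any depth.
nested = [img.get("src") for img in root.findall(".//div[@class='foo']//img")]

print(direct)  # ['/comics/page1.png']
print(nested)  # ['/comics/page1.png', '/comics/nested.png']
```

If the comic image is wrapped in extra elements, the double-slash form is usually the one you want.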

Note: webcomix works best on static websites, since scrapy (the framework we use to crawl web pages) doesn't execute JavaScript.

To make sure your XPath is correct, test it in the scrapy shell, which is installed along with webcomix as part of scrapy.

scrapy shell <website> --> Use the website's URL to load the page in the shell.
> response.body --> Gives you the HTML of the page.
> response.xpath("<XPath expression>") --> Tests an XPath selection. If you get [], your XPath expression matched nothing on the page.

Downloading comics on Javascript-heavy websites

If the webcomic's website uses JavaScript to render its images, you won't be able to download it using the default configuration. webcomix has an optional -j flag on both the custom and search commands that executes the JavaScript using scrapy-splash. To use it, you'll need to have Docker installed and to run the following command before trying to download the comic:

docker run -p 8050:8050 scrapinghub/splash
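
Before starting a download with -j, it can help to confirm that the container is accepting connections. This is a minimal sketch that only checks whether a TCP port is open; the helper name is_listening and the host localhost are assumptions, and 8050 is the port published by the command above.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: nothing usable is listening.
        return False


if __name__ == "__main__":
    print("Splash reachable:", is_listening("localhost", 8050))
```

An open port does not prove that Splash itself is healthy, only that something is listening there.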

Contribution

The procedure depends on the type of contribution:

  • If you simply want to request the addition of a comic to the list of supported comics, make an issue with the label "Enhancement".
  • If you want to request the addition of a feature to the system or a bug fix, make an issue with the appropriate label.

Running the tests

To run the tests, run the pytest command from the webcomix folder.
