Webcomic downloader
webcomix
Description
webcomix is a webcomic downloader that can additionally create a .cbz (Comic Book ZIP) file once downloaded.
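A .cbz file is an ordinary ZIP archive containing the comic's page images. The sketch below shows the general idea with Python's standard library. It is not webcomix's own implementation, and the function name make_cbz and its parameters are made up for illustration.

```python
import zipfile
from pathlib import Path
from typing import List


def make_cbz(image_dir: str, cbz_path: str) -> List[str]:
    """Pack every file in image_dir into a ZIP archive at cbz_path.

    Hypothetical helper for illustration; returns the archived file names.
    """
    # Sort by name so comic readers show the pages in a predictable order.
    pages = sorted(p for p in Path(image_dir).iterdir() if p.is_file())
    # Images are already compressed, so the files are stored as-is.
    with zipfile.ZipFile(cbz_path, "w", zipfile.ZIP_STORED) as archive:
        for page in pages:
            # arcname keeps only the file name, not the directory path.
            archive.write(page, arcname=page.name)
    return [page.name for page in pages]
```

Write the archive outside image_dir, otherwise a second run would pack the previous archive into the new one.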
Notice
This program is for personal use only. Please be aware that by making the downloaded comics publicly available without the permission of the author, you may be infringing upon various copyrights.
Installation
Dependencies
- Python (3.8 or newer)
- click
- scrapy (Some additional steps might be required to include this package and can be found here)
- scrapy-splash
- scrapy-fake-useragent
- tqdm
- Docker (required to download JavaScript-dependent websites with the -j option)
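To check an environment against the list above, something like the following sketch can help. The import names (for example scrapy_splash for scrapy-splash) are assumptions based on the usual package-to-module naming, and the helper functions are hypothetical.

```python
import sys
from importlib.util import find_spec
from typing import Iterable, List

# Assumed import names for the Python packages listed above.
REQUIRED_MODULES = [
    "click",
    "scrapy",
    "scrapy_splash",
    "scrapy_fake_useragent",
    "tqdm",
]


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

Docker is a separate program rather than a Python package, so this check does not cover it.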
Process
End user
- Install Python 3
- Install the command line interface tool with
pip install webcomix
Developer
- Install Python 3
- Clone this repository and open a terminal in its directory
- Install poetry with
pip install poetry
- Download the dependencies by running
poetry install
- Install pre-commit hooks with
pre-commit install
Usage
webcomix [OPTIONS] COMMAND [ARGS]
Global Flags
help
Show the help message and exit.
version
Show the version number and exit.
Commands
comics
Shows all predefined comics that can be used with the download command.
download
Downloads a predefined comic. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic.
search
Searches for XPath expressions that can be used to download the whole comic. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic; -s, which verifies only the provided page of the comic; -y, which skips the verification prompt; and -j, which runs the JavaScript on pages before downloading.
custom
Downloads a user-defined comic. To download a specific comic, you need three things: a link to the first page, an XPath expression that yields the link to the next page, and an XPath expression that yields the link to the image. See Making an XPath selector below for details. Supports the --cbz flag, which creates a .cbz archive of the downloaded comic; -s, which verifies only the provided page of the comic; and -y, which skips the verification prompt.
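To see why a start page, a next-page expression, and an image expression are enough to describe a whole comic, here is a minimal sketch of the traversal. It is not how webcomix is implemented (webcomix uses scrapy). It walks a hypothetical in-memory site with the standard library's ElementTree, which supports only a subset of XPath, so the attributes are read in Python rather than with a trailing /@href or /@src step.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Hypothetical two-page "website": URL -> well-formed XHTML.
PAGES: Dict[str, str] = {
    "/1/": "<html><body><div id='comic'><img src='/img/1.png'/></div>"
           "<a rel='next' href='/2/'>Next</a></body></html>",
    "/2/": "<html><body><div id='comic'><img src='/img/2.png'/></div>"
           "</body></html>",
}


def walk_comic(pages: Dict[str, str], start_url: str,
               next_path: str, image_path: str) -> List[str]:
    """Follow next-page links from start_url and collect image URLs."""
    images: List[str] = []
    seen = set()
    url: Optional[str] = start_url
    # Stop on a missing page, a missing next link, or a link loop.
    while url is not None and url in pages and url not in seen:
        seen.add(url)
        root = ET.fromstring(pages[url])
        image = root.find(image_path)
        if image is not None:
            images.append(image.get("src"))
        link = root.find(next_path)
        url = link.get("href") if link is not None else None
    return images


if __name__ == "__main__":
    print(walk_comic(PAGES, "/1/", ".//a[@rel='next']", ".//div[@id='comic']//img"))
```

The seen set guards against comics whose last page links back to itself, which would otherwise loop forever.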
Examples
webcomix download xkcd
webcomix search xkcd --start-url=http://xkcd.com/1/
webcomix custom --cbz
(You will be prompted for the other required arguments)
webcomix custom xkcd --start-url=http://xkcd.com/1/ --next-page-xpath="//a[@rel='next']/@href" --image-xpath="//div[@id='comic']//img/@src" --cbz
(Same as before, but with all arguments declared beforehand)
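When calling webcomix from a script, passing the arguments as a list avoids shell-quoting problems with the quotes and brackets in XPath expressions. The sketch below only builds the argument list, using the flags shown in the examples above. The helper build_custom_command is hypothetical and not part of webcomix.

```python
import shlex
from typing import List


def build_custom_command(name: str, start_url: str, next_page_xpath: str,
                         image_xpath: str, cbz: bool = False) -> List[str]:
    """Build the argument list for a `webcomix custom` invocation."""
    argv = [
        "webcomix", "custom", name,
        f"--start-url={start_url}",
        f"--next-page-xpath={next_page_xpath}",
        f"--image-xpath={image_xpath}",
    ]
    if cbz:
        argv.append("--cbz")
    return argv


if __name__ == "__main__":
    argv = build_custom_command(
        "xkcd",
        "http://xkcd.com/1/",
        "//a[@rel='next']/@href",
        "//div[@id='comic']//img/@src",
        cbz=True,
    )
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(argv))
    # To actually run it: subprocess.run(argv, check=True)
```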
Making an XPath selector
Using an HTML inspector, find a path to the href attribute of the next-page link and a path to the src attribute of the comic image.
e.g.: //div[@class='foo']/img/@src
This selects the src attribute of each img element that is a direct child of a div whose class attribute is exactly foo.
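In the example expression, the class test applies to the div, not to the img. The following sketch shows that on a made-up HTML fragment. It uses the standard library's ElementTree purely for illustration; ElementTree supports only a subset of XPath, so the final /@src step is replaced by reading the attribute in Python.

```python
import xml.etree.ElementTree as ET
from typing import List

# Made-up, well-formed fragment: only the first image sits in div.foo.
HTML = """
<html><body>
  <div class="foo"><img src="/comics/page1.png"/></div>
  <div class="bar"><img class="foo" src="/ads/banner.png"/></div>
</body></html>
"""


def select_src(document: str) -> List[str]:
    """Emulate //div[@class='foo']/img/@src on a well-formed document."""
    root = ET.fromstring(document.strip())
    # Match img elements that are direct children of <div class="foo">.
    images = root.findall(".//div[@class='foo']/img")
    return [img.get("src") for img in images]


if __name__ == "__main__":
    print(select_src(HTML))
```

The second image carries class foo itself but is not matched, because its parent div has class bar.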
Note: webcomix works best on static websites, since scrapy (the framework we use to crawl web pages) doesn't process JavaScript.
To make sure your XPath is correct, test it in scrapy shell, which is installed along with webcomix.
scrapy shell <website> --> Opens the shell on the page at the given URL.
> response.body --> Returns the HTML of the page.
> response.xpath("<your XPath>") --> Tests an XPath selection. If you get [], your XPath expression matched nothing on the page.
Contribution
The procedure depends on the type of contribution:
- If you simply want to request the addition of a comic to the list of supported comics, open an issue with the label "Enhancement".
- If you want to request a new feature or a bug fix, open an issue with the appropriate label.
Running the tests
To run the tests, run the pytest command in the webcomix folder.
File details
Details for the file webcomix-3.10.1.tar.gz.
File metadata
- Download URL: webcomix-3.10.1.tar.gz
- Upload date:
- Size: 332.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | c372b7971e2014ecf1d8baca047296072a28d69224f5cdd48aa44847b3523fee
MD5 | 5caefb2b4d22f4c8106ac03a14861f79
BLAKE2b-256 | 28c99ef52cba6b2ee07fa73cf0f0f8f65c5d51ed190761da6a1bea310fbe94f1
File details
Details for the file webcomix-3.10.1-py3-none-any.whl.
File metadata
- Download URL: webcomix-3.10.1-py3-none-any.whl
- Upload date:
- Size: 343.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | c84b2a64b7300bfceee743e6a48c63b95b907be234e2aceb8ff2c7b02b5f21ce
MD5 | 3f2edb441a36628402b3f10938b20f17
BLAKE2b-256 | 5bdfa3624e169c7d67788a2e42da979cedbfa550a30a7cba39031a374998429d