Make ZIM file from Gutenberg books
Project description
A scraper that downloads the whole repository of [Project Gutenberg](http://www.gutenberg.org), puts it into a locally browsable directory, and then packs it into a [ZIM file](http://www.openzim.org), a clean and user-friendly format for storing content for offline usage.
Dependencies
Ubuntu/Debian
sudo apt-get install python-pip python-dev libxml2-dev libxslt-dev advancecomp jpegoptim pngquant p7zip-full gifsicle
macOS
brew install advancecomp jpegoptim pngquant p7zip gifsicle
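Before a long run, it can help to confirm that the external tools are reachable. The sketch below is not part of gutenberg2zim (which has its own `--check` step). It is a hypothetical helper, and the executable names are assumptions about what the packages above install (for example, `advdef` from advancecomp and `7z` from p7zip), so adjust them to your system.

```python
import shutil

# Assumed executable names for the packages listed above (not taken from
# the gutenberg2zim source); adjust them if your system differs.
ASSUMED_BINARIES = ("advdef", "jpegoptim", "pngquant", "7z", "gifsicle")


def missing_tools(names=ASSUMED_BINARIES):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All assumed tools were found on PATH.")
```

An empty result only means the names resolve on PATH. It does not verify versions.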
Usage
gutenberg2zim
By default (no argument), it runs all the steps: download, parse, export and zim.
-h --help Display this help message
-y --wipe-db Wipe the DB during parse stage
-F --force Redo step even if target already exists
-l --languages=<list> Comma-separated list of lang codes to filter export to (preferably ISO 639-1, else ISO 639-3)
-f --formats=<list> Comma-separated list of formats to filter export to (epub, html, pdf, all)
-m --mirror=<url> Use URL as base for all downloads.
-r --rdf-folder=<folder> Don't download rdf-files.tar.bz2 and use extracted folder instead
-e --static-folder=<folder> Folder to use for (or write) the static HTML
-z --zim-file=<file> Write ZIM into this file path
-t --zim-title=<title> Set ZIM title
-n --zim-desc=<description> Set ZIM description
-d --dl-folder=<folder> Folder to use for (or write) downloaded ebooks
-u --rdf-url=<url> Alternative rdf-files.tar.bz2 URL
-b --books=<ids> Execute the processes for specific books, separated by commas, or dashes for intervals
-c --concurrency=<nb> Number of concurrent processes for download and parsing tasks
-x --zim-title=<title> Custom title for the ZIM file
-q --zim-desc=<desc> Custom description for the ZIM file
--check Check dependencies
--prepare Download & extract rdf-files.tar.bz2
--parse Parse all RDF files and fill-up the DB
--download Download ebooks based on filters
--export Export downloaded content to zim-friendly static HTML
--dev Exports *just* Home+JS+CSS files (overwritten by --zim step)
--zim Create a ZIM file
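The `--books` option accepts ids separated by commas, with dashes for intervals. As an illustration of that syntax only, here is a minimal sketch of how such a value could be expanded. `parse_book_ids` is a hypothetical helper, not the tool's actual parser, and it assumes intervals include both endpoints.

```python
def parse_book_ids(spec: str) -> list[int]:
    """Expand a spec such as "1,5,10-12" into a sorted list of ids.

    Assumption: an interval "a-b" includes both a and b.
    """
    ids: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                start, end = end, start  # accept reversed intervals
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


print(parse_book_ids("1,5,10-12"))  # [1, 5, 10, 11, 12]
```

Under the same assumption, a call such as `gutenberg2zim --books=1,5,10-12` would target those five books.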
Download files
Source Distribution
gutenberg2zim-2.1.1.tar.gz (1.5 MB)
Built Distribution
gutenberg2zim-2.1.1-py3-none-any.whl (1.4 MB)
File details
Details for the file gutenberg2zim-2.1.1.tar.gz.
File metadata
- Download URL: gutenberg2zim-2.1.1.tar.gz
- Upload date:
- Size: 1.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | ca8402a81c905622217199001ba587a39768980909203dee93c80b23752135c8
MD5 | 6d17a56353adad5e6b47c225e3095a21
BLAKE2b-256 | 5661b6df994e6b90c8f6daa815c6475c839f0137639cf2c4179cfd3403f342c9
File details
Details for the file gutenberg2zim-2.1.1-py3-none-any.whl.
File metadata
- Download URL: gutenberg2zim-2.1.1-py3-none-any.whl
- Upload date:
- Size: 1.4 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7862c263521aff21f55bb74429b225c368ccbf06f204421f6f7e61e0c1acf63e
MD5 | a56aedc86729cf133d3ab27c699879eb
BLAKE2b-256 | c4cee3071df62f4b2676a8d67c70229c4f5225e7ec980bc1099a37d812e96cfc