Convert ZIM files to EPUB format

ZIM to EPUB Converter

A Python command-line tool to convert ZIM files (used by Kiwix and others for offline content) to EPUB format for e-readers.

Features

  • Convert ZIM files to EPUB format with robust error handling
  • Option to include or exclude images
  • Automatic table of contents generation based on article names
  • Limit the number of articles to include
  • Preserves metadata from the ZIM file
  • Clean, readable formatting for e-readers
  • Handles URL-encoded paths and special characters
  • Supports various ZIM file structures and formats
  • Extracts content from the main entry when standard article paths aren't available
  • Avoids duplicate images in the output EPUB
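
As an illustration of the URL-encoded path handling listed above, here is a minimal sketch that uses only the standard library. The function names `normalize_entry_path` and `candidate_paths` are hypothetical and are not taken from this project; the tool's actual lookup logic may differ.

```python
from typing import List
from urllib.parse import unquote


def normalize_entry_path(path: str) -> str:
    """Decode percent-escapes and drop any leading slash from an entry path."""
    decoded = unquote(path)  # e.g. "Caf%C3%A9" becomes "Café"
    return decoded.lstrip("/")


def candidate_paths(path: str) -> List[str]:
    """Return distinct lookup candidates: the raw path first, then the decoded one."""
    candidates: List[str] = []
    for candidate in (path, normalize_entry_path(path)):
        if candidate not in candidates:
            candidates.append(candidate)
    return candidates
```

Trying the raw path before the decoded one means an entry whose stored path really contains a literal `%20` is still found.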

Platform Support

This package is compatible with:

  • Linux (Debian, Ubuntu, Fedora, etc.)
  • macOS

Note: Windows is not currently supported due to limitations with the libzim library.

Recent Updates

  • Improved URL handling: Added support for URL-encoded paths and special characters
  • Enhanced image processing: Fixed issues with duplicate images and improved mimetype detection
  • Better article extraction: Added multiple methods to extract articles from different ZIM file structures
  • Robust error handling: Added comprehensive error handling and fallback mechanisms
  • Detailed logging: Added verbose logging to help diagnose issues
  • CI/CD Pipeline: Added GitHub Actions for automated testing and releases
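
One common way to avoid duplicate images and to pick a mimetype is sketched below. It is an assumption about how such a feature could work, not a copy of this project's code: `ImageRegistry` and `guess_image_mimetype` are hypothetical names, and deduplication here is keyed on a SHA-256 digest of the image bytes.

```python
import hashlib
import mimetypes
from typing import Dict, Tuple


class ImageRegistry:
    """Track images by content hash so each distinct image is stored once."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, str] = {}

    def register(self, path: str, data: bytes) -> Tuple[str, bool]:
        """Return (stored_path, is_new) for the given image bytes."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._by_digest:
            # Same bytes seen before: reuse the path stored the first time.
            return self._by_digest[digest], False
        self._by_digest[digest] = path
        return path, True


def guess_image_mimetype(path: str, default: str = "application/octet-stream") -> str:
    """Guess a mimetype from the file extension, falling back to a default."""
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default
```

Hashing the content rather than comparing paths also catches the same image stored under two different names.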

Installation

Prerequisites

  • Python 3.6 or higher
  • C++ libzim library (required for the Python bindings)
  • Linux or macOS operating system

Installing C++ libzim

macOS

brew install libzim

Debian/Ubuntu

apt-get install libzim-dev

Fedora

dnf install libzim-devel

Installing the Python package

  1. Clone this repository:

    git clone https://github.com/yourusername/pyzim2epub.git
    cd pyzim2epub
    
  2. Install the required dependencies:

    USE_SYSTEM_LIBZIM=1 pip install -r requirements.txt
    
  3. (Optional) Install the package in development mode:

    pip install -e .
    

Installing from PyPI

You can also install the package directly from PyPI:

pip install zim2epub

Usage

Basic usage

python zim2epub.py path/to/your/file.zim

This creates an EPUB file in the current directory with the same base name as the input file and an .epub extension.
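
The default naming described above can be sketched with `pathlib`. This assumes the output goes to the current directory, as the sentence above says; `default_output_path` is a hypothetical helper, not a function exported by the package.

```python
from pathlib import Path


def default_output_path(zim_path: str) -> Path:
    """Keep the input's base name, swap its extension for .epub, drop its directory."""
    return Path(Path(zim_path).name).with_suffix(".epub")
```

The result is a relative path, so it resolves against whatever directory the command is run from.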

Command-line Options

usage: zim2epub.py [-h] [-o OUTPUT] [--no-images] [--no-toc] [--max-articles MAX_ARTICLES] [-v] zim_file

Convert ZIM files to EPUB format

positional arguments:
  zim_file              Path to the ZIM file to convert

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Path for the output EPUB file (default: same as input with .epub extension)
  --no-images           Do not include images in the EPUB (default: False)
  --no-toc              Do not generate a table of contents (default: False)
  --max-articles MAX_ARTICLES
                        Maximum number of articles to include (default: None)
  -v, --verbose         Show verbose output (default: False)
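
For reference, the options above could be declared with `argparse` roughly as follows. This is a reconstruction from the help text only, not the project's actual source; `build_parser` is a hypothetical name, and the use of `ArgumentDefaultsHelpFormatter` is a guess based on the "(default: ...)" suffixes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(
        prog="zim2epub.py",
        description="Convert ZIM files to EPUB format",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("zim_file", help="Path to the ZIM file to convert")
    parser.add_argument("-o", "--output", help="Path for the output EPUB file")
    parser.add_argument("--no-images", action="store_true",
                        help="Do not include images in the EPUB")
    parser.add_argument("--no-toc", action="store_true",
                        help="Do not generate a table of contents")
    parser.add_argument("--max-articles", type=int, default=None,
                        help="Maximum number of articles to include")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Show verbose output")
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["wikipedia.zim", "--no-images", "--max-articles", "100"])
```

With that sample input, `args.no_images` is `True`, `args.max_articles` is the integer `100`, and the flags that were not passed keep their defaults.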

Examples

Convert a ZIM file without images (useful for reducing the output file size):

python zim2epub.py wikipedia.zim --no-images

Convert a ZIM file with a custom output path:

python zim2epub.py wikipedia.zim -o my-wikipedia.epub

Convert only the first 100 articles of a ZIM file:

python zim2epub.py wikipedia.zim --max-articles 100

Enable verbose output for debugging:

python zim2epub.py wikipedia.zim -v

Using as a Library

You can also use the ZimToEpub class directly in your Python code:

from zim2epub import ZimToEpub

converter = ZimToEpub(
    zim_path="path/to/file.zim",
    output_path="output.epub",
    include_images=True,
    generate_toc=True,
    max_articles=None,
    verbose=True
)

output_path = converter.convert()
print(f"EPUB created at: {output_path}")
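
Building on the constructor arguments shown above, the following sketch converts several files and records failures instead of stopping at the first one. `convert_many` is a hypothetical helper, not part of the package. It takes the converter class as a parameter (pass `ZimToEpub` in real use) and assumes only the `zim_path` and `output_path` keyword arguments and the `convert()` method shown above.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, List, Tuple


def convert_many(zim_paths: Iterable[str], converter_cls: Any,
                 **options: Any) -> Tuple[List[str], Dict[str, str]]:
    """Convert each ZIM file; return (output paths, failures by input path)."""
    outputs: List[str] = []
    failures: Dict[str, str] = {}
    for zim_path in zim_paths:
        # Write each EPUB next to its input, with the extension swapped.
        output_path = str(Path(zim_path).with_suffix(".epub"))
        try:
            converter = converter_cls(
                zim_path=str(zim_path), output_path=output_path, **options
            )
            outputs.append(converter.convert())
        except Exception as exc:  # record the error and continue with the next file
            failures[str(zim_path)] = str(exc)
    return outputs, failures
```

A call might look like `convert_many(["a.zim", "b.zim"], ZimToEpub, include_images=False)`, after which the failures dictionary shows which inputs need a closer look.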

Development

Running Tests

pytest

Building the Package

python -m build

Creating a Release

  1. Update the version in setup.py
  2. Create a new tag:
    git tag -a v0.1.0 -m "Release v0.1.0"
    
  3. Push the tag:
    git push origin v0.1.0
    

The GitHub Actions workflow will automatically build and publish the release to PyPI.

Troubleshooting

If you encounter issues:

  1. Try running with the -v flag to see detailed logs
  2. Make sure you have the C++ libzim library installed
  3. Check that your ZIM file is valid and not corrupted
  4. For image issues, try using the --no-images flag
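
For step 3, a quick first check is to confirm that the file starts with the ZIM magic number (72173914, stored little-endian). The sketch below only inspects the first four bytes, so it catches a wrong or truncated file but does not prove the archive is intact. `looks_like_zim` is a hypothetical helper, not part of this package.

```python
import struct

ZIM_MAGIC = 72173914  # 0x044D495A, which appears on disk as b"ZIM\x04"


def looks_like_zim(path: str) -> bool:
    """Return True if the file begins with the ZIM magic number."""
    with open(path, "rb") as handle:
        header = handle.read(4)
    if len(header) < 4:
        return False  # too short to hold a ZIM header
    (magic,) = struct.unpack("<I", header)
    return magic == ZIM_MAGIC
```

If this returns False, the download is probably incomplete or the file is not a ZIM archive at all.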

Requirements

  • Python 3.6 or higher
  • libzim (Python bindings for the ZIM file format)
  • EbookLib (for EPUB creation)
  • BeautifulSoup4 (for HTML parsing)
  • tqdm (for progress bars)
  • lxml (for XML processing)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • The OpenZIM project for the libzim library
  • EbookLib for EPUB creation functionality
