A Python library for reading and writing EPUB files.

PyPubLib

A Python library for ePub files

Python 3.10 EPUB License: MIT Code Style: Black CI codecov

This project provides tools and utilities for generating and manipulating EPUB files using Python. It includes functions to create essential EPUB components such as nav.xhtml, toc.ncx, and the manifest, making it easier to build valid EPUB 2 and EPUB 3 ebooks programmatically.

What is EPUB?

EPUB (Electronic Publication) is a widely used open standard for e-books, maintained by the W3C. EPUB files are essentially ZIP archives containing XHTML content, images, stylesheets, and metadata. The format supports reflowable content, making it suitable for various screen sizes and devices.

The structure of an EPUB file

The structure of an EPUB is rather simple: it is a ZIP archive with a specific directory structure and a set of required files.

The content of a book is stored in XHTML files, with CSS styles and images stored in separate directories.
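The container layout can be sketched with nothing but Python's standard zipfile module. This is an illustration only, not how pypublib itself writes files: the function names, the OEBPS directory name, and the placeholder content.opf body are assumptions made for the example. The EPUB container rules do require the mimetype entry to come first and to be stored uncompressed, and META-INF/container.xml to point at the OPF file.

```python
import io
import zipfile

# container.xml tells a reading system where to find the OPF file.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub_skeleton() -> bytes:
    """Return the bytes of a minimal, content-free EPUB container (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must be the first entry and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=zipfile.ZIP_DEFLATED)
        # Placeholder only; a real OPF file carries metadata, manifest and spine.
        archive.writestr("OEBPS/content.opf", "<package/>", compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_entries(epub_bytes: bytes) -> list[tuple[str, bool]]:
    """Return (name, is_stored_uncompressed) for every entry, in archive order."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return [(info.filename, info.compress_type == zipfile.ZIP_STORED) for info in archive.infolist()]


if __name__ == "__main__":
    for name, stored in list_entries(build_epub_skeleton()):
        print(name, "(stored)" if stored else "(deflated)")
```

The same list_entries helper can be pointed at the bytes of any existing EPUB file to inspect its layout.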

The single most important file in an EPUB is the OPF file (Open Packaging Format), which describes the structure of the book, including:

  • the metadata: title, author, language, publisher, etc.
  • the manifest: the list of included files
  • the spine: defines the reading order of the book
  • an optional guide: defines references to key parts of the book
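To make the three main parts of the OPF file concrete, here is a small sketch that reads the metadata, manifest, and spine from a hand-written sample using the standard library's ElementTree. It does not use pypublib's OPF class; the sample document and the summarize_opf helper are invented for illustration, while the two namespace URIs are the standard OPF and Dublin Core namespaces.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample; file names and IDs are made up for this example.
SAMPLE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>My First EPUB</dc:title>
    <dc:creator>John Doe</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="Chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="styles.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""


def summarize_opf(opf_xml: str) -> dict:
    """Extract the title, the manifest (id -> href) and the reading order (hypothetical helper)."""
    root = ET.fromstring(opf_xml.encode("utf-8"))
    title = root.findtext("opf:metadata/dc:title", namespaces=NS)
    # The manifest lists every file in the book, keyed by its id.
    manifest = {item.get("id"): item.get("href") for item in root.findall("opf:manifest/opf:item", NS)}
    # The spine refers to manifest ids; resolving them yields the reading order.
    spine = [manifest[ref.get("idref")] for ref in root.findall("opf:spine/opf:itemref", NS)]
    return {"title": title, "manifest": manifest, "spine": spine}


if __name__ == "__main__":
    print(summarize_opf(SAMPLE_OPF))
```

Note that the spine holds only references: a file must appear in the manifest before it can be placed in the reading order.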

Summarizing, an EPUB file contains:

  • OPF (Open Packaging Format): Describes the structure and resources of the book.
  • XHTML files: The actual content of the book.
  • Images and stylesheets: For media and formatting.
  • nav.xhtml: Used in EPUB 3 for navigation.
  • NCX (Navigation Center eXtended): Used in EPUB 2 for the table of contents.
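For a sense of what the EPUB 3 navigation file looks like, the following sketch builds a minimal nav.xhtml from a list of (title, href) pairs and reads the links back. It is not pypublib's generator; build_nav_xhtml and extract_links are hypothetical names, and the output covers only a flat table of contents.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"


def build_nav_xhtml(entries: list[tuple[str, str]]) -> str:
    """Build a minimal EPUB 3 navigation document from (title, href) pairs."""
    # Escape titles and quote hrefs so the result stays well-formed XML.
    items = "\n".join(
        f"        <li><a href={quoteattr(href)}>{escape(title)}</a></li>"
        for title, href in entries
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<html xmlns="{XHTML_NS}" xmlns:epub="{EPUB_NS}">\n'
        "  <head><title>Table of Contents</title></head>\n"
        "  <body>\n"
        '    <nav epub:type="toc">\n'
        "      <ol>\n"
        f"{items}\n"
        "      </ol>\n"
        "    </nav>\n"
        "  </body>\n"
        "</html>\n"
    )


def extract_links(nav_xhtml: str) -> list[tuple[str, str]]:
    """Return the (title, href) pairs found inside the toc nav element."""
    root = ET.fromstring(nav_xhtml.encode("utf-8"))
    links = []
    for nav in root.iter(f"{{{XHTML_NS}}}nav"):
        # Only the nav element marked epub:type="toc" is the table of contents.
        if nav.get(f"{{{EPUB_NS}}}type") == "toc":
            for anchor in nav.iter(f"{{{XHTML_NS}}}a"):
                links.append((anchor.text, anchor.get("href")))
    return links


if __name__ == "__main__":
    print(build_nav_xhtml([("Chapter 1", "Chapter1.xhtml")]))
```

An NCX file for EPUB 2 carries the same information in a different XML vocabulary, which is why a book that targets both versions ships both files.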

Features of pypublib

As mentioned before, the structure of an EPUB book is rather simple, and there are already several Python libraries for creating EPUB files. There are also GUI tools for the same purpose, notably

  • Sigil: A great tool for visually organizing and editing single EPUB files
  • Calibre: A powerful eBook management tool that can convert various formats to EPUB and vice versa

pypublib aims to provide a simple and easy-to-use interface for creating and manipulating EPUB files programmatically. It focuses on generating the essential components of an EPUB file, such as content.opf, nav.xhtml, and toc.ncx, while allowing for easy integration with existing Python projects. EPUB books can be created from scratch or imported for modification.

Key features include:

  • Create and manipulate EPUB files programmatically.
  • Import existing EPUB files for modification.
  • Parse and generate content.opf files.
  • Generate nav.xhtml for EPUB 3 navigation.
  • Generate toc.ncx for EPUB 2 table of contents.
  • Create and manage the manifest and spine in the OPF file.
  • Add metadata to the EPUB file.
  • Easy integration with existing Python projects.

The API

The library provides a simple API for creating and manipulating EPUB files.

Classes

The main classes are:

  • Book: Represents an EPUB book, with methods to add chapters, images, and metadata.
  • Chapter: Represents a chapter in the book, with methods to set the title and content.
  • OPF: Represents the OPF file, with methods to add metadata, manifest items, and spine items. Used to import an EPUB book by parsing the OPF file.

The Book and Chapter classes are the containers used to create and manipulate EPUB content.

Functions

The most important functions are:

  • read_epub: Imports an existing EPUB file and returns a Book object with all chapters, styles and images.
  • publish_epub: Publishes a generated or modified Book object as an EPUB file.
  • validate_epub: Validates the structure and contents of an EPUB file.

Usage

To create a new EPUB file:

  1. Create a Book object.
  2. Create Chapter objects and add them to the book. The order of chapters defines the reading order.
  3. Set HTML content and titles for each chapter. The content is simply the body of the XHTML file.
  4. Add stylesheets and images to the chapters and the book as desired.
  5. Set metadata for the book.
  6. Finally, call publish_epub to generate the EPUB file.

Example - Create a simple EPUB file

from pypublib.book import Book, Chapter
from pypublib.epub import publish_book, validate_book

# Create an empty book and a first chapter; the content is the XHTML body.
book = Book()
chapter1 = Chapter.from_content(href="Chapter1.xhtml", title="Chapter 1", content="<h1>Hello world!</h1>",
                                styles="styles.css")
chapter1.content += "<p>This is the first chapter of the book.</p>"
book.add_chapter(chapter1)

# Add the stylesheet that the chapter refers to.
book.add_style("styles.css", "body { font-family: Arial, sans-serif; }")

# Set the metadata.
book.title = "My First EPUB"
book.author = "John Doe"

# Validate the book, then write it to disk.
validate_book(book)
publish_book(book, "book.epub")

Further Examples

The examples directory contains scripts that demonstrate how to use the library to create and manipulate EPUB files.

Requirements

pypublib depends only on lxml, which it uses to parse and generate XML/HTML files. It supports Python 3.10 and higher.

  • Python 3.10 or higher
  • lxml~=5.2.2
