A Python library for reading and writing EPUB files.
PyPubLib
A Python library for ePub files
This project provides tools and utilities for generating and manipulating EPUB files using Python.
It includes functions to create essential EPUB components such as nav.xhtml, toc.ncx, and the manifest,
making it easier to build valid EPUB 2 and EPUB 3 ebooks programmatically.
What is EPUB?
EPUB (Electronic Publication) is a widely used open standard for e-books, maintained by the W3C. EPUB files are essentially ZIP archives containing XHTML content, images, stylesheets, and metadata. The format supports reflowable content, making it suitable for various screen sizes and devices.
The structure of an EPUB file
The structure of an EPUB is rather simple: An EPUB file is a ZIP archive with a specific directory structure and required files.
The content of a book is stored in XHTML files, with CSS styles and images stored in separate directories.
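To make the ZIP layout concrete, the following sketch builds a tiny EPUB-like archive in memory with the standard library and lists its entries. It does not use pypublib. The `OEBPS/...` paths and the helper names `build_skeleton` and `list_entries` are assumptions chosen for illustration, and the result is a skeleton rather than a complete, valid book.

```python
import io
import zipfile

# Minimal container.xml. The OPF location "OEBPS/content.opf" is an assumed,
# conventional path; an EPUB may store its OPF file elsewhere.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_skeleton() -> bytes:
    """Return the bytes of a tiny EPUB-like ZIP archive (a skeleton, not a valid book)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry is written first and stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # container.xml tells the reading system where the OPF file lives.
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        # Placeholder entries standing in for the real OPF, content, and styles.
        archive.writestr("OEBPS/content.opf", "<package/>")
        archive.writestr("OEBPS/Text/Chapter1.xhtml",
                         "<html><body><h1>Hello</h1></body></html>")
        archive.writestr("OEBPS/Styles/styles.css", "body { margin: 1em; }")
    return buffer.getvalue()


def list_entries(data: bytes) -> list[str]:
    """Return the entry names of a ZIP archive in the order they were written."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


for name in list_entries(build_skeleton()):
    print(name)
```

The same `list_entries` helper can be pointed at the bytes of any real `.epub` file to inspect its directory structure.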
The single most important file in an EPUB is the OPF file (Open Packaging Format), which describes the structure of the book, including:
- the metadata: title, author, language, publisher, etc.
- the manifest: the list of included files
- the spine: defines the reading order of the book
- an optional guide: defines references to key parts of the book
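As an illustration of how metadata, manifest, and spine relate, the sketch below reads a small hand-written OPF document with the standard library's `xml.etree.ElementTree`. It is independent of pypublib's own `OPF` class. The sample document and the helper name `summarize_opf` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OPF package documents and Dublin Core metadata.
NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A hand-written sample OPF document, for illustration only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>My First EPUB</dc:title>
    <dc:creator>John Doe</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="Chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="styles.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""


def summarize_opf(opf_xml: str) -> dict:
    """Extract the title, the manifest (id -> href), and the reading order."""
    root = ET.fromstring(opf_xml)
    title = root.findtext("opf:metadata/dc:title", namespaces=NS)
    # The manifest lists every file in the book, keyed by its id.
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }
    # The spine refers to manifest ids; resolving them gives the reading order.
    spine = [
        manifest[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", NS)
    ]
    return {"title": title, "manifest": manifest, "spine": spine}


summary = summarize_opf(SAMPLE_OPF)
print(summary["title"])
print(summary["spine"])
```

Note that the spine holds only references: a file appears in the reading order because its manifest id is listed in an `itemref`, not because it is present in the archive.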
Summarizing, an EPUB file contains:
- OPF (Open Packaging Format): Describes the structure and resources of the book.
- XHTML files: The actual content of the book.
- Images and stylesheets: For media and formatting.
- nav.xhtml: Used in EPUB 3 for navigation.
- NCX (Navigation Center eXtended): Used in EPUB 2 for the table of contents.
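For orientation, here is roughly what an EPUB 3 navigation document looks like when generated by hand. This is a minimal sketch using plain string formatting, not pypublib's generator. The function name `build_nav_xhtml` and its `(title, href)` input format are assumptions for this example.

```python
from html import escape


def build_nav_xhtml(entries: list[tuple[str, str]]) -> str:
    """Render (title, href) pairs as a minimal EPUB 3 navigation document."""
    # Escape titles and hrefs so characters such as "&" keep the XHTML well-formed.
    items = "\n".join(
        f'        <li><a href="{escape(href)}">{escape(title)}</a></li>'
        for title, href in entries
    )
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Table of Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol>
{items}
      </ol>
    </nav>
  </body>
</html>
"""


print(build_nav_xhtml([("Chapter 1", "Chapter1.xhtml"), ("Chapter 2", "Chapter2.xhtml")]))
```

The `nav` element marked with `epub:type="toc"` plays the role in EPUB 3 that the `navMap` of toc.ncx plays in EPUB 2, which is why a book targeting both versions typically ships both files.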
Features of pypublib
As mentioned before, the structure of an EPUB book is rather simple, and there are already several Python libraries that can help you create EPUB files. There are also GUI tools for the same purpose, notably
- Sigil: A great tool for visually organizing and editing single EPUB files
- Calibre: A powerful eBook management tool that can convert various formats to EPUB and vice versa
pypublib aims to provide a simple and easy-to-use interface for creating and manipulating EPUB files programmatically. It focuses on generating the essential components of an EPUB file, such as content.opf, nav.xhtml, and toc.ncx, while allowing for easy integration with existing Python projects. EPUB books can be created from scratch or imported for modification.
Key features include:
- Create and manipulate EPUB files programmatically.
- Import existing EPUB files for modification.
- Parse and generate content.opf files.
- Generate nav.xhtml for EPUB 3 navigation.
- Generate toc.ncx for the EPUB 2 table of contents.
- Create and manage the manifest and spine in the OPF file.
- Support for adding metadata to the EPUB file.
- Easy integration with existing Python projects.
The API
The library provides a simple API for creating and manipulating EPUB files.
Classes
The main classes and functions include:
- Book: Represents an EPUB book, with methods to add chapters, images, and metadata.
- Chapter: Represents a chapter in the book, with methods to set the title and content.
- OPF: Represents the OPF file, with methods to add metadata, manifest items, and spine items. Used to import an EPUB book by parsing the OPF file.
The Book and Chapter classes are the structures / containers for creating and manipulating EPUB files.
Functions
The most important functions are:
- read_epub: Imports an existing EPUB file and returns a Book object with all chapters, styles, and images.
- publish_epub: Publishes a generated or modified Book object as an EPUB file.
- validate_epub: Validates the structure and contents of an EPUB file.
Usage
To create a new EPUB file:
- Create a Book object.
- Create Chapter objects and add them to the book. The order of chapters defines the reading order.
- Set HTML content and titles for each chapter. The content is simply the BODY of the XHTML file.
- Add stylesheets and images to the Chapter objects and the Book as desired.
- Set metadata for the book.
- Finally, call publish_epub to generate the EPUB file.
Example - Create a simple EPUB file
from pypublib.book import Book, Chapter
from pypublib.epub import publish_book, validate_book

# Create an empty book.
book = Book()

# Create a chapter; the content is the body of the chapter's XHTML file.
chapter1 = Chapter.from_content(href="Chapter1.xhtml", title="Chapter 1", content="<h1>Hello world!</h1>",
                                styles="styles.css")
chapter1.content += "<p>This is the first chapter of the book.</p>"
book.add_chapter(chapter1)

# Add the stylesheet referenced by the chapter.
book.add_style("styles.css", "body { font-family: Arial, sans-serif; }")

# Set the book metadata.
book.title = "My First EPUB"
book.author = "John Doe"

# Validate the book, then write it to disk.
validate_book(book)
publish_book(book, "book.epub")
Further Examples
The examples directory contains scripts that demonstrate how to use the library to create and manipulate EPUB files.
Requirements
pypublib needs only lxml for parsing and generating XML/HTML files. It supports Python 3.10 and higher.
- Python 3.10 or higher
- lxml~=5.2.2