A Python CLI and utility library for manipulating EPUB files
epub-utils
A Python library and CLI tool for inspecting EPUB files from the terminal.
Features
- Parse and validate EPUB container and package files
- Extract metadata like title, author, and identifier
- Command-line interface for quick file inspection
- Syntax highlighted XML output
Installation
pip install epub-utils
Use as a CLI tool
The basic format is:
epub-utils EPUB_PATH COMMAND [OPTIONS]
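The same invocation pattern can be driven from a script. The sketch below only assembles the argument list from the commands and `--format` values documented in this README and then hands it to `subprocess`; the helper names `build_argv` and `run_epub_utils` are hypothetical and not part of epub-utils.

```python
import subprocess

# Commands and formats as listed in this README.
COMMANDS = {"container", "package", "toc", "metadata", "manifest", "spine", "content"}
FORMATS = {"xml", "raw", "plain", "kv"}


def build_argv(epub_path, command, item_id=None, fmt=None):
    """Assemble `epub-utils EPUB_PATH COMMAND [OPTIONS]` as an argument list."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    argv = ["epub-utils", str(epub_path), command]
    if command == "content":
        # The content command takes a manifest item ID as its argument.
        if item_id is None:
            raise ValueError("the content command needs a manifest item ID")
        argv.append(item_id)
    if fmt is not None:
        argv += ["--format", fmt]
    return argv


def run_epub_utils(epub_path, command, item_id=None, fmt=None):
    """Run the CLI (must be installed and on PATH) and return its stdout."""
    completed = subprocess.run(
        build_argv(epub_path, command, item_id, fmt),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

For example, `run_epub_utils("book.epub", "metadata", fmt="kv")` would run `epub-utils book.epub metadata --format kv`.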
Commands
- container - Display the container.xml contents

    # Show container.xml with syntax highlighting
    epub-utils book.epub container
    # Show container.xml as raw content
    epub-utils book.epub container --format raw

- package - Display the package OPF file contents

    # Show package.opf with syntax highlighting
    epub-utils book.epub package
    # Show package.opf as raw content
    epub-utils book.epub package --format raw

- toc - Display the table of contents file contents

    # Show toc.ncx/nav.xhtml with syntax highlighting
    epub-utils book.epub toc
    # Show toc.ncx/nav.xhtml as raw content
    epub-utils book.epub toc --format raw

- metadata - Display the metadata information from the package file

    # Show metadata with syntax highlighting
    epub-utils book.epub metadata
    # Show metadata as key-value pairs
    epub-utils book.epub metadata --format kv

- manifest - Display the manifest information from the package file

    # Show manifest with syntax highlighting
    epub-utils book.epub manifest
    # Show manifest as raw content
    epub-utils book.epub manifest --format raw

- spine - Display the spine information from the package file

    # Show spine with syntax highlighting
    epub-utils book.epub spine
    # Show spine as raw content
    epub-utils book.epub spine --format raw

- content - Display the content of a document by its manifest item ID

    # Show content with syntax highlighting
    epub-utils book.epub content chapter1
    # Show raw HTML/XML content
    epub-utils book.epub content chapter1 --format raw
    # Show plain text content (HTML tags stripped)
    epub-utils book.epub content chapter1 --format plain
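To see how the files behind the container and package commands relate: in an EPUB archive, META-INF/container.xml names the OPF package file through a `rootfile` element. The sketch below follows that pointer using only the standard library. It illustrates the file layout, not how epub-utils is implemented, and the in-memory sample archive is a hypothetical stand-in for a real book.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Hypothetical sample; a real EPUB also contains a mimetype file and content.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_package_path(epub_file):
    """Return the OPF path named in META-INF/container.xml."""
    with zipfile.ZipFile(epub_file) as zf:
        container_xml = zf.read("META-INF/container.xml")
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    return rootfile.attrib["full-path"]


def make_sample_epub():
    """Build a tiny in-memory archive that holds only container.xml."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
    buf.seek(0)
    return buf


print(find_package_path(make_sample_epub()))  # prints OEBPS/content.opf
```

`find_package_path` also accepts a filesystem path such as "book.epub", since `zipfile.ZipFile` takes either a path or a file-like object.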
Options
- -h, --help - Show help message and exit
- -v, --version - Show program version and exit
- -fmt, --format - Output format (default: xml)
  - xml - Display with XML syntax highlighting (default)
  - raw - Display raw content without formatting
  - plain - Display plain text content (HTML tags stripped, for content command only)
  - kv - Display key-value pairs (where supported)

    # Display as raw content
    epub-utils book.epub package --format raw
    # Display with XML syntax highlighting (default)
    epub-utils book.epub package --format xml
    # Display as key-value pairs (for supported commands)
    epub-utils book.epub metadata --format kv
    # Display plain text content (content command only)
    epub-utils book.epub content chapter1 --format plain
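As a rough illustration of what "HTML tags stripped" means for the plain format, the sketch below extracts text from markup with the standard library's `html.parser`. This is one possible approach under simple assumptions (block-level tags become line breaks, script and style content is dropped); the README does not say how epub-utils produces its plain output, and the class and function names here are hypothetical.

```python
from html.parser import HTMLParser

# Closing (or self-closing) one of these tags ends a line in the output.
BLOCK_TAGS = {"p", "div", "li", "br", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Normalise whitespace inside each line and drop empty lines.
        lines = (" ".join(line.split()) for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def strip_tags(markup):
    """Return the visible text of an HTML/XHTML fragment."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


print(strip_tags("<h1>Chapter 1</h1><p>Hello <em>world</em>.</p>"))
```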
Use as a Python library
from epub_utils import Document
# Load an EPUB document
doc = Document("path/to/book.epub")
# Get raw XML content
print(doc.container)
print(doc.package)
print(doc.toc)
# Access package metadata
print(f"Title: {doc.package.metadata.title}")
print(f"Creator: {doc.package.metadata.creator}")
print(f"Identifier: {doc.package.metadata.identifier}")
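The title, creator, and identifier values correspond to Dublin Core elements inside the `metadata` block of the OPF package file. The sketch below reads those three elements from a hypothetical sample OPF with the standard library, to show where the values live. It is not the `Document` implementation, and `read_dublin_core` is an illustrative name.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical minimal package document.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:creator>Jane Doe</dc:creator>
    <dc:identifier id="pub-id">urn:uuid:12345</dc:identifier>
  </metadata>
</package>"""


def read_dublin_core(opf_xml):
    """Return the first title, creator, and identifier found, or None for each."""
    root = ET.fromstring(opf_xml)
    metadata = root.find("opf:metadata", NS)
    result = {}
    for field in ("title", "creator", "identifier"):
        element = metadata.find(f"dc:{field}", NS) if metadata is not None else None
        has_text = element is not None and element.text
        result[field] = element.text.strip() if has_text else None
    return result


for key, value in read_dublin_core(SAMPLE_OPF).items():
    print(f"{key}: {value}")
```

The printed key and value lines are only in the spirit of the CLI's `--format kv` output; the exact layout epub-utils uses is not specified here.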
Industry Standards & Compliance
epub-utils supports the industry-standard EPUB specifications and related technologies, for compatibility across the digital publishing ecosystem.
Supported EPUB Standards
- EPUB 2.0.1 (IDPF, 2010)
  - Complete OPF 2.0 package document support
  - NCX navigation control file support
  - Dublin Core metadata extraction
  - Legacy EPUB compatibility
- EPUB 3.0+ (IDPF/W3C, 2011-present)
  - EPUB 3.3 specification compliance
  - HTML5-based content documents
  - Navigation document (nav.xhtml) support
  - Enhanced accessibility features
  - Media overlays and scripting support
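The two navigation formats are declared differently in the package file: EPUB 3 marks the navigation document with `properties="nav"` on a manifest item, while EPUB 2 points to the NCX through the `toc` attribute of the `spine` element. The sketch below resolves the table of contents file from an OPF string on that basis. The function name and sample documents are hypothetical, and this is not necessarily how the toc command chooses its file.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

EPUB3_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine><itemref idref="chapter1"/></spine>
</package>"""

EPUB2_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="chapter1" href="chapter1.html" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx"><itemref idref="chapter1"/></spine>
</package>"""


def find_toc_href(opf_xml):
    """Return the href of the nav document (EPUB 3) or NCX (EPUB 2), else None."""
    root = ET.fromstring(opf_xml)
    items = root.findall("opf:manifest/opf:item", OPF_NS)
    # EPUB 3: `properties` is a space-separated list that includes "nav".
    for item in items:
        if "nav" in item.get("properties", "").split():
            return item.get("href")
    # EPUB 2: the spine's `toc` attribute holds the manifest ID of the NCX.
    spine = root.find("opf:spine", OPF_NS)
    toc_id = spine.get("toc") if spine is not None else None
    if toc_id is not None:
        for item in items:
            if item.get("id") == toc_id:
                return item.get("href")
    return None


print(find_toc_href(EPUB3_OPF))  # prints nav.xhtml
print(find_toc_href(EPUB2_OPF))  # prints toc.ncx
```

The returned href is relative to the location of the package file inside the archive.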
Metadata Standards
- Dublin Core Metadata Initiative (DCMI)
  - Dublin Core Metadata Element Set v1.1
  - Dublin Core Metadata Terms (DCTERMS)
- Open Packaging Format (OPF)
  - OPF 2.0 specification (EPUB 2.0.1)
  - OPF 3.0 specification (EPUB 3.0+)
The library maintains strict adherence to published specifications while providing robust handling of real-world EPUB variations commonly found in commercial and open-source reading applications.