A Python CLI and utility library for manipulating EPUB files

epub-utils

A Python library and CLI tool for inspecting EPUB files from the terminal.

Features

  • Parse and validate EPUB container and package files
  • Extract metadata like title, author, and identifier
  • Command-line interface for quick file inspection
  • Syntax highlighted XML output

Installation

pip install epub-utils

Use as a CLI tool

The basic format is:

epub-utils EPUB_PATH COMMAND [OPTIONS]

Commands

  • container - Display the container.xml contents

    # Show container.xml with syntax highlighting
    epub-utils book.epub container
    
    # Show container.xml as raw content
    epub-utils book.epub container --format raw
    
  • package - Display the package OPF file contents

    # Show package.opf with syntax highlighting
    epub-utils book.epub package
    
    # Show package.opf as raw content
    epub-utils book.epub package --format raw
    
  • toc - Display the table of contents file contents

    # Show toc.ncx/nav.xhtml with syntax highlighting
    epub-utils book.epub toc
    
    # Show toc.ncx/nav.xhtml as raw content
    epub-utils book.epub toc --format raw
    
  • metadata - Display the metadata information from the package file

    # Show metadata with syntax highlighting
    epub-utils book.epub metadata
    
    # Show metadata as key-value pairs
    epub-utils book.epub metadata --format kv
    
  • manifest - Display the manifest information from the package file

    # Show manifest with syntax highlighting
    epub-utils book.epub manifest
    
    # Show manifest as raw content
    epub-utils book.epub manifest --format raw
    
  • spine - Display the spine information from the package file

    # Show spine with syntax highlighting
    epub-utils book.epub spine
    
    # Show spine as raw content
    epub-utils book.epub spine --format raw
    
  • content - Display the content of a document by its manifest item ID

    # Show content with syntax highlighting
    epub-utils book.epub content chapter1
    
    # Show raw HTML/XML content
    epub-utils book.epub content chapter1 --format raw
    
    # Show plain text content (HTML tags stripped)
    epub-utils book.epub content chapter1 --format plain
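
To make the container and package commands more concrete, here is a minimal standard-library sketch of the lookup they rely on: an EPUB is a ZIP archive whose META-INF/container.xml names the package (OPF) document. This is not epub-utils' own code; the function names and the in-memory sample archive are assumptions made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by container.xml in the EPUB container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def read_container(epub_file):
    """Return the raw bytes of META-INF/container.xml from an EPUB archive."""
    with zipfile.ZipFile(epub_file) as zf:
        return zf.read("META-INF/container.xml")


def package_path(container_xml):
    """Return the archive path of the package (OPF) document."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    return rootfile.attrib["full-path"]


def build_sample_epub():
    """Build a tiny in-memory archive for the demo (hypothetical, not a complete EPUB)."""
    container = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        "</container>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", container)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    container_xml = read_container(build_sample_epub())
    print(package_path(container_xml))
```

The same two steps work on a real file by passing its path to read_container instead of the in-memory buffer.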
    

Options

  • -h, --help - Show help message and exit

  • -v, --version - Show program version and exit

  • -fmt, --format - Output format (default: xml)

    • xml - Display with XML syntax highlighting (default)
    • raw - Display raw content without formatting
    • plain - Display plain text content (HTML tags stripped, for content command only)
    • kv - Display key-value pairs (where supported)

    # Display as raw content
    epub-utils book.epub package --format raw
    
    # Display with XML syntax highlighting (default)
    epub-utils book.epub package --format xml
    
    # Display as key-value pairs (for supported commands)
    epub-utils book.epub metadata --format kv
    
    # Display plain text content (content command only)
    epub-utils book.epub content chapter1 --format plain
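
As a rough illustration of what "HTML tags stripped" means for the plain format, the sketch below removes markup with Python's html.parser. It is an approximation under stated assumptions, not the tool's actual implementation: the class and function names are hypothetical, and the real output may treat whitespace, scripts, and styles differently.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(markup):
    """Return the text content of an HTML/XHTML string."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    # Collapse runs of whitespace left behind by the removed markup.
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b> &amp; all.</p>"))
```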
    

Use as a Python library

from epub_utils import Document

# Load an EPUB document
doc = Document("path/to/book.epub")

# Get raw XML content
print(doc.container)
print(doc.package)
print(doc.toc)

# Access package metadata
print(f"Title: {doc.package.metadata.title}")
print(f"Creator: {doc.package.metadata.creator}")
print(f"Identifier: {doc.package.metadata.identifier}")

Industry Standards & Compliance

epub-utils supports the EPUB specifications and related metadata standards listed below, for broad compatibility across the digital publishing ecosystem.

Supported EPUB Standards

  • EPUB 2.0.1 (IDPF, 2010)

    • Complete OPF 2.0 package document support
    • NCX navigation control file support
    • Dublin Core metadata extraction
    • Legacy EPUB compatibility
  • EPUB 3.0+ (IDPF/W3C, 2011-present)

    • EPUB 3.3 specification compliance
    • HTML5-based content documents
    • Navigation document (nav.xhtml) support
    • Enhanced accessibility features
    • Media overlays and scripting support
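
The two navigation formats listed above are referenced differently in the package document: EPUB 2 points to the NCX through the toc attribute on the spine, while EPUB 3 marks the navigation document with properties="nav" on a manifest item. The following sketch shows one way to resolve either from an OPF string. It is an illustration only; find_toc and the sample documents are hypothetical and do not reflect how epub-utils resolves the table of contents internally.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical sample package documents, trimmed to the relevant parts.
EPUB2_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx"><itemref idref="chapter1"/></spine>
</package>"""

EPUB3_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="nav" href="nav.xhtml" properties="nav"
          media-type="application/xhtml+xml"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine><itemref idref="chapter1"/></spine>
</package>"""


def find_toc(opf_xml):
    """Return (kind, href) for the table of contents, or None if absent."""
    root = ET.fromstring(opf_xml)
    items = root.findall("opf:manifest/opf:item", OPF_NS)

    # EPUB 3: the navigation document carries properties="nav".
    for item in items:
        if "nav" in item.get("properties", "").split():
            return ("nav", item.get("href"))

    # EPUB 2: <spine toc="ID"> names the manifest item holding the NCX.
    spine = root.find("opf:spine", OPF_NS)
    toc_id = spine.get("toc") if spine is not None else None
    if toc_id is not None:
        for item in items:
            if item.get("id") == toc_id:
                return ("ncx", item.get("href"))
    return None


if __name__ == "__main__":
    print(find_toc(EPUB2_OPF))
    print(find_toc(EPUB3_OPF))
```

The returned href is relative to the package document, so it still has to be joined with the OPF's directory to get a path inside the archive.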

Metadata Standards

  • Dublin Core Metadata Initiative (DCMI)

    • Dublin Core Metadata Element Set v1.1
    • Dublin Core Metadata Terms (DCTERMS)
  • Open Packaging Format (OPF)

    • OPF 2.0 specification (EPUB 2.0.1)
    • OPF 3.0 specification (EPUB 3.0+)

The library maintains strict adherence to published specifications while providing robust handling of real-world EPUB variations commonly found in commercial and open-source reading applications.
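
For readers unfamiliar with how Dublin Core elements appear in a package document, here is a small standard-library sketch that collects them as key-value pairs. It is a simplified illustration, not the library's implementation: dublin_core and SAMPLE_OPF are hypothetical names, and the output shape is an assumption rather than the exact format produced by the kv option.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}
# ElementTree expands dc:title to "{namespace}title".
DC_PREFIX = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical sample package document, trimmed to its metadata.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:12345678-1234-1234-1234-123456789abc</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:creator>Jane Doe</dc:creator>
    <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
  </metadata>
</package>"""


def dublin_core(opf_xml):
    """Return (name, value) pairs for Dublin Core elements, in document order."""
    root = ET.fromstring(opf_xml)
    metadata = root.find("opf:metadata", OPF_NS)
    pairs = []
    if metadata is None:
        return pairs
    for element in metadata:
        # Keep only elements in the Dublin Core namespace; skip <meta> and others.
        if element.tag.startswith(DC_PREFIX):
            name = element.tag[len(DC_PREFIX):]
            pairs.append((name, (element.text or "").strip()))
    return pairs


if __name__ == "__main__":
    for name, value in dublin_core(SAMPLE_OPF):
        print(f"{name}: {value}")
```

A list of pairs is used instead of a dict because elements such as dc:creator may legitimately appear more than once.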
