epubkit
A comprehensive EPUB processing toolkit for Python.
Features
- EPUB Reading: Parse EPUB files with support for EPUB2/3 standards
- Table of Contents: Extract and navigate book structure
- Content Access: Read chapters and access metadata
- CFI Support: Canonical Fragment Identifier parsing and generation
- Layout Properties: Access EPUB3 layout and rendering properties
- Enhanced Spine: Rich spine item information with properties
- Hook System: Extensible event system for customization
- Extensible: Plugin architecture for different content sources
Installation
pip install epubkit
Quick Start
import epubkit
# Open an EPUB file
book = epubkit.open("my_book.epub")
# Access metadata
print(f"Title: {book.title}")
# Navigate chapters
for chapter in book.spine:
    print(f"- {chapter['title']}")
# Read content
content = book.read_chapter("chapter1.xhtml")
# CFI support (spine_index, node, and offset are placeholders for your own values)
cfi = epubkit.CFIGenerator.generate_cfi(spine_index, node, offset)
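For readers curious about what opening an EPUB involves, the following is a minimal standard-library sketch of the first step any reader has to take: an EPUB is a ZIP archive, and `META-INF/container.xml` names the package (OPF) document. This is not epubkit's implementation; `find_opf_path`, `build_demo_epub`, and the demo archive contents are hypothetical names invented for the illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the OCF container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Hypothetical demo content, just enough structure for the lookup below.
CONTAINER_XML = (
    '<container version="1.0" '
    'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    "<rootfiles>"
    '<rootfile full-path="OEBPS/content.opf" '
    'media-type="application/oebps-package+xml"/>'
    "</rootfiles>"
    "</container>"
)


def find_opf_path(epub_file):
    """Return the archive path of the package (OPF) document."""
    with zipfile.ZipFile(epub_file) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> entry")
    return rootfile.get("full-path")


def build_demo_epub():
    """Build a tiny in-memory archive so the sketch runs without a real book."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", "<package/>")
    buf.seek(0)
    return buf


opf_path = find_opf_path(build_demo_epub())
print(opf_path)
```

The same function accepts a filesystem path such as "my_book.epub", since `zipfile.ZipFile` takes either a path or a file-like object.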
Enhanced Features
Layout Properties
Access EPUB3 layout and rendering properties from the OPF file:
# Get layout properties
layout = book.layout_properties
print(f"Layout: {layout.get('layout')}") # pre-paginated/reflowable
print(f"Flow: {layout.get('flow')}") # paginated/scrolled
print(f"Reading direction: {layout.get('page_progression_direction')}") # ltr/rtl
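As a rough illustration of where these values live, EPUB3 stores rendition settings as `<meta property="rendition:...">` elements in the OPF metadata and the reading direction as the `page-progression-direction` attribute on `<spine>`. The sketch below reads them with the standard library; it is an assumption about how such a dictionary could be built, not epubkit's own code, and `parse_layout` and `SAMPLE_OPF` are hypothetical names.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical package document used only for this demonstration.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="rendition:layout">pre-paginated</meta>
    <meta property="rendition:flow">paginated</meta>
  </metadata>
  <manifest/>
  <spine page-progression-direction="rtl"/>
</package>
"""


def parse_layout(opf_xml):
    """Collect rendition:* metadata and the spine reading direction."""
    root = ET.fromstring(opf_xml)
    layout = {}
    for meta in root.findall("opf:metadata/opf:meta", OPF_NS):
        prop = meta.get("property", "")
        if prop.startswith("rendition:"):
            # "rendition:layout" is stored under the key "layout", and so on.
            layout[prop.split(":", 1)[1]] = (meta.text or "").strip()
    spine = root.find("opf:spine", OPF_NS)
    if spine is not None:
        direction = spine.get("page-progression-direction")
        if direction:
            layout["page_progression_direction"] = direction
    return layout


layout = parse_layout(SAMPLE_OPF)
print(layout)
```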
Enhanced Spine Items
Access rich spine item information with properties:
# Get enhanced spine items
for item in book.spine_items:
    print(f"Chapter: {item.href}")
    print(f"Linear: {item.linear}") # True/False
    print(f"Properties: {item.properties}") # ['rendition:layout-pre-paginated']
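The attributes shown above map onto the OPF spine in a straightforward way: each `<itemref>` points at a manifest `<item>` through `idref`, may carry `linear="no"`, and may list space-separated `properties`. The following sketch shows one way to assemble such records with the standard library. The `SpineItem` dataclass, `parse_spine`, and the sample document are hypothetical and only mirror the attribute names used above; epubkit's real classes may differ.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical package document used only for this demonstration.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata/>
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="notes" href="notes.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1" properties="rendition:layout-pre-paginated"/>
    <itemref idref="notes" linear="no"/>
  </spine>
</package>
"""


@dataclass
class SpineItem:
    idref: str
    href: str
    linear: bool = True
    properties: list = field(default_factory=list)


def parse_spine(opf_xml):
    """Resolve each spine itemref against the manifest."""
    root = ET.fromstring(opf_xml)
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    items = []
    for ref in root.findall("opf:spine/opf:itemref", OPF_NS):
        idref = ref.get("idref")
        items.append(
            SpineItem(
                idref=idref,
                href=hrefs.get(idref, ""),
                # The linear attribute defaults to "yes" when absent.
                linear=ref.get("linear", "yes") != "no",
                properties=(ref.get("properties") or "").split(),
            )
        )
    return items


items = parse_spine(SAMPLE_OPF)
for item in items:
    print(item.href, item.linear, item.properties)
```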
Spine CFI Generation
Generate CFIs for spine items (useful for bookmarks):
# Generate CFI for first chapter
cfi = book.get_spine_cfi(0)
print(f"Chapter CFI: {cfi}") # epubcfi(/6/2[chapter1]/)
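To make the numbers in such a CFI less opaque: in the EPUB CFI scheme, the n-th element child of a node (counting from 1) gets the even step 2n, and an element's `id` may be appended in square brackets. With the usual `metadata`, `manifest`, `spine` order, the spine is the third child of `<package>` (step 6) and the first `<itemref>` is step 2. The sketch below computes that path with the standard library. It is an illustration of the scheme, not epubkit's code; `spine_cfi` and `SAMPLE_OPF` are hypothetical, and the sketch emits the bare path to the itemref, so the exact suffix may differ from the string `get_spine_cfi` returns.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical package document used only for this demonstration.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata/>
  <manifest/>
  <spine>
    <itemref id="chapter1" idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""


def spine_cfi(opf_xml, index):
    """Build a CFI that addresses the spine itemref at a 0-based index."""
    root = ET.fromstring(opf_xml)
    spine = root.find("opf:spine", OPF_NS)
    if spine is None:
        raise ValueError("package document has no <spine>")
    itemrefs = list(spine)
    if not 0 <= index < len(itemrefs):
        raise IndexError(f"spine index out of range: {index}")
    # Element children are numbered 2, 4, 6, ... in document order.
    spine_step = 2 * (list(root).index(spine) + 1)
    item_step = 2 * (index + 1)
    # Append the itemref's own id as an assertion when it has one.
    item_id = itemrefs[index].get("id")
    assertion = f"[{item_id}]" if item_id else ""
    return f"epubcfi(/{spine_step}/{item_step}{assertion})"


first = spine_cfi(SAMPLE_OPF, 0)
second = spine_cfi(SAMPLE_OPF, 1)
print(first)
print(second)
```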
Hook System
Use the extensible hook system for customization:
# Register event handlers
def on_toc_built(epub_instance, toc_data):
    print(f"TOC built with {len(toc_data['raw_chapters'])} chapters")

def on_chapter_loaded(epub_instance, href, content):
    print(f"Loaded chapter: {href}")
book.hooks.toc_built.register(on_toc_built)
book.hooks.chapter_loaded.register(on_chapter_loaded)
# Hooks are triggered automatically during normal operations
_ = book.toc # Triggers toc_built hook
content = book.read_chapter("chapter1.xhtml") # Triggers chapter_loaded hook
Available hooks:
- toc_built: After TOC is built
- toc_parsed: After TOC structure is parsed
- metadata_extracted: After metadata is extracted
- layout_parsed: After layout properties are parsed
- spine_processed: After spine items are processed
- content_parsed: After chapter content is parsed
- chapter_loaded: After chapter is loaded from disk
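For readers unfamiliar with this pattern, a hook of this kind can be sketched as a named list of callbacks that the library calls at the matching point. The example below is a generic illustration under that assumption, not epubkit's implementation: only `register` and the hook names come from the text above, while `Hook`, `HookRegistry`, `trigger`, and `unregister` are hypothetical.

```python
class Hook:
    """A named list of callbacks that are called in registration order."""

    def __init__(self, name):
        self.name = name
        self._handlers = []

    def register(self, handler):
        self._handlers.append(handler)
        return handler  # returning the handler also allows decorator use

    def unregister(self, handler):
        self._handlers.remove(handler)

    def trigger(self, *args, **kwargs):
        # Iterate over a copy so handlers may unregister themselves safely.
        for handler in list(self._handlers):
            handler(*args, **kwargs)


class HookRegistry:
    """Expose one Hook attribute per event name."""

    def __init__(self, names):
        for name in names:
            setattr(self, name, Hook(name))


hooks = HookRegistry(["toc_built", "chapter_loaded"])
events = []


def on_chapter_loaded(source, href, content):
    events.append((href, len(content)))


hooks.chapter_loaded.register(on_chapter_loaded)

# A library would call trigger() itself right after loading a chapter.
hooks.chapter_loaded.trigger(None, "chapter1.xhtml", "<html></html>")
print(events)
```

Keeping each hook as its own object means handlers for one event never see calls for another, which matches the per-event `book.hooks.<name>.register(...)` style used above.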
License
MIT License
Download files
File details
Details for the file epubkit-0.1.0.post202601020715.tar.gz.
File metadata
- Download URL: epubkit-0.1.0.post202601020715.tar.gz
- Upload date:
- Size: 31.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2d001d660c6ecad15bc00f6cfa79dc6f240bdb98afda73f3dfb5c4a952b3f42e |
| MD5 | 64df7d5f1d588131161895ebb6741d93 |
| BLAKE2b-256 | a3782450e9bb1b8c0583262afe6dea1d33c2b36eacae437c39ce51fd1f7412d5 |
File details
Details for the file epubkit-0.1.0.post202601020715-py3-none-any.whl.
File metadata
- Download URL: epubkit-0.1.0.post202601020715-py3-none-any.whl
- Upload date:
- Size: 32.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3fe9c5aeab3a563b91345f6f5c41835d961f8eb71b25b6bd53d0e147d6d6ba6c |
| MD5 | ab19292f760cd4335483726a49ac4036 |
| BLAKE2b-256 | a3aed349b6435a97931e5aa280903410eaa3c147039bf32f4875e866e98ec91b |