DEPRECATED: renamed to al-warraq (pip install al-warraq). Lightweight EPUB inspection — version detection and TOC discovery.
EpubSage
⚠️ This package has been renamed to al-warraq.
epubsage will receive no further updates. Migrate with: pip install al-warraq and import al_warraq.
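While a codebase is mid-migration, it can help to prefer the new module name and fall back to the old one. The sketch below uses only the standard library; `first_available` is a hypothetical helper, not part of either package, and it assumes (unconfirmed here) that al_warraq exposes the same names epubsage did.

```python
import importlib
import importlib.util
from types import ModuleType


def first_available(*names: str) -> ModuleType:
    """Import and return the first top-level module in `names` that is installed."""
    for name in names:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None:
            return importlib.import_module(name)
    raise ImportError(f"none of these modules are installed: {', '.join(names)}")
```

A call such as `epub = first_available("al_warraq", "epubsage")` would then bind whichever package is present, with the new name taking priority.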
Lightweight EPUB inspection library — version detection, TOC discovery, and content extraction.
Features
| Feature | Description |
|---|---|
| Version Detection | EPUB 2.0 and 3.0 support |
| TOC Discovery | Automatic NAV (EPUB 3) and NCX (EPUB 2) detection |
| TOC Parsing | Parse navigation points with full tree structure |
| Classification | Classify entries as chapter, part, front/back matter, section |
| Content Extraction | Extract content as HTML, plaintext, or markdown |
| Security | Zip bomb detection and zip slip prevention |
| CLI | 5 commands for EPUB inspection from the terminal |
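To make the Security row concrete, here is a minimal sketch of how zip slip and zip bomb checks are commonly written with the standard `zipfile` module. It is not taken from the epubsage source: `check_archive` and both limits are assumptions chosen for illustration.

```python
import zipfile
from pathlib import Path

MAX_TOTAL_BYTES = 500 * 1024 * 1024  # assumed cap on total uncompressed size
MAX_RATIO = 100                      # assumed cap on per-member compression ratio


def check_archive(zip_path: str, dest_dir: str) -> None:
    """Raise ValueError if the archive looks like a zip slip or zip bomb."""
    dest = Path(dest_dir).resolve()
    total = 0
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Zip slip: the resolved target must stay inside the destination.
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {info.filename}")
            # Zip bomb: bound the declared total size and each member's ratio.
            total += info.file_size
            if total > MAX_TOTAL_BYTES:
                raise ValueError("archive expands beyond the size limit")
            if info.compress_size and info.file_size / info.compress_size > MAX_RATIO:
                raise ValueError(f"suspicious compression ratio: {info.filename}")
```

The check runs on the archive's metadata before anything is written to disk, so a rejected file leaves no partial extraction behind.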
Requirements
- Python 3.10+
- Dependencies: markdownify, typer
Installation
pip install epubsage
Or with uv:
uv add epubsage
Quick Start
Python
from epubsage import inspect_epub
info = inspect_epub("book.epub")
print(f"Title: {info.title}")
print(f"Version: EPUB {info.version}")
print(f"TOC: {info.toc.toc_type}")
Command Line
epubsage inspect book.epub
CLI Commands
epubsage --help
| Command | Description |
|---|---|
| `inspect` | Display EPUB version, TOC type, and title |
| `extract` | Extract EPUB contents to a directory |
| `validate` | Validate ZIP structure, OPF, and TOC |
| `toc` | Display table of contents as a classified tree |
| `content` | Extract content for a specific section |
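For readers curious where the version, TOC type, and title come from, the sketch below reads them straight from the EPUB container using only the standard library. It follows the EPUB packaging conventions (`META-INF/container.xml` points at the OPF; EPUB 3 flags its navigation document with `properties="nav"`; EPUB 2 names the NCX in the spine's `toc` attribute). `sketch_inspect` and the `"nav"`/`"ncx"` return values are assumptions for illustration, not the library's implementation or its actual `toc_type` values.

```python
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def sketch_inspect(epub_path: str) -> dict:
    """Return OPF path, declared version, title, and TOC flavour of an EPUB."""
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml names the package document (the .opf file).
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml has no rootfile entry")
        opf_path = rootfile.attrib["full-path"]
        package = ET.fromstring(zf.read(opf_path))

    title = package.findtext("opf:metadata/dc:title", default=None, namespaces=OPF_NS)
    items = package.findall("opf:manifest/opf:item", OPF_NS)
    # EPUB 3: the navigation document carries properties="nav".
    has_nav = any("nav" in item.get("properties", "").split() for item in items)
    # EPUB 2: the spine's toc attribute references the NCX manifest item.
    spine = package.find("opf:spine", OPF_NS)
    has_ncx = spine is not None and spine.get("toc") is not None
    toc_type = "nav" if has_nav else "ncx" if has_ncx else None
    return {
        "opf_path": opf_path,
        "version": package.get("version"),
        "title": title,
        "toc_type": toc_type,
    }
```

This version prefers NAV when both are present, which matches how EPUB 3 books that ship a legacy NCX are usually treated; that preference is a design choice of the sketch.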
Shell Auto-Completion
epubsage --install-completion
Supported shells: Bash, Zsh, Fish, PowerShell
Python API
Core Functions
| Function | Description |
|---|---|
| `inspect_epub()` | One-step inspection: hash, extract, parse OPF, detect TOC |
| `hash_epub()` | SHA-256 hash of EPUB file |
| `extract_epub()` | Extract EPUB ZIP to directory |
| `find_opf()` | Find the .opf file in extracted EPUB |
| `parse_opf()` | Parse OPF for version and TOC info |
| `parse_ncx()` | Parse NCX file for navigation map |
| `parse_nav()` | Parse EPUB 3 NAV document |
| `classify_navpoint()` | Classify a NavPoint by label |
| `classify_children()` | Classify children by nesting depth |
| `extract_content()` | Extract content as HTML, plaintext, or markdown |
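The signatures of these functions are not listed here, so rather than guess at them, the following sketch shows the general idea behind the plaintext mode of content extraction using the standard library's `html.parser`. `html_to_text` and the tag sets are hypothetical and say nothing about how `extract_content()` is actually implemented.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"script", "style"}


class _TextCollector(HTMLParser):
    """Collect visible text, inserting line breaks at block boundaries."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Ignore text that sits inside <script> or <style>.
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Reduce an (X)HTML fragment to whitespace-normalised plain text."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    raw = "".join(collector.parts)
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)
```

Markdown output would more likely go through the markdownify dependency listed under Requirements; this sketch covers only the plaintext case.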
Data Types
| Type | Description |
|---|---|
| `EpubInfo` | Inspection result: version, TOC, OPF path, title |
| `TocInfo` | TOC detection: type, paths |
| `NavPoint` | Navigation point: label, file, anchor, children, type |
| `NcxData` | Parsed NCX: doc title, nav points |
Architecture
epubsage/
├── __init__.py # Public API: inspect_epub + re-exports
├── classify.py # classify_navpoint, classify_children
├── cli.py # CLI: inspect, extract, validate, toc, content
├── content.py # extract_content (HTML, plaintext, markdown)
├── epub.py # hash_epub, extract_epub, find_opf
├── exceptions.py # EpubSageError, InvalidEpubError
├── nav.py # parse_nav (EPUB 3 NAV)
├── ncx.py # parse_ncx, NavPoint, NcxData
└── opf.py # parse_opf, EpubInfo, TocInfo
Development
git clone https://github.com/Abdullah-Wex/epubsage.git
cd epubsage
uv sync
make lint # ruff
make typecheck # mypy
make security # bandit
make test # pytest
make quality # all checks
Documentation
| Document | Description |
|---|---|
| API Reference | Complete Python API documentation |
| CLI Reference | All CLI commands and options |
| Examples | Practical usage examples |
| Changelog | Version history |
License
MIT License. See LICENSE for details.
Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
Download files
File details
Details for the file epubsage-0.7.1.tar.gz.
File metadata
- Download URL: epubsage-0.7.1.tar.gz
- Upload date:
- Size: 31.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6f9a25b2e7c49240179a6c75d587d98b1c6cc4954a863665d90edfe7bcf61313 |
| MD5 | bf15ab143417c46fdf687ccfc300fa77 |
| BLAKE2b-256 | bf7d91709050e5057c285a92fd68d676729ca8a106d0a40e1da00f15c2440a46 |
Provenance
The following attestation bundles were made for epubsage-0.7.1.tar.gz:
Publisher: publish.yml on Abdullah-Wex/epubsage
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: epubsage-0.7.1.tar.gz
- Subject digest: 6f9a25b2e7c49240179a6c75d587d98b1c6cc4954a863665d90edfe7bcf61313
- Sigstore transparency entry: 2063598506
- Sigstore integration time:
- Permalink: Abdullah-Wex/epubsage@9ed5319927298f6534f8781d17ab47e3b9942e11
- Branch / Tag: refs/tags/v0.7.1
- Owner: https://github.com/Abdullah-Wex
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9ed5319927298f6534f8781d17ab47e3b9942e11
- Trigger Event: release
File details
Details for the file epubsage-0.7.1-py3-none-any.whl.
File metadata
- Download URL: epubsage-0.7.1-py3-none-any.whl
- Upload date:
- Size: 37.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8d1c4805c2b027e86b494c0fdb6e1ce2ba05442cb958f382d12b6da89a791151 |
| MD5 | 4458f2d322fe116ef43b741527ba100b |
| BLAKE2b-256 | 098902bd8efe2a60fec6fcfae1b99049bfbe105537c993aeee47166eb36b6f53 |
Provenance
The following attestation bundles were made for epubsage-0.7.1-py3-none-any.whl:
Publisher: publish.yml on Abdullah-Wex/epubsage
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: epubsage-0.7.1-py3-none-any.whl
- Subject digest: 8d1c4805c2b027e86b494c0fdb6e1ce2ba05442cb958f382d12b6da89a791151
- Sigstore transparency entry: 2063598533
- Sigstore integration time:
- Permalink: Abdullah-Wex/epubsage@9ed5319927298f6534f8781d17ab47e3b9942e11
- Branch / Tag: refs/tags/v0.7.1
- Owner: https://github.com/Abdullah-Wex
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9ed5319927298f6534f8781d17ab47e3b9942e11
- Trigger Event: release