DEPRECATED: renamed to al-warraq (pip install al-warraq). Lightweight EPUB inspection — version detection and TOC discovery.

EpubSage

⚠️ This package has been renamed to al-warraq, and epubsage will receive no further updates. To migrate, run pip install al-warraq and change your imports to al_warraq.

Lightweight EPUB inspection library — version detection, TOC discovery, and content extraction.

Features

Feature              Description
Version Detection    EPUB 2.0 and 3.0 support
TOC Discovery        Automatic NAV (EPUB 3) and NCX (EPUB 2) detection
TOC Parsing          Parse navigation points with full tree structure
Classification       Classify entries as chapter, part, front/back matter, section
Content Extraction   Extract content as HTML, plaintext, or markdown
Security             Zip bomb detection and zip slip prevention
CLI                  5 commands for EPUB inspection from the terminal
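
To make the Security row concrete, here is a minimal standard-library sketch of what zip slip and zip bomb checks can look like. It is not epubsage's implementation: the names is_safe_member and check_archive and both limits are hypothetical, chosen only for illustration.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical limits for illustration; a real tool would tune these.
MAX_TOTAL_BYTES = 500 * 1024 * 1024  # cap on total uncompressed size
MAX_RATIO = 100                      # cap on per-file compression ratio


def is_safe_member(name: str) -> bool:
    """Zip slip check: reject absolute paths and paths that contain '..'."""
    path = PurePosixPath(name.replace("\\", "/"))
    return not path.is_absolute() and ".." not in path.parts


def check_archive(zf: zipfile.ZipFile) -> list[str]:
    """Return the problems found; an empty list means none were detected."""
    problems = []
    total = 0
    for info in zf.infolist():
        if not is_safe_member(info.filename):
            problems.append(f"unsafe path: {info.filename}")
        total += info.file_size
        # Zip bomb heuristic: tiny compressed size, huge declared size.
        if info.compress_size and info.file_size / info.compress_size > MAX_RATIO:
            problems.append(f"suspicious ratio: {info.filename}")
    if total > MAX_TOTAL_BYTES:
        problems.append("archive too large when uncompressed")
    return problems


# Build a small in-memory archive with one hostile entry and inspect it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("OEBPS/ch1.xhtml", "<p>hi</p>")
    zf.writestr("../evil.txt", "x")
with zipfile.ZipFile(buf) as zf:
    print(check_archive(zf))
```

The checks read only the ZIP central directory, so they run before anything is written to disk. The declared sizes can themselves be forged, so a stricter extractor would also count bytes while decompressing.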

Requirements

  • Python 3.10+
  • Dependencies: markdownify, typer

Installation

pip install epubsage

Or with uv:

uv add epubsage

Quick Start

Python

from epubsage import inspect_epub

info = inspect_epub("book.epub")

print(f"Title:   {info.title}")
print(f"Version: EPUB {info.version}")
print(f"TOC:     {info.toc.toc_type}")

Command Line

epubsage inspect book.epub

CLI Commands

epubsage --help

Command    Description
inspect    Display EPUB version, TOC type, and title
extract    Extract EPUB contents to a directory
validate   Validate ZIP structure, OPF, and TOC
toc        Display table of contents as a classified tree
content    Extract content for a specific section

Full CLI documentation →

Shell Auto-Completion

epubsage --install-completion

Supported shells: Bash, Zsh, Fish, PowerShell

Python API

Core Functions

Function              Description
inspect_epub()        One-step inspection: hash, extract, parse OPF, detect TOC
hash_epub()           SHA-256 hash of EPUB file
extract_epub()        Extract EPUB ZIP to directory
find_opf()            Find the .opf file in extracted EPUB
parse_opf()           Parse OPF for version and TOC info
parse_ncx()           Parse NCX file for navigation map
parse_nav()           Parse EPUB 3 NAV document
classify_navpoint()   Classify a NavPoint by label
classify_children()   Classify children by nesting depth
extract_content()     Extract content as HTML, plaintext, or markdown

Data Types

Type       Description
EpubInfo   Inspection result: version, TOC, OPF path, title
TocInfo    TOC detection: type, paths
NavPoint   Navigation point: label, file, anchor, children, type
NcxData    Parsed NCX: doc title, nav points

Full API documentation →

Examples →

Architecture

epubsage/
├── __init__.py     # Public API: inspect_epub + re-exports
├── classify.py     # classify_navpoint, classify_children
├── cli.py          # CLI: inspect, extract, validate, toc, content
├── content.py      # extract_content (HTML, plaintext, markdown)
├── epub.py         # hash_epub, extract_epub, find_opf
├── exceptions.py   # EpubSageError, InvalidEpubError
├── nav.py          # parse_nav (EPUB 3 NAV)
├── ncx.py          # parse_ncx, NavPoint, NcxData
└── opf.py          # parse_opf, EpubInfo, TocInfo

Development

git clone https://github.com/Abdullah-Wex/epubsage.git
cd epubsage
uv sync
make lint        # ruff
make typecheck   # mypy
make security    # bandit
make test        # pytest
make quality     # all checks

Documentation

Document        Description
API Reference   Complete Python API documentation
CLI Reference   All CLI commands and options
Examples        Practical usage examples
Changelog       Version history

License

MIT License. See LICENSE for details.

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.
