Parse an ebook based on its TOC into a tree-like structure
ebook-tree-parser
Uses ebooklib to parse an ebook into a tree-like structure based on its table of contents (TOC).
Usage
from ebooklib import epub
from ebook_tree_parser.toctree import TocTree
file = "../data/frankenstein.epub"
book = epub.read_epub(file, options={'ignore_ncx': False})
estimator = lambda string: len(string)*4
tree = TocTree(book, token_estimator=estimator)
print(tree)
for node in tree:
    print("----")
    print(f"{node.title}|{node.content_token_count}\n{node.content[:50]}")
    print("----")
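The `token_estimator` argument in the usage above is a plain callable that takes a string and returns a number. As a minimal sketch of an alternative estimator, the function below assumes roughly four characters per token, a common rule of thumb for English text and not something this project specifies. The name `estimate_tokens` and the `chars_per_token` parameter are hypothetical and not part of ebook_tree_parser.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate a token count from the character length.

    Assumption: about `chars_per_token` characters make up one token.
    This is a heuristic, not the output of a real tokenizer.
    """
    if not text:
        return 0
    # Round up so that any non-empty text counts as at least one token.
    return math.ceil(len(text) / chars_per_token)
```

If `TocTree` accepts any string-to-number callable, as the lambda in the usage suggests, this could be passed as `token_estimator=estimate_tokens`. Note that it scales in the opposite direction from the lambda above, which multiplies the length by 4; which one fits depends on the tokenizer being approximated.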
Development
- Create a virtual environment
- pip install -e .
- Make sure to update pyproject.toml with the correct dependencies
Hashes for ebook_tree_parser-0.1.2-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 26643dd1443d34a927c9bfd503a15401afc2ee38f40379ed49fab334339ad970 |
MD5 | f71273994873889742b22b742bbea3a8 |
BLAKE2b-256 | 4cdf21e4c5a0a31975fc09013443d5fb2892fe8581dd4c54ddec4bd097e9afab |