An awsome epub3 library.
Project description
python-epub3
An awsome epub3 library.
python-epub3 is a Python library for managing ePub 3 books.
WARNING Currently under development, please do not use in production environment.
Installation
Install through github:
pip install git+https://github.com/ChenyangGao/python-epub3
Install through pypi:
pip install python-epub3
Quickstart
Let's say there is a sample.epub
, with the content.opf
file content is
<?xml version="1.0" encoding="UTF-8"?>
<package version="3.3" unique-identifier="pub-id" xmlns="http://www.idpf.org/2007/opf" >
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:identifier id="pub-id">urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342</dc:identifier>
<dc:title>ePub</dc:title>
<dc:language>en</dc:language>
<meta property="dcterms:modified">2989-06-04T00:00:00Z</meta>
</metadata>
<manifest>
<item
id="nav"
href="nav.xhtml"
properties="nav"
media-type="application/xhtml+xml"/>
<item
id="intro"
href="intro.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c1"
href="chap1.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c1-answerkey"
href="chap1-answerkey.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c2"
href="chap2.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c2-answerkey"
href="chap2-answerkey.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c3"
href="chap3.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c3-answerkey"
href="chap3-answerkey.xhtml"
media-type="application/xhtml+xml"/>
<item
id="notes"
href="notes.xhtml"
media-type="application/xhtml+xml"/>
<item
id="cover"
href="images/cover.svg"
properties="cover-image"
media-type="image/svg+xml"/>
<item
id="f1"
href="images/fig1.jpg"
media-type="image/jpeg"/>
<item
id="f2"
href="images/fig2.jpg"
media-type="image/jpeg"/>
<item
id="css"
href="style/book.css"
media-type="text/css"/>
</manifest>
<spine
page-progression-direction="ltr">
<itemref
idref="intro"/>
<itemref
idref="c1"/>
<itemref
idref="c1-answerkey"
linear="no"/>
<itemref
idref="c2"/>
<itemref
idref="c2-answerkey"
linear="no"/>
<itemref
idref="c3"/>
<itemref
idref="c3-answerkey"
linear="no"/>
<itemref
idref="notes"
linear="no"/>
</spine>
</package>
Import the python-epub3
module
>>> from epub3 import ePub
Create an e-book, which can take an actual existing e-book path as argument
>>> book = ePub("sample.epub")
>>> book
<ePub(<{http://www.idpf.org/2007/opf}package>, attrib={'version': '3.0', 'unique-identifier': 'BookId'}) at 0x102a93810>
View metadata
>>> book.metadata
<Metadata(<{http://www.idpf.org/2007/opf}metadata>) at 0x1035c3c50>
[<DCTerm(<{http://purl.org/dc/elements/1.1/}identifier>, attrib={'id': 'BookId'}, text='urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342') at 0x1031ea6d0>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}language>, text='en') at 0x1035e4710>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}title>, text='ePub') at 0x1035a00d0>,
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'property': 'dcterms:modified'}, text='2989-06-04T00:00:00Z') at 0x1035a0850>]
View the identifier, i.e. dc:identifier
>>> identifier = book.identifier
>>> identifier
'urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342'
>>> isinstance(identifier, str)
True
View and modify the title, i.e. dc:title
>>> title = book.title
>>> title
'ePub'
>>> book.title = "my first book"
>>> title
'my first book'
View and modify the language, i.e. dc:language
>>> language = book.language
>>> language
'en'
>>> book.language = "en-US"
>>> language
'en-US'
View and update the modification time 😂
>>> book.modified
'2989-06-04T00:00:00Z'
>>> e.mark_modified()
'3000-01-01T00:00:00Z'
View metadata again
>>> book.metadata
<Metadata(<{http://www.idpf.org/2007/opf}metadata>) at 0x1075cdfd0>
[<DCTerm(<{http://purl.org/dc/elements/1.1/}identifier>, attrib={'id': 'BookId'}, text='urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342') at 0x10750c350>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}language>, text='en') at 0x10a6835d0>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}title>, text='ePub') at 0x10a682550>,
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'property': 'dcterms:modified'}, text='3000-01-01T00:00:00Z') at 0x10a77f6d0>]
View manifest
>>> book.manifest
{'nav': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>,
'intro': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'intro', 'href': 'intro.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2190>,
'c1': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1', 'href': 'chap1.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e25d0>,
'c1-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1-answerkey', 'href': 'chap1-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2990>,
'c2': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2', 'href': 'chap2.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e3350>,
'c2-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2-answerkey', 'href': 'chap2-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075aded0>,
'c3': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3', 'href': 'chap3.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075af950>,
'c3-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3-answerkey', 'href': 'chap3-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae710>,
'notes': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'notes', 'href': 'notes.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae3d0>,
'cover': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'cover', 'href': 'images/cover.svg', 'properties': 'cover-image', 'media-type': 'image/svg+xml'}) at 0x1075ae610>,
'f1': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f1', 'href': 'images/fig1.jpg', 'media-type': 'image/jpeg'}) at 0x109a39950>,
'f2': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f2', 'href': 'images/fig2.jpg', 'media-type': 'image/jpeg'}) at 0x107534310>,
'css': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'css', 'href': 'style/book.css', 'media-type': 'text/css'}) at 0x107534290>}
>>> book.manifest.list()
[<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'intro', 'href': 'intro.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2190>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1', 'href': 'chap1.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e25d0>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1-answerkey', 'href': 'chap1-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2990>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2', 'href': 'chap2.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e3350>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2-answerkey', 'href': 'chap2-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075aded0>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3', 'href': 'chap3.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075af950>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3-answerkey', 'href': 'chap3-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae710>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'notes', 'href': 'notes.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae3d0>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'cover', 'href': 'images/cover.svg', 'properties': 'cover-image', 'media-type': 'image/svg+xml'}) at 0x1075ae610>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f1', 'href': 'images/fig1.jpg', 'media-type': 'image/jpeg'}) at 0x109a39950>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f2', 'href': 'images/fig2.jpg', 'media-type': 'image/jpeg'}) at 0x107534310>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'css', 'href': 'style/book.css', 'media-type': 'text/css'}) at 0x107534290>]
Get an item
>>> book.manifest[0]
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>
>>>book.manifest['nav']
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>
>>> book.manifest('nav.xhtml')
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>
View spine
>>> book.spine
{'intro': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>,
'c1': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1'}) at 0x109a88ed0>,
'c1-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1-answerkey'}) at 0x109a88f50>,
'c2': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2'}) at 0x109a89110>,
'c2-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2-answerkey'}) at 0x109a891d0>,
'c3': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3'}) at 0x109a89290>,
'c3-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3-answerkey'}) at 0x109a89350>,
'notes': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'notes'}) at 0x109a893d0>}
>>> book.spine.list()
[<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1'}) at 0x109a88ed0>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1-answerkey'}) at 0x109a88f50>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2'}) at 0x109a89110>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2-answerkey'}) at 0x109a891d0>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3'}) at 0x109a89290>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3-answerkey'}) at 0x109a89350>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'notes'}) at 0x109a893d0>]
Get an itemref
>>> book.spine[0]
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>
>>>book.manifest['intro']
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>
Add a file
>>> item = book.manifest.add("chapter0001.xhtml", id="chapter0001")
>>> item
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'chapter0001', 'href': 'chapter0001.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1079bb190>
Open and write some textual data to it
>>> file = item.open("w")
>>> file
<_io.TextIOWrapper name='/var/folders/k1/3r19jl7d30n834vdmbz9ygh80000gn/T/tmpzubn_x2f/69bccdc4-50b5-404a-8117-33fe47648f3a' encoding='utf-8'>
>>> file.write('''<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>
... <html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
... <head>
... <title></title>
... </head>
... <body>
... <p> </p>
... </body>
... </html>''')
211
>>> file.close()
Read it again
>>> print(item.read_text())
<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
<title></title>
</head>
<body>
<p> </p>
</body>
</html>
Add the item to spine
>>> book.spine.add(item)
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'chapter0001'}) at 0x1133e4510>
Add an external file
>>> item = book.manifest.add("features.js", "js/features.js")
>>> item
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c8d322e0-a960-44ea-bf15-66d1dbbce15d', 'href': 'features.js', 'media-type': 'text/javascript'}) at 0x1038db390>
Add a dc:creator
metadata
>>> book.metadata.add("dc:creator", dict(id="creator"), text="ChenyangGao")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
Add a <meta>
metadata
>>> book.metadata.add("meta", dict(refines="#creator", property="role", scheme="marc:relators", id="role"), text="author")
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>
Find metadata
>>> book.metadata.find("dc:creator")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
>>> book.metadata.dc("creator")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
>>> book.metadata.meta('[@property="role"]')
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>
>>> book.metadata.property_meta("role")
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>
Pack the book
>>> book.pack("book_i_made.epub")
View tutorial for more details.
Features
-
Proxy underlying XML element nodes to operate on OPF document.
-
Support querying nodes using ElementPath.
-
Manifest supports file system interfaces, referenced os.path, shutil, pathlib.Path.
-
Numerous lazy loading features, just like Occam's razor.
Entities should not be multiplied unnecessarily.
-- Occam's razorWe are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.
-- Isaac NewtonEverything should be made as simple as possible, but no simpler.
-- Albert Einstein -
Caching instance, not created repeatedly, and recycled in a timely manner.
-
Allow adding any openable files, as long as there is an open method and its parameters are compatible with open.
-
Stream processing, supporting various operators such as map, reduce, filter, etc.
-
Various proxies and bindings fully realize multiple ways to achieve the same operational objective.
Documentation
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file python_epub3-0.0.1.15.tar.gz
.
File metadata
- Download URL: python_epub3-0.0.1.15.tar.gz
- Upload date:
- Size: 47.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.0 CPython/3.11.4 Darwin/21.6.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 409ec7543864f90e166dba0de6ebfd6ef0882af31040056a63166c8b6e0fbffb |
|
MD5 | 508ca890373dd9b92435d5f57d649ef9 |
|
BLAKE2b-256 | 37e5ebcff11f2b42c011bd47d5d7ead314d04b4b06c6e0d7aadb2c467757063b |
File details
Details for the file python_epub3-0.0.1.15-py3-none-any.whl
.
File metadata
- Download URL: python_epub3-0.0.1.15-py3-none-any.whl
- Upload date:
- Size: 50.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.0 CPython/3.11.4 Darwin/21.6.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1b786e6fd555410d2acdbacd21ce0c85ab178ad781752cb5574307d2ccbb0e93 |
|
MD5 | 547ca4966dc110ac2c23d3fa8146cb57 |
|
BLAKE2b-256 | 513a087088fed2fd8b25a4f34ec6032dafb0464217f688c6795337e74f797fed |