An awesome epub3 library.
python-epub3
python-epub3 is a Python library for managing ePub 3 books.
WARNING: This library is currently under development. Please do not use it in a production environment.
Installation
Install from GitHub:
pip install git+https://github.com/ChenyangGao/python-epub3
Install from PyPI:
pip install python-epub3
Quickstart
Suppose there is a sample.epub whose content.opf file contains
<?xml version="1.0" encoding="UTF-8"?>
<package version="3.3" unique-identifier="pub-id" xmlns="http://www.idpf.org/2007/opf" >
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:identifier id="pub-id">urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342</dc:identifier>
<dc:title>ePub</dc:title>
<dc:language>en</dc:language>
<meta property="dcterms:modified">2989-06-04T00:00:00Z</meta>
</metadata>
<manifest>
<item
id="nav"
href="nav.xhtml"
properties="nav"
media-type="application/xhtml+xml"/>
<item
id="intro"
href="intro.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c1"
href="chap1.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c1-answerkey"
href="chap1-answerkey.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c2"
href="chap2.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c2-answerkey"
href="chap2-answerkey.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c3"
href="chap3.xhtml"
media-type="application/xhtml+xml"/>
<item
id="c3-answerkey"
href="chap3-answerkey.xhtml"
media-type="application/xhtml+xml"/>
<item
id="notes"
href="notes.xhtml"
media-type="application/xhtml+xml"/>
<item
id="cover"
href="images/cover.svg"
properties="cover-image"
media-type="image/svg+xml"/>
<item
id="f1"
href="images/fig1.jpg"
media-type="image/jpeg"/>
<item
id="f2"
href="images/fig2.jpg"
media-type="image/jpeg"/>
<item
id="css"
href="style/book.css"
media-type="text/css"/>
</manifest>
<spine
page-progression-direction="ltr">
<itemref
idref="intro"/>
<itemref
idref="c1"/>
<itemref
idref="c1-answerkey"
linear="no"/>
<itemref
idref="c2"/>
<itemref
idref="c2-answerkey"
linear="no"/>
<itemref
idref="c3"/>
<itemref
idref="c3-answerkey"
linear="no"/>
<itemref
idref="notes"
linear="no"/>
</spine>
</package>
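To make the structure of this file concrete, here is a minimal sketch that reads the manifest with the standard library alone. It does not use python-epub3 and is not how the library is implemented; `SAMPLE_OPF` and `manifest_hrefs` are hypothetical names, and the embedded document is a shortened stand-in for the content.opf above.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used in the ElementPath queries below.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Shortened stand-in for the content.opf above (illustration only).
SAMPLE_OPF = b"""<?xml version="1.0" encoding="UTF-8"?>
<package version="3.3" unique-identifier="pub-id" xmlns="http://www.idpf.org/2007/opf">
  <manifest>
    <item id="nav" href="nav.xhtml" properties="nav" media-type="application/xhtml+xml"/>
    <item id="c1" href="chap1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style/book.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="c1"/>
  </spine>
</package>"""


def manifest_hrefs(opf_bytes):
    """Map each manifest item id to its href, in document order."""
    root = ET.fromstring(opf_bytes)
    return {
        item.get("id"): item.get("href")
        for item in root.iterfind("opf:manifest/opf:item", OPF_NS)
    }


print(manifest_hrefs(SAMPLE_OPF))
# {'nav': 'nav.xhtml', 'c1': 'chap1.xhtml', 'css': 'style/book.css'}
```

Every `<item>` declares one resource of the book; the `<spine>` then refers to those resources by id.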
Import the python-epub3 module
>>> from epub3 import ePub
Create an e-book object; it can take the path of an existing e-book as an argument
>>> book = ePub("sample.epub")
>>> book
<ePub(<{http://www.idpf.org/2007/opf}package>, attrib={'version': '3.0', 'unique-identifier': 'BookId'}) at 0x102a93810>
View metadata
>>> book.metadata
<Metadata(<{http://www.idpf.org/2007/opf}metadata>) at 0x1035c3c50>
[<DCTerm(<{http://purl.org/dc/elements/1.1/}identifier>, attrib={'id': 'BookId'}, text='urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342') at 0x1031ea6d0>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}language>, text='en') at 0x1035e4710>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}title>, text='ePub') at 0x1035a00d0>,
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'property': 'dcterms:modified'}, text='2989-06-04T00:00:00Z') at 0x1035a0850>]
View the identifier, i.e. dc:identifier
>>> identifier = book.identifier
>>> identifier
'urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342'
>>> isinstance(identifier, str)
True
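In an OPF document, the `unique-identifier` attribute of `<package>` names the id of the `dc:identifier` element that identifies the publication. The sketch below shows that lookup with the standard library only. It is an illustration of the format, not the library's code; `unique_identifier` is a hypothetical helper and the ISBN value is made up.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical package document with two identifiers; only one is the unique one.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<package version="3.3" unique-identifier="pub-id" xmlns="http://www.idpf.org/2007/opf">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="isbn">urn:isbn:0000000000000</dc:identifier>
    <dc:identifier id="pub-id">urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342</dc:identifier>
  </metadata>
</package>"""


def unique_identifier(opf_bytes):
    """Return the text of the dc:identifier referenced by package/@unique-identifier."""
    root = ET.fromstring(opf_bytes)
    wanted_id = root.get("unique-identifier")
    for element in root.iterfind("opf:metadata/dc:identifier", NS):
        if element.get("id") == wanted_id:
            return (element.text or "").strip()
    return None  # no matching dc:identifier found


print(unique_identifier(SAMPLE))
# urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342
```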
View and modify the title, i.e. dc:title
>>> title = book.title
>>> title
'ePub'
>>> book.title = "my first book"
>>> book.title
'my first book'
View and modify the language, i.e. dc:language
>>> language = book.language
>>> language
'en'
>>> book.language = "en-US"
>>> book.language
'en-US'
View and update the modification time 😂
>>> book.modified
'2989-06-04T00:00:00Z'
>>> book.mark_modified()
'3000-01-01T00:00:00Z'
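The `dcterms:modified` value is a UTC timestamp of the form `CCYY-MM-DDThh:mm:ssZ`. A minimal sketch of producing such a string with the standard library follows; `opf_timestamp` is a hypothetical helper and is not claimed to be what `mark_modified` does internally.

```python
from datetime import datetime, timedelta, timezone


def opf_timestamp(moment=None):
    """Format a timezone-aware datetime as CCYY-MM-DDThh:mm:ssZ (UTC).

    With no argument, the current time is used.
    """
    if moment is None:
        moment = datetime.now(timezone.utc)
    # Convert to UTC first so the trailing "Z" is truthful.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(opf_timestamp(datetime(3000, 1, 1, tzinfo=timezone.utc)))
# 3000-01-01T00:00:00Z

# A time given in UTC+01:00 is shifted back by one hour.
plus_one = timezone(timedelta(hours=1))
print(opf_timestamp(datetime(2024, 1, 1, 1, 0, 0, tzinfo=plus_one)))
# 2024-01-01T00:00:00Z
```

Pass timezone-aware datetimes; `astimezone` interprets a naive datetime as local time.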
View metadata again
>>> book.metadata
<Metadata(<{http://www.idpf.org/2007/opf}metadata>) at 0x1075cdfd0>
[<DCTerm(<{http://purl.org/dc/elements/1.1/}identifier>, attrib={'id': 'BookId'}, text='urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342') at 0x10750c350>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}language>, text='en-US') at 0x10a6835d0>,
<DCTerm(<{http://purl.org/dc/elements/1.1/}title>, text='my first book') at 0x10a682550>,
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'property': 'dcterms:modified'}, text='3000-01-01T00:00:00Z') at 0x10a77f6d0>]
View manifest
>>> book.manifest
{'nav': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>,
'intro': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'intro', 'href': 'intro.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2190>,
'c1': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1', 'href': 'chap1.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e25d0>,
'c1-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1-answerkey', 'href': 'chap1-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2990>,
'c2': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2', 'href': 'chap2.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e3350>,
'c2-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2-answerkey', 'href': 'chap2-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075aded0>,
'c3': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3', 'href': 'chap3.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075af950>,
'c3-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3-answerkey', 'href': 'chap3-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae710>,
'notes': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'notes', 'href': 'notes.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae3d0>,
'cover': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'cover', 'href': 'images/cover.svg', 'properties': 'cover-image', 'media-type': 'image/svg+xml'}) at 0x1075ae610>,
'f1': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f1', 'href': 'images/fig1.jpg', 'media-type': 'image/jpeg'}) at 0x109a39950>,
'f2': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f2', 'href': 'images/fig2.jpg', 'media-type': 'image/jpeg'}) at 0x107534310>,
'css': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'css', 'href': 'style/book.css', 'media-type': 'text/css'}) at 0x107534290>}
>>> book.manifest.list()
[<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'intro', 'href': 'intro.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2190>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1', 'href': 'chap1.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e25d0>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1-answerkey', 'href': 'chap1-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2990>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2', 'href': 'chap2.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e3350>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2-answerkey', 'href': 'chap2-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075aded0>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3', 'href': 'chap3.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075af950>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3-answerkey', 'href': 'chap3-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae710>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'notes', 'href': 'notes.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae3d0>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'cover', 'href': 'images/cover.svg', 'properties': 'cover-image', 'media-type': 'image/svg+xml'}) at 0x1075ae610>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f1', 'href': 'images/fig1.jpg', 'media-type': 'image/jpeg'}) at 0x109a39950>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f2', 'href': 'images/fig2.jpg', 'media-type': 'image/jpeg'}) at 0x107534310>,
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'css', 'href': 'style/book.css', 'media-type': 'text/css'}) at 0x107534290>]
Get an item
>>> book.manifest[0]
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>
>>> book.manifest['nav']
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>
>>> book.manifest('nav.xhtml')
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>
View spine
>>> book.spine
{'intro': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>,
'c1': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1'}) at 0x109a88ed0>,
'c1-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1-answerkey'}) at 0x109a88f50>,
'c2': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2'}) at 0x109a89110>,
'c2-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2-answerkey'}) at 0x109a891d0>,
'c3': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3'}) at 0x109a89290>,
'c3-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3-answerkey'}) at 0x109a89350>,
'notes': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'notes'}) at 0x109a893d0>}
>>> book.spine.list()
[<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1'}) at 0x109a88ed0>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1-answerkey'}) at 0x109a88f50>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2'}) at 0x109a89110>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2-answerkey'}) at 0x109a891d0>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3'}) at 0x109a89290>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3-answerkey'}) at 0x109a89350>,
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'notes'}) at 0x109a893d0>]
Get an itemref
>>> book.spine[0]
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>
>>> book.spine['intro']
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>
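The spine lists itemrefs in reading order, and `linear="no"` marks auxiliary content such as the answer keys above. The following standard-library sketch resolves each itemref to its manifest href and separates the two groups. It is independent of python-epub3; `reading_order` and `SAMPLE` are hypothetical names.

```python
import xml.etree.ElementTree as ET

NS = {"opf": "http://www.idpf.org/2007/opf"}

# Shortened, hypothetical package document.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<package version="3.3" unique-identifier="pub-id" xmlns="http://www.idpf.org/2007/opf">
  <manifest>
    <item id="intro" href="intro.xhtml" media-type="application/xhtml+xml"/>
    <item id="c1" href="chap1.xhtml" media-type="application/xhtml+xml"/>
    <item id="c1-answerkey" href="chap1-answerkey.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine page-progression-direction="ltr">
    <itemref idref="intro"/>
    <itemref idref="c1"/>
    <itemref idref="c1-answerkey" linear="no"/>
  </spine>
</package>"""


def reading_order(opf_bytes):
    """Return (linear_hrefs, auxiliary_hrefs) in spine order."""
    root = ET.fromstring(opf_bytes)
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.iterfind("opf:manifest/opf:item", NS)
    }
    linear, auxiliary = [], []
    for itemref in root.iterfind("opf:spine/opf:itemref", NS):
        href = hrefs[itemref.get("idref")]
        # The linear attribute defaults to "yes" when absent.
        if itemref.get("linear") == "no":
            auxiliary.append(href)
        else:
            linear.append(href)
    return linear, auxiliary


print(reading_order(SAMPLE))
# (['intro.xhtml', 'chap1.xhtml'], ['chap1-answerkey.xhtml'])
```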
Add a file
>>> item = book.manifest.add("chapter0001.xhtml", id="chapter0001")
>>> item
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'chapter0001', 'href': 'chapter0001.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1079bb190>
Open and write some textual data to it
>>> file = item.open("w")
>>> file
<_io.TextIOWrapper name='/var/folders/k1/3r19jl7d30n834vdmbz9ygh80000gn/T/tmpzubn_x2f/69bccdc4-50b5-404a-8117-33fe47648f3a' encoding='utf-8'>
>>> file.write('''<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>
... <html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
... <head>
... <title></title>
... </head>
... <body>
... <p> </p>
... </body>
... </html>''')
211
>>> file.close()
Read it again
>>> print(item.read_text())
<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
<title></title>
</head>
<body>
<p> </p>
</body>
</html>
Add the item to spine
>>> book.spine.add(item)
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'chapter0001'}) at 0x1133e4510>
Add an external file
>>> item = book.manifest.add("features.js", "js/features.js")
>>> item
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c8d322e0-a960-44ea-bf15-66d1dbbce15d', 'href': 'features.js', 'media-type': 'text/javascript'}) at 0x1038db390>
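In the output above, the new item received a `media-type` without one being passed. How the library infers it is not shown here; the sketch below is one plausible extension-based approach using the standard `mimetypes` module, with a small explicit table for common EPUB resource types. `EPUB_TYPES` and `guess_media_type` are hypothetical names.

```python
import mimetypes
from pathlib import PurePosixPath

# Explicit entries for common EPUB resources, so results do not depend on
# the platform's MIME database.
EPUB_TYPES = {
    ".xhtml": "application/xhtml+xml",
    ".svg": "image/svg+xml",
    ".css": "text/css",
    ".jpg": "image/jpeg",
}


def guess_media_type(href):
    """Guess a media type from the file extension of an href."""
    suffix = PurePosixPath(href).suffix.lower()
    if suffix in EPUB_TYPES:
        return EPUB_TYPES[suffix]
    # Fall back to the standard library; its answer can vary by Python
    # version and platform (for example for .js files).
    guessed, _encoding = mimetypes.guess_type(href)
    return guessed or "application/octet-stream"


print(guess_media_type("chapter0001.xhtml"))
# application/xhtml+xml
print(guess_media_type("images/cover.SVG"))
# image/svg+xml
```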
Add a dc:creator metadata element
>>> book.metadata.add("dc:creator", dict(id="creator"), text="ChenyangGao")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
Add a <meta> metadata element
>>> book.metadata.add("meta", dict(refines="#creator", property="role", scheme="marc:relators", id="role"), text="author")
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>
Find metadata
>>> book.metadata.find("dc:creator")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
>>> book.metadata.dc("creator")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
>>> book.metadata.meta('[@property="role"]')
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>
>>> book.metadata.property_meta("role")
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>
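A `<meta>` element with `refines="#creator"` attaches extra information to the element whose id is `creator`. As an illustration of that relationship, this standard-library sketch collects all refinements of a given id with an ElementPath attribute predicate. It is not the library's API; `refinements` and `SAMPLE` are hypothetical names.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical metadata block mirroring the two elements added above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<package version="3.3" unique-identifier="pub-id" xmlns="http://www.idpf.org/2007/opf">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:creator id="creator">ChenyangGao</dc:creator>
    <meta refines="#creator" property="role" scheme="marc:relators" id="role">author</meta>
    <meta property="dcterms:modified">3000-01-01T00:00:00Z</meta>
  </metadata>
</package>"""


def refinements(opf_bytes, element_id):
    """Return {property: text} for every <meta> that refines the given id."""
    root = ET.fromstring(opf_bytes)
    path = f"opf:metadata/opf:meta[@refines='#{element_id}']"
    return {
        meta.get("property"): (meta.text or "")
        for meta in root.iterfind(path, NS)
    }


print(refinements(SAMPLE, "creator"))
# {'role': 'author'}
```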
Pack the book
>>> book.pack("book_i_made.epub")
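For background on what packing involves: an EPUB file is a ZIP container whose first entry must be an uncompressed file named `mimetype` containing `application/epub+zip`. The sketch below builds such a container with the standard `zipfile` module. It is not the implementation of `book.pack`; `pack_epub` and the placeholder file contents are assumptions for illustration.

```python
import io
import zipfile

CONTAINER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def pack_epub(files, target):
    """Write a mapping of archive name -> bytes into an EPUB-style ZIP.

    target may be a path or a writable binary file object.
    """
    with zipfile.ZipFile(target, "w") as archive:
        # The mimetype entry goes first and must be stored, not compressed.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        for name, data in files.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


# Usage: pack into memory and inspect the first entry.
buffer = io.BytesIO()
pack_epub(
    {
        "META-INF/container.xml": CONTAINER_XML,
        "content.opf": b"<package/>",  # placeholder, not a valid package document
    },
    buffer,
)
with zipfile.ZipFile(buffer) as archive:
    print(archive.namelist())
# ['mimetype', 'META-INF/container.xml', 'content.opf']
```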
See the tutorial for more details.
Features
- Proxies the underlying XML element nodes to operate on the OPF document.
- Supports querying nodes using ElementPath.
- The manifest supports file system interfaces modeled on os.path, shutil, and pathlib.Path.
- Numerous lazy loading features, in the spirit of Occam's razor.
  Entities should not be multiplied unnecessarily.
  -- Occam's razor
  We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.
  -- Isaac Newton
  Everything should be made as simple as possible, but no simpler.
  -- Albert Einstein
- Instances are cached rather than created repeatedly, and are recycled in a timely manner.
- Any openable file can be added, as long as it has an open method whose parameters are compatible with open.
- Stream processing, supporting operators such as map, reduce, and filter.
- Various proxies and bindings provide multiple ways to achieve the same operation.
Documentation