Skip to main content

An awsome epub3 library.

Project description

python-epub3

An awsome epub3 library.

PyPI - Python Version PyPI - Version PyPI - Downloads PyPI - Format PyPI - Status

GitHub GitHub all releases GitHub language count GitHub issues Codecov

python-epub3 is a Python library for managing ePub 3 books.

WARNING Currently under development, please do not use in production environment.

Installation

Install through github:

pip install git+https://github.com/ChenyangGao/python-epub3

Install through pypi:

pip install python-epub3

Quickstart

Let's say there is a sample.epub, with the content.opf file content is

<?xml version="1.0" encoding="UTF-8"?>
<package version="3.3" unique-identifier="pub-id" xmlns="http://www.idpf.org/2007/opf" >
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
       <dc:identifier id="pub-id">urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342</dc:identifier>
       <dc:title>ePub</dc:title>
       <dc:language>en</dc:language>
       <meta property="dcterms:modified">2989-06-04T00:00:00Z</meta>
    </metadata>
   <manifest>
      <item
          id="nav"
          href="nav.xhtml"
          properties="nav"
          media-type="application/xhtml+xml"/>
      <item
          id="intro"
          href="intro.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="c1"
          href="chap1.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="c1-answerkey"
          href="chap1-answerkey.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="c2"
          href="chap2.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="c2-answerkey"
          href="chap2-answerkey.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="c3"
          href="chap3.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="c3-answerkey"
          href="chap3-answerkey.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="notes"
          href="notes.xhtml"
          media-type="application/xhtml+xml"/>
      <item
          id="cover"
          href="images/cover.svg"
          properties="cover-image"
          media-type="image/svg+xml"/>
      <item
          id="f1"
          href="images/fig1.jpg"
          media-type="image/jpeg"/>
      <item
          id="f2"
          href="images/fig2.jpg"
          media-type="image/jpeg"/>
      <item
          id="css"
          href="style/book.css"
          media-type="text/css"/>
   </manifest>
    <spine
        page-progression-direction="ltr">
    <itemref
        idref="intro"/>
    <itemref
        idref="c1"/>
    <itemref
        idref="c1-answerkey"
        linear="no"/>
    <itemref
        idref="c2"/>
    <itemref
        idref="c2-answerkey"
        linear="no"/>
    <itemref
        idref="c3"/>
    <itemref
        idref="c3-answerkey"
        linear="no"/>
    <itemref
        idref="notes"
        linear="no"/>
    </spine>
</package>

Import the python-epub3 module

>>> from epub3 import ePub

Create an e-book, which can take an actual existing e-book path as argument

>>> book = ePub("sample.epub")
>>> book
<ePub(<{http://www.idpf.org/2007/opf}package>, attrib={'version': '3.0', 'unique-identifier': 'BookId'}) at 0x102a93810>

View metadata

>>> book.metadata
<Metadata(<{http://www.idpf.org/2007/opf}metadata>) at 0x1035c3c50>
[<DCTerm(<{http://purl.org/dc/elements/1.1/}identifier>, attrib={'id': 'BookId'}, text='urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342') at 0x1031ea6d0>,
 <DCTerm(<{http://purl.org/dc/elements/1.1/}language>, text='en') at 0x1035e4710>,
 <DCTerm(<{http://purl.org/dc/elements/1.1/}title>, text='ePub') at 0x1035a00d0>,
 <Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'property': 'dcterms:modified'}, text='2989-06-04T00:00:00Z') at 0x1035a0850>]

View the identifier, i.e. dc:identifier

>>> identifier = book.identifier
>>> identifier
'urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342'
>>> isinstance(identifier, str)
True

View and modify the title, i.e. dc:title

>>> title = book.title
>>> title
'ePub'
>>> book.title = "my first book"
>>> title
'my first book'

View and modify the language, i.e. dc:language

>>> language = book.language
>>> language
'en'
>>> book.language = "en-US"
>>> language
'en-US'

View and update the modification time 😂

>>> book.modified
'2989-06-04T00:00:00Z'
>>> e.mark_modified()
'3000-01-01T00:00:00Z'

View metadata again

>>> book.metadata
<Metadata(<{http://www.idpf.org/2007/opf}metadata>) at 0x1075cdfd0>
[<DCTerm(<{http://purl.org/dc/elements/1.1/}identifier>, attrib={'id': 'BookId'}, text='urn:uuid:bb4d4afe-f787-4d21-97b8-68f6774ba342') at 0x10750c350>,
 <DCTerm(<{http://purl.org/dc/elements/1.1/}language>, text='en') at 0x10a6835d0>,
 <DCTerm(<{http://purl.org/dc/elements/1.1/}title>, text='ePub') at 0x10a682550>,
 <Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'property': 'dcterms:modified'}, text='3000-01-01T00:00:00Z') at 0x10a77f6d0>]

View manifest

>>> book.manifest
{'nav': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>,
 'intro': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'intro', 'href': 'intro.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2190>,
 'c1': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1', 'href': 'chap1.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e25d0>,
 'c1-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1-answerkey', 'href': 'chap1-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2990>,
 'c2': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2', 'href': 'chap2.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e3350>,
 'c2-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2-answerkey', 'href': 'chap2-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075aded0>,
 'c3': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3', 'href': 'chap3.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075af950>,
 'c3-answerkey': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3-answerkey', 'href': 'chap3-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae710>,
 'notes': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'notes', 'href': 'notes.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae3d0>,
 'cover': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'cover', 'href': 'images/cover.svg', 'properties': 'cover-image', 'media-type': 'image/svg+xml'}) at 0x1075ae610>,
 'f1': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f1', 'href': 'images/fig1.jpg', 'media-type': 'image/jpeg'}) at 0x109a39950>,
 'f2': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f2', 'href': 'images/fig2.jpg', 'media-type': 'image/jpeg'}) at 0x107534310>,
 'css': <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'css', 'href': 'style/book.css', 'media-type': 'text/css'}) at 0x107534290>}

>>> book.manifest.list()
[<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'intro', 'href': 'intro.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2190>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1', 'href': 'chap1.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e25d0>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c1-answerkey', 'href': 'chap1-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e2990>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2', 'href': 'chap2.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1073e3350>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c2-answerkey', 'href': 'chap2-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075aded0>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3', 'href': 'chap3.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075af950>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c3-answerkey', 'href': 'chap3-answerkey.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae710>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'notes', 'href': 'notes.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1075ae3d0>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'cover', 'href': 'images/cover.svg', 'properties': 'cover-image', 'media-type': 'image/svg+xml'}) at 0x1075ae610>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f1', 'href': 'images/fig1.jpg', 'media-type': 'image/jpeg'}) at 0x109a39950>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'f2', 'href': 'images/fig2.jpg', 'media-type': 'image/jpeg'}) at 0x107534310>,
 <Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'css', 'href': 'style/book.css', 'media-type': 'text/css'}) at 0x107534290>]

Get an item

>>> book.manifest[0]
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>

>>>book.manifest['nav'] 
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>

>>> book.manifest('nav.xhtml')
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'nav', 'href': 'nav.xhtml', 'properties': 'nav', 'media-type': 'application/xhtml+xml'}) at 0x1073e1e10>

View spine

>>> book.spine
{'intro': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>,
 'c1': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1'}) at 0x109a88ed0>,
 'c1-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1-answerkey'}) at 0x109a88f50>,
 'c2': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2'}) at 0x109a89110>,
 'c2-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2-answerkey'}) at 0x109a891d0>,
 'c3': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3'}) at 0x109a89290>,
 'c3-answerkey': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3-answerkey'}) at 0x109a89350>,
 'notes': <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'notes'}) at 0x109a893d0>}

>>> book.spine.list()
[<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>,
 <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1'}) at 0x109a88ed0>,
 <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c1-answerkey'}) at 0x109a88f50>,
 <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2'}) at 0x109a89110>,
 <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c2-answerkey'}) at 0x109a891d0>,
 <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3'}) at 0x109a89290>,
 <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'c3-answerkey'}) at 0x109a89350>,
 <Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'notes'}) at 0x109a893d0>]

Get an itemref

>>> book.spine[0]
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>

>>>book.manifest['intro'] 
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'intro'}) at 0x107533c90>

Add a file

>>> item = book.manifest.add("chapter0001.xhtml", id="chapter0001")
>>> item
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'chapter0001', 'href': 'chapter0001.xhtml', 'media-type': 'application/xhtml+xml'}) at 0x1079bb190>

Open and write some textual data to it

>>> file = item.open("w")
>>> file
<_io.TextIOWrapper name='/var/folders/k1/3r19jl7d30n834vdmbz9ygh80000gn/T/tmpzubn_x2f/69bccdc4-50b5-404a-8117-33fe47648f3a' encoding='utf-8'>
>>> file.write('''<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>
... <html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
... <head>
...   <title></title>
... </head>
... <body>
...   <p>&#160;</p>
... </body>
... </html>''')
211
>>> file.close()

Read it again

>>> print(item.read_text())
<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
  <title></title>
</head>
<body>
  <p>&#160;</p>
</body>
</html>

Add the item to spine

>>> book.spine.add(item)
<Itemref(<{http://www.idpf.org/2007/opf}itemref>, attrib={'idref': 'chapter0001'}) at 0x1133e4510>

Add an external file

>>> item = book.manifest.add("features.js", "js/features.js")
>>> item
<Item(<{http://www.idpf.org/2007/opf}item>, attrib={'id': 'c8d322e0-a960-44ea-bf15-66d1dbbce15d', 'href': 'features.js', 'media-type': 'text/javascript'}) at 0x1038db390>

Add a dc:creator metadata

>>> book.metadata.add("dc:creator", dict(id="creator"), text="ChenyangGao")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>

Add a <meta> metadata

>>> book.metadata.add("meta", dict(refines="#creator", property="role", scheme="marc:relators", id="role"), text="author")
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>

Find metadata

>>> book.metadata.find("dc:creator")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
>>> book.metadata.dc("creator")
<DCTerm(<{http://purl.org/dc/elements/1.1/}creator>, attrib={'id': 'creator'}, text='ChenyangGao') at 0x103ced950>
>>> book.metadata.meta('[@property="role"]')
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>
>>> book.metadata.property_meta("role")
<Meta(<{http://www.idpf.org/2007/opf}meta>, attrib={'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators', 'id': 'role'}, text='author') at 0x105128a50>

Pack the book

>>> book.pack("book_i_made.epub")

View tutorial for more details.

Features

  • Proxy underlying XML element nodes to operate on OPF document.

  • Support querying nodes using ElementPath.

  • Manifest supports file system interfaces, referenced os.path, shutil, pathlib.Path.

  • Numerous lazy loading features, just like Occam's razor.

    Entities should not be multiplied unnecessarily.
    -- Occam's razor

    We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.
    -- Isaac Newton

    Everything should be made as simple as possible, but no simpler.
    -- Albert Einstein

  • Caching instance, not created repeatedly, and recycled in a timely manner.

  • Allow adding any openable files, as long as there is an open method and its parameters are compatible with open.

  • Stream processing, supporting various operators such as map, reduce, filter, etc.

  • Various proxies and bindings fully realize multiple ways to achieve the same operational objective.

Documentation

https://python-epub3.readthedocs.io

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

python_epub3-0.0.1.15.tar.gz (47.2 kB view details)

Uploaded Source

Built Distribution

python_epub3-0.0.1.15-py3-none-any.whl (50.0 kB view details)

Uploaded Python 3

File details

Details for the file python_epub3-0.0.1.15.tar.gz.

File metadata

  • Download URL: python_epub3-0.0.1.15.tar.gz
  • Upload date:
  • Size: 47.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.0 CPython/3.11.4 Darwin/21.6.0

File hashes

Hashes for python_epub3-0.0.1.15.tar.gz
Algorithm Hash digest
SHA256 409ec7543864f90e166dba0de6ebfd6ef0882af31040056a63166c8b6e0fbffb
MD5 508ca890373dd9b92435d5f57d649ef9
BLAKE2b-256 37e5ebcff11f2b42c011bd47d5d7ead314d04b4b06c6e0d7aadb2c467757063b

See more details on using hashes here.

File details

Details for the file python_epub3-0.0.1.15-py3-none-any.whl.

File metadata

  • Download URL: python_epub3-0.0.1.15-py3-none-any.whl
  • Upload date:
  • Size: 50.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.0 CPython/3.11.4 Darwin/21.6.0

File hashes

Hashes for python_epub3-0.0.1.15-py3-none-any.whl
Algorithm Hash digest
SHA256 1b786e6fd555410d2acdbacd21ce0c85ab178ad781752cb5574307d2ccbb0e93
MD5 547ca4966dc110ac2c23d3fa8146cb57
BLAKE2b-256 513a087088fed2fd8b25a4f34ec6032dafb0464217f688c6795337e74f797fed

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page