
A Python-facing API for creating and interacting with ZIM files


python-libzim

The libzim module lets you read and write ZIM files in Python. It provides a thin Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.


Installation

pip install libzim

The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.

On other platforms, you have to compile the C++ libzim from source first, then build this package against it, adjusting LD_LIBRARY_PATH so the shared library can be found.
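As a rough illustration of the platform constraint above, the following sketch checks whether the current machine matches the platforms the PyPI package is stated to cover. The helper name `prebuilt_wheel_expected` is hypothetical and not part of libzim; it only mirrors the statement above and does not query PyPI.

```python
import platform


def prebuilt_wheel_expected(system=None, machine=None):
    """Heuristic: True on x86_64 GNU/Linux or x86_64 macOS.

    Hypothetical helper, not part of libzim. It only encodes the
    platforms named in this README and does not contact PyPI.
    """
    system = system or platform.system()      # e.g. "Linux", "Darwin"
    machine = machine or platform.machine()   # e.g. "x86_64", "arm64"
    return system in ("Linux", "Darwin") and machine == "x86_64"


if not prebuilt_wheel_expected():
    print("No prebuilt package expected here; build C++ libzim from source.")
```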

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a Pull Request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
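The example above loads every result at once with `getResults(0, count)`. For large archives it may be preferable to read results in pages. The sketch below assumes only the two methods used above (`getEstimatedMatches` and `getResults(start, count)`); the helper name `iter_results` and its `page_size` parameter are hypothetical and not part of libzim.

```python
def iter_results(search, page_size=20):
    """Yield results from a search or suggestion object, page by page.

    Hypothetical helper: `search` is anything exposing
    getEstimatedMatches() and getResults(start, count), as used above.
    """
    total = search.getEstimatedMatches()
    start = 0
    while start < total:
        page = list(search.getResults(start, page_size))
        if not page:
            # The match count is an estimate; stop if it was too high.
            break
        yield from page
        start += page_size


# Usage with the objects created above (requires libzim and test.zim):
#     for path in iter_results(searcher.search(query), page_size=10):
#         print(path)
```

Because the helper relies only on those two methods, the same function should also work for the suggestion object returned by `suggestion_searcher.suggest(...)`.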

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
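`MyItem.get_mimetype` above always returns "text/html", which is only right for HTML entries. If you add other kinds of files, one option is to derive the type from the entry path with the standard library. This is a minimal sketch; the helper name `guess_mimetype` and the fallback value are assumptions, not part of libzim.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from a path's extension, with a fallback.

    Hypothetical helper: it could be called from get_mimetype() in an
    Item subclass instead of returning a hard-coded "text/html".
    """
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default


print(guess_mimetype("home-fr.html"))
```

In `MyItem`, this would amount to returning `guess_mimetype(self.fpath or self.path)` from `get_mimetype`. Note that the paths used above ("home", "home/fr") have no extension, so they would fall back to the default unless the file path is used.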

License

GPLv3 or later, see LICENSE for more details.


