A Python-facing API for creating and interacting with ZIM files

Project description

python-libzim

The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.

Installation

pip install libzim

The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.

On other platforms, you need to compile the C++ libzim from source first, then build this package against it, adjusting LD_LIBRARY_PATH so the shared library can be found.
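
Before running the examples below, it can help to confirm that the package is discoverable in the current environment. The following is a minimal sketch using only the standard library; `libzim_available` is a hypothetical helper name, not part of python-libzim.

```python
import importlib.util


def libzim_available(module_name="libzim"):
    """Return True if the named module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if libzim_available():
    print("libzim is installed")
else:
    print("libzim not found: try `pip install libzim` or build from source")
```

This only checks that the package can be found. A missing or mismatched C++ shared library would still surface as an error at import time.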

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
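
The example above assumes that `home/fr` exists and fetches every result in a single call. A slightly more defensive variant might look like the sketch below. It assumes `Archive.has_entry_by_path` is available in your libzim version, and it reuses the `getEstimatedMatches` and `getResults` calls shown above. `read_text` and `iter_results` are hypothetical helper names. Both take the archive or search object as an argument, so the block itself does not import libzim.

```python
def read_text(archive, path, encoding="UTF-8"):
    """Return the decoded content at `path`, or None if the entry is missing."""
    # Assumption: Archive exposes has_entry_by_path(); check your version.
    if not archive.has_entry_by_path(path):
        return None
    item = archive.get_entry_by_path(path).get_item()
    # item.content is a buffer, so copy it to bytes before decoding.
    return bytes(item.content).decode(encoding)


def iter_results(search, page_size=20):
    """Yield results page by page instead of in one large call."""
    total = search.getEstimatedMatches()
    for start in range(0, total, page_size):
        # getResults(start, count) is called the same way as in the example.
        yield from search.getResults(start, page_size)


# Usage with the objects created in the example above:
#   print(read_text(zim, "home/fr"))
#   for path in iter_results(search):
#       print(path)
```

Because the match count is an estimate, the number of results actually yielded may differ slightly from `getEstimatedMatches()`.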

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    # str.title() capitalizes each key, e.g. "creator" becomes "Creator"
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
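
`MyItem.get_mimetype` above always returns `text/html`, which is fine for this two-page example. If an archive mixes HTML with images or stylesheets, one option is to derive the type from the entry path. The helper below is a sketch using the standard-library `mimetypes` module; `guess_mimetype` and its default value are illustrative choices, not part of python-libzim.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from a path's extension, falling back to `default`."""
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default


for path in ("home-fr.html", "logo.png", "home"):
    print(path, guess_mimetype(path))
```

Inside an `Item` subclass, `get_mimetype` could then return `guess_mimetype(self.fpath or self.path)`. Extensionless paths such as `home` fall back to the default, so an explicit type is still needed for them.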

License

GPLv3 or later, see LICENSE for more details.

Download files

Source Distribution

libzim-1.1.1.tar.gz (8.3 MB)

Built Distributions

libzim-1.1.1-cp310-cp310-manylinux1_x86_64.whl (9.5 MB): CPython 3.10
libzim-1.1.1-cp310-cp310-macosx_10_9_x86_64.whl (7.9 MB): CPython 3.10, macOS 10.9+ x86-64
libzim-1.1.1-cp39-cp39-manylinux1_x86_64.whl (9.5 MB): CPython 3.9
libzim-1.1.1-cp39-cp39-macosx_10_9_x86_64.whl (7.9 MB): CPython 3.9, macOS 10.9+ x86-64
libzim-1.1.1-cp38-cp38-manylinux1_x86_64.whl (9.5 MB): CPython 3.8
libzim-1.1.1-cp38-cp38-macosx_10_9_x86_64.whl (7.9 MB): CPython 3.8, macOS 10.9+ x86-64
libzim-1.1.1-cp37-cp37m-manylinux1_x86_64.whl (9.4 MB): CPython 3.7m
libzim-1.1.1-cp37-cp37m-macosx_10_9_x86_64.whl (7.9 MB): CPython 3.7m, macOS 10.9+ x86-64
libzim-1.1.1-cp36-cp36m-manylinux1_x86_64.whl (9.4 MB): CPython 3.6m
libzim-1.1.1-cp36-cp36m-macosx_10_9_x86_64.whl (7.9 MB): CPython 3.6m, macOS 10.9+ x86-64

