A Python-facing API for creating and interacting with ZIM files

python-libzim

The libzim module lets you read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.

Installation

pip install libzim

The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim. For release 1.1.0, wheels are published for CPython 3.6 through 3.10.

On other platforms, you first have to compile the C++ libzim from source, then build this package against it, adjusting LD_LIBRARY_PATH so the library can be found.
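
As a rough illustration of the platform note above, the following sketch guesses whether pip is likely to find a prebuilt wheel. The helper name wheel_likely_available and the SUPPORTED set are hypothetical and only encode the "x86_64 macOS and GNU/Linux" statement from this README; the check does not distinguish GNU/Linux from other Linux variants.

```python
import platform

# Assumption taken from the README text: prebuilt wheels exist only for
# x86_64 macOS ("Darwin") and x86_64 GNU/Linux ("Linux").
SUPPORTED = {("Linux", "x86_64"), ("Darwin", "x86_64")}


def wheel_likely_available(system=None, machine=None):
    """Return True if a prebuilt wheel is expected for this OS/CPU pair."""
    # Fall back to the running interpreter's platform when no value is given.
    system = system or platform.system()
    machine = machine or platform.machine()
    return (system, machine) in SUPPORTED


if __name__ == "__main__":
    if wheel_likely_available():
        print("pip install libzim should find a prebuilt wheel")
    else:
        print("expect to build the C++ libzim from source first")
```

Passing explicit values, for example wheel_likely_available("Darwin", "arm64"), lets you check a target machine other than the one running the script.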

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
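
The example above fetches every match in a single getResults call. For larger archives it may be preferable to read results page by page. The sketch below is one possible way to do that, using only the calls shown above (get_entry_by_path, get_item, content, getEstimatedMatches, getResults). The helper names read_entry_text and iter_results are hypothetical, and the sketch assumes that the second argument of getResults is a result count, as the example above suggests. Since the match count is an estimate, the number of results actually yielded may differ from it.

```python
def read_entry_text(archive, path, encoding="utf-8"):
    """Return the decoded content of the entry stored at `path`."""
    item = archive.get_entry_by_path(path).get_item()
    # item.content is a buffer-like object, so convert it to bytes first.
    return bytes(item.content).decode(encoding)


def iter_results(search, page_size=50):
    """Yield every result of a search or suggestion, one page at a time."""
    total = search.getEstimatedMatches()
    for start in range(0, total, page_size):
        # Request at most page_size results, fewer on the last page.
        count = min(page_size, total - start)
        yield from search.getResults(start, count)
```

With the objects from the example above, this could be used as read_entry_text(zim, "home/fr") or as a loop over iter_results(searcher.search(query), page_size=20). The helpers rely on duck typing, so they work on both the search and the suggestion objects.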

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)
    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
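
The loop above relies on str.title() to turn lowercase keys such as "creator" into the capitalised names passed to add_metadata. A minimal sketch of the same idea as reusable helpers follows. The names titled_metadata and add_all_metadata are hypothetical, and the creator argument is only assumed to expose add_metadata(name, value) as used above.

```python
def titled_metadata(fields):
    """Map lowercase keys to the capitalised names used in the loop above."""
    return {name.title(): value for name, value in fields.items()}


def add_all_metadata(creator, fields):
    """Call creator.add_metadata() once per field and return the names used."""
    metadata = titled_metadata(fields)
    for name, value in metadata.items():
        creator.add_metadata(name, value)
    return list(metadata)
```

Note that str.title() capitalises the first letter of every word, so a multi-word key such as "long_description" becomes "Long_Description". If a different spelling is needed, pass the exact name to add_metadata instead.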

License

GPLv3 or later, see LICENSE for more details.
