A Python-facing API for creating and interacting with ZIM files

python-libzim

The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.

Installation

pip install libzim

The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.

On other platforms, you have to compile the C++ libzim from source first, then build this package against it, adjusting LD_LIBRARY_PATH so the shared library can be found.
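
As a rough illustration of the platform restriction above, the following sketch checks whether the running interpreter is on one of the platforms named as having a prebuilt package. The helper `has_prebuilt_wheel` and the two constant sets are hypothetical names introduced here, not part of libzim, and the check is deliberately coarse: it only looks at the operating system and CPU architecture.

```python
import platform

# Platforms the text above names as having a prebuilt PyPI package.
# "Darwin" is what platform.system() reports on macOS.
PREBUILT_SYSTEMS = {"Linux", "Darwin"}
PREBUILT_MACHINES = {"x86_64"}


def has_prebuilt_wheel(system=None, machine=None):
    """Return True if (system, machine) is a platform with a prebuilt package.

    Hypothetical helper: defaults to the current interpreter's platform.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return system in PREBUILT_SYSTEMS and machine in PREBUILT_MACHINES


if __name__ == "__main__":
    if has_prebuilt_wheel():
        print("pip install libzim should find a prebuilt package")
    else:
        print("expect to compile C++ libzim from source first")
```

A check like this could run at the top of an install script to fail early with a clear message instead of a long compiler error.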

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
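
The example above fetches every result in a single call. For large archives it may be preferable to page through them instead. The sketch below is one possible way to do that; `iter_search_results` is a hypothetical helper, not part of libzim. It assumes, as the call `getResults(0, search_count)` suggests, that the first argument is a start offset and the second a maximum number of results.

```python
def iter_search_results(search, page_size=20):
    """Yield results page by page instead of fetching them all at once.

    `search` is any object exposing getEstimatedMatches() and
    getResults(start, count), as used in the example above.
    Assumption: `start` is an offset and `count` a maximum page length.
    """
    total = search.getEstimatedMatches()
    start = 0
    while start < total:
        page = list(search.getResults(start, page_size))
        if not page:
            # The match count is only an estimate, so stop on an empty page.
            break
        yield from page
        if len(page) < page_size:
            # A short page means there is nothing left to fetch.
            break
        start += page_size


# Possible usage with the objects created above:
#   for path in iter_search_results(searcher.search(query), page_size=10):
#       print(path)
```

Since the suggestion object in the example exposes the same two methods, the same helper should also work for `suggestion_searcher.suggest(...)`.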

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content = "", fpath = None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)
       
    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
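
In the example above, get_mimetype() always returns "text/html", which only suits HTML items. One possible way to handle other file types is to derive the MIME type from the path with Python's standard `mimetypes` module. The helper name `guess_mimetype` and its `default` parameter are assumptions made for this sketch, not part of libzim.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from a path's extension, falling back to `default`.

    Hypothetical helper for use inside an Item's get_mimetype().
    """
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default


# Possible usage inside MyItem (paths such as "home" carry no extension,
# so an HTML default is passed explicitly):
#   def get_mimetype(self):
#       return guess_mimetype(self.fpath or self.path, default="text/html")
```

Guessing from the extension is a heuristic; when the type is known in advance, returning it directly as the original example does is simpler and more reliable.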

License

GPLv3 or later, see LICENSE for more details.
