A Python-facing API for creating and interacting with ZIM files

Project description

python-libzim

The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.

Installation

pip install libzim

Our PyPI wheels bundle a recent release of the C++ libzim and are available for the following platforms:

  • macOS for x86_64 and arm64
  • GNU/Linux for x86_64, armhf and aarch64
  • Linux+musl for x86_64 and aarch64

Wheels are available for both CPython and PyPy.

Users on other platforms can install the source distribution (see Building below).
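
After installing, one way to confirm that the package is visible to the current interpreter is to ask the standard library for the installed distribution version. This is a minimal sketch: the helper name installed_libzim_version is hypothetical and not part of libzim; only importlib.metadata from the standard library (Python 3.8+) is used.

```python
from importlib import metadata
from typing import Optional


def installed_libzim_version() -> Optional[str]:
    """Return the version of the installed libzim distribution, or None."""
    try:
        # Looks up the distribution installed by `pip install libzim`.
        return metadata.version("libzim")
    except metadata.PackageNotFoundError:
        # No distribution named "libzim" in this environment.
        return None


if __name__ == "__main__":
    version = installed_libzim_version()
    print(f"libzim {version}" if version else "libzim is not installed")
```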

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a Pull Request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
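
The example above fetches every match in a single getResults call. For large result sets it may be preferable to walk the results in pages. The sketch below relies only on the two calls shown above, getEstimatedMatches() and getResults(start, count); the helper name iter_results and its page_size parameter are hypothetical and not part of libzim.

```python
from typing import Any, Iterator


def iter_results(search: Any, page_size: int = 50) -> Iterator[Any]:
    """Yield search (or suggestion) results one page at a time.

    `search` is any object exposing getEstimatedMatches() and
    getResults(start, count), as in the example above.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    total = search.getEstimatedMatches()
    for start in range(0, total, page_size):
        # The last page may be shorter than page_size.
        count = min(page_size, total - start)
        yield from search.getResults(start, count)
```

With the objects from the example above, this could be used as `for result in iter_results(searcher.search(query), page_size=20): print(result)`. Since the match count is an estimate, the number of results actually yielded may differ from it.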

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        # metadata names are title-cased: "creator" becomes "Creator"
        creator.add_metadata(name.title(), value)
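
MyItem.get_mimetype() above always returns "text/html", which suits the two HTML items in the example. When items come from files of mixed types, one option is to derive the MIME type from the file name with the standard library's mimetypes module. This is a sketch only: guess_mimetype and its default fallback value are hypothetical and not part of libzim.

```python
import mimetypes


def guess_mimetype(path: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from a file name, falling back to `default`."""
    # guess_type returns a (type, encoding) pair; type is None when unknown.
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default
```

An Item subclass could then return `guess_mimetype(self.fpath)` from get_mimetype() when fpath is set, and keep "text/html" for inline content.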

Building

Building the libzim package can be customized through the following environment variables:

Variable | Example | Use case
LIBZIM_DL_VERSION | 8.1.1 or 2023-04-14 | Specify the C++ libzim binary version to download and bundle: either a release version string or a date, in which case a nightly build is downloaded.
USE_SYSTEM_LIBZIM | 1 | Use LDFLAGS and CFLAGS to find the libzim to link against. The resulting wheel won't bundle C++ libzim.
DONT_DOWNLOAD_LIBZIM | 1 | Disable downloading of C++ libzim. Place headers in include/ and the libzim dylib/so in libzim/ if not using system libzim. It will be bundled in the wheel.
PROFILE | 1 | Enable profile tracing in the Cython extension. Required for Cython code coverage reporting.
SIGN_APPLE | 1 | Sign and notarize the extension for macOS. Requires the three APPLE_SIGNING_* variables below.
APPLE_SIGNING_IDENTITY | Developer ID Application: OrgName (ID) | Required for signing on macOS.
APPLE_SIGNING_KEYCHAIN_PATH | /tmp/build.keychain | Path to the Keychain containing the certificate used to sign for macOS.
APPLE_SIGNING_KEYCHAIN_PROFILE | build | Name of the profile in the specified Keychain.

Examples

Default: downloading and bundling most appropriate libzim release binary
python3 -m build

Using system libzim (brew, debian or manually installed) - not bundled

# using system-installed C++ libzim
brew install libzim  # macOS
apt-get install libzim-dev  # debian
dnf install libzim-devel  # fedora
USE_SYSTEM_LIBZIM=1 python3 -m build --wheel

# using a specific C++ libzim
USE_SYSTEM_LIBZIM=1 \
CFLAGS="-I/usr/local/include" \
LDFLAGS="-L/usr/local/lib" \
DYLD_LIBRARY_PATH="/usr/local/lib" \
LD_LIBRARY_PATH="/usr/local/lib" \
python3 -m build --wheel
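
The same builds can be driven from a Python script by passing the variables through the process environment. The sketch below only assembles that environment; the helper name libzim_build_env and its parameters are hypothetical, while the variable names come from the table above.

```python
import os
from typing import Dict, Optional


def libzim_build_env(
    use_system: bool = False,
    dont_download: bool = False,
    dl_version: Optional[str] = None,
    base: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with libzim build variables set."""
    # Start from the current environment unless a base mapping is given.
    env = dict(os.environ if base is None else base)
    if use_system:
        env["USE_SYSTEM_LIBZIM"] = "1"
    if dont_download:
        env["DONT_DOWNLOAD_LIBZIM"] = "1"
    if dl_version is not None:
        env["LIBZIM_DL_VERSION"] = dl_version
    return env


# Possible use (not executed here, as it would start a real build):
#   import subprocess, sys
#   subprocess.run(
#       [sys.executable, "-m", "build", "--wheel"],
#       env=libzim_build_env(use_system=True),
#       check=True,
#   )
```

The helper does not validate combinations of options; which combinations make sense is described in the table above.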

Other platforms

On platforms for which no official binary is available, you have to compile C++ libzim from source first, then build with either DONT_DOWNLOAD_LIBZIM or USE_SYSTEM_LIBZIM.

License

GPLv3 or later, see LICENSE for more details.
