A python-facing API for creating and interacting with ZIM files

python-libzim

The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.

Installation

pip install libzim

Our PyPI wheels bundle a recent release of the C++ libzim and are available for the following platforms:

  • macOS for x86_64 and arm64
  • GNU/Linux for x86_64, armhf and aarch64
  • Linux+musl for x86_64 and aarch64

Wheels are available for both CPython and PyPy.

Users on other platforms can install the source distribution (see Building below).
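
As a rough illustration of the platform list above, the following sketch checks whether the running interpreter matches one of the listed OS/architecture pairs. The helper name `wheel_likely_available` and the mapping of armhf to the `armv7l` machine string are assumptions made for this example. pip's own tag matching is what actually decides which file gets installed, and this sketch ignores the glibc/musl distinction and the Python implementation.

```python
import platform
import sys

# OS/architecture pairs taken from the list above. Mapping "armhf" to the
# "armv7l" machine string is an assumption of this sketch.
WHEEL_PLATFORMS = {
    "darwin": {"x86_64", "arm64"},
    "linux": {"x86_64", "armv7l", "aarch64"},
}


def wheel_likely_available(system=None, machine=None):
    """Return True if the OS/architecture pair appears in the list above."""
    system = (system or sys.platform).lower()
    machine = (machine or platform.machine()).lower()
    for prefix, machines in WHEEL_PLATFORMS.items():
        if system.startswith(prefix):
            return machine in machines
    return False


# Check the current interpreter's platform.
print(wheel_likely_available())
```

If this returns False, the source distribution route described under Building is the likely fallback.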

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a Pull Request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
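
When a path may be missing from the archive, it can help to check before fetching it. The sketch below wraps the calls from the example above in a hypothetical `read_text` helper. It assumes `Archive.has_entry_by_path()` is available in the installed libzim version, and it relies on duck typing only, so the helper itself needs no import.

```python
def read_text(archive, path, encoding="UTF-8"):
    """Return the decoded content stored at `path`, or None if it is absent.

    `archive` is expected to behave like libzim.reader.Archive.
    """
    # Assumption: has_entry_by_path() exists on the archive object.
    if not archive.has_entry_by_path(path):
        return None
    item = archive.get_entry_by_path(path).get_item()
    # item.content is a buffer (memoryview); copy it to bytes before decoding.
    return bytes(item.content).decode(encoding)
```

With the archive from the example above, `read_text(zim, "home/fr")` would return the HTML of that entry, while an unknown path would yield None.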

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
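
MyItem above always reports text/html. For a ZIM that mixes pages, images and stylesheets, one option is to derive the type from the file name using the standard library. `guess_mimetype` below is a hypothetical helper, not part of libzim; a `get_mimetype()` implementation could return its result.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from a file name, falling back to `default`."""
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default


# An HTML file name resolves to an HTML type; a name without an
# extension (such as the "home" path above) falls back to the default.
print(guess_mimetype("home-fr.html"))
print(guess_mimetype("home"))
```

Since ZIM paths such as "home" often carry no extension, guessing from the source file name (`fpath`) rather than from the ZIM path is probably the safer choice.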

Building

Building the libzim package can be customized via environment variables:

Variable | Example | Use case
LIBZIM_DL_VERSION | 8.1.1 or 2023-04-14 | Specify the C++ libzim binary version to download and bundle: either a release version string or a date, in which case a nightly build is downloaded.
USE_SYSTEM_LIBZIM | 1 | Use LDFLAGS and CFLAGS to find the libzim to link against. The resulting wheel won't bundle C++ libzim.
DONT_DOWNLOAD_LIBZIM | 1 | Disable downloading of C++ libzim. Place headers in include/ and the libzim dylib/so in libzim/ if not using system libzim. It will be bundled in the wheel.
PROFILE | 1 | Enable profile tracing in the Cython extension. Required for Cython code coverage reporting.
SIGN_APPLE | 1 | Sign and notarize the extension for macOS. Requires the APPLE_SIGNING_* variables below.
APPLE_SIGNING_IDENTITY | Developer ID Application: OrgName (ID) | Required for signing on macOS.
APPLE_SIGNING_KEYCHAIN_PATH | /tmp/build.keychain | Path to the Keychain containing the certificate used to sign for macOS.
APPLE_SIGNING_KEYCHAIN_PROFILE | build | Name of the profile in the specified Keychain.

Examples

Default: downloading and bundling the most appropriate libzim release binary

python3 -m build

Using system libzim (brew, debian or manually installed) - not bundled

# using system-installed C++ libzim
brew install libzim  # macOS
apt-get install libzim-dev  # debian
dnf install libzim-devel  # fedora
USE_SYSTEM_LIBZIM=1 python3 -m build --wheel

# using a specific C++ libzim
USE_SYSTEM_LIBZIM=1 \
CFLAGS="-I/usr/local/include" \
LDFLAGS="-L/usr/local/lib" \
DYLD_LIBRARY_PATH="/usr/local/lib" \
LD_LIBRARY_PATH="/usr/local/lib" \
python3 -m build --wheel
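
The same variables can be assembled from a Python build script. This is a minimal sketch, assuming libzim was installed under a single prefix with `include/` and `lib/` subdirectories; `libzim_build_env` is a hypothetical helper name, and it overwrites any CFLAGS/LDFLAGS already present in the environment.

```python
import os


def libzim_build_env(prefix=None, base_env=None):
    """Return an environment for building against a system libzim.

    `prefix` is the install prefix of a specific C++ libzim (for example
    "/usr/local"); when omitted, only USE_SYSTEM_LIBZIM is set.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["USE_SYSTEM_LIBZIM"] = "1"
    if prefix is not None:
        # Note: existing values of these variables are replaced, not extended.
        env["CFLAGS"] = f"-I{prefix}/include"
        env["LDFLAGS"] = f"-L{prefix}/lib"
        env["DYLD_LIBRARY_PATH"] = f"{prefix}/lib"
        env["LD_LIBRARY_PATH"] = f"{prefix}/lib"
    return env


# Possible usage (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "-m", "build", "--wheel"],
#                  env=libzim_build_env("/usr/local"), check=True)
```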

Other platforms

On platforms for which no official binary is available, you have to compile C++ libzim from source first, then use either DONT_DOWNLOAD_LIBZIM or USE_SYSTEM_LIBZIM.

License

GPLv3 or later, see LICENSE for more details.

Download files

Source Distribution

libzim-3.3.0.tar.gz (255.9 kB)

Uploaded Source

Built Distributions

libzim-3.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (8.1 MB)

Uploaded CPython 3.12 manylinux: glibc 2.27+ x86-64 manylinux: glibc 2.28+ x86-64

libzim-3.3.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (7.8 MB)

Uploaded CPython 3.12 manylinux: glibc 2.24+ ARM64 manylinux: glibc 2.28+ ARM64

libzim-3.3.0-cp312-cp312-macosx_13_0_x86_64.whl (6.2 MB)

Uploaded CPython 3.12 macOS 13.0+ x86-64

libzim-3.3.0-cp312-cp312-macosx_12_0_arm64.whl (5.8 MB)

Uploaded CPython 3.12 macOS 12.0+ ARM64

libzim-3.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (8.2 MB)

Uploaded CPython 3.11 manylinux: glibc 2.27+ x86-64 manylinux: glibc 2.28+ x86-64

libzim-3.3.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (7.9 MB)

Uploaded CPython 3.11 manylinux: glibc 2.24+ ARM64 manylinux: glibc 2.28+ ARM64

libzim-3.3.0-cp311-cp311-macosx_13_0_x86_64.whl (6.2 MB)

Uploaded CPython 3.11 macOS 13.0+ x86-64

libzim-3.3.0-cp311-cp311-macosx_12_0_arm64.whl (5.8 MB)

Uploaded CPython 3.11 macOS 12.0+ ARM64

libzim-3.3.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (8.1 MB)

Uploaded CPython 3.10 manylinux: glibc 2.27+ x86-64 manylinux: glibc 2.28+ x86-64

libzim-3.3.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (7.8 MB)

Uploaded CPython 3.10 manylinux: glibc 2.24+ ARM64 manylinux: glibc 2.28+ ARM64

libzim-3.3.0-cp310-cp310-macosx_13_0_x86_64.whl (6.2 MB)

Uploaded CPython 3.10 macOS 13.0+ x86-64

libzim-3.3.0-cp310-cp310-macosx_12_0_arm64.whl (5.8 MB)

Uploaded CPython 3.10 macOS 12.0+ ARM64

libzim-3.3.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (8.1 MB)

Uploaded CPython 3.9 manylinux: glibc 2.27+ x86-64 manylinux: glibc 2.28+ x86-64

libzim-3.3.0-cp39-cp39-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (7.8 MB)

Uploaded CPython 3.9 manylinux: glibc 2.24+ ARM64 manylinux: glibc 2.28+ ARM64

libzim-3.3.0-cp39-cp39-macosx_13_0_x86_64.whl (6.2 MB)

Uploaded CPython 3.9 macOS 13.0+ x86-64

libzim-3.3.0-cp39-cp39-macosx_12_0_arm64.whl (5.8 MB)

Uploaded CPython 3.9 macOS 12.0+ ARM64

libzim-3.3.0-cp38-cp38-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (8.1 MB)

Uploaded CPython 3.8 manylinux: glibc 2.27+ x86-64 manylinux: glibc 2.28+ x86-64

libzim-3.3.0-cp38-cp38-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (7.8 MB)

Uploaded CPython 3.8 manylinux: glibc 2.24+ ARM64 manylinux: glibc 2.28+ ARM64

libzim-3.3.0-cp38-cp38-macosx_13_0_x86_64.whl (6.2 MB)

Uploaded CPython 3.8 macOS 13.0+ x86-64

libzim-3.3.0-cp38-cp38-macosx_12_0_arm64.whl (5.8 MB)

Uploaded CPython 3.8 macOS 12.0+ ARM64
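
The file names above follow the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). The sketch below splits such a name into its parts; `parse_wheel_filename` is a hypothetical helper written for illustration and handles only the simple case where name and version contain no hyphens.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its tag components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag
    # between version and python tag.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        # Several platform tags may be joined with dots in a single name.
        "platforms": platform_tag.split("."),
    }


info = parse_wheel_filename(
    "libzim-3.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl"
)
print(info["python"], info["platforms"])
```

For real tooling, the `packaging` library's wheel file name parser is likely a better fit than a hand-written split.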

File details

Details for the file libzim-3.3.0.tar.gz.

File metadata

  • Download URL: libzim-3.3.0.tar.gz
  • Upload date:
  • Size: 255.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for libzim-3.3.0.tar.gz
Algorithm Hash digest
SHA256 5e1a923d9b402b18a601a7683416a189ae15600f523cff558ccde040700cd337
MD5 6aca89fb91c0780829c096c1b1b95e9b
BLAKE2b-256 65e29a1821e1cfb653d095e7651e8f876e5c8bb7c5f3cbe0f43eb1a7970acbad

See more details on using hashes here.
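
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; `sha256_of` is a hypothetical helper name, and the local file path in the usage comment is an assumption.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are never fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed above for libzim-3.3.0.tar.gz.
EXPECTED = "5e1a923d9b402b18a601a7683416a189ae15600f523cff558ccde040700cd337"

# Possible usage, assuming the archive sits in the current directory:
#   if sha256_of("libzim-3.3.0.tar.gz") != EXPECTED:
#       raise SystemExit("checksum mismatch")
```

pip can perform the same check itself when a requirements file pins hashes and is installed with `--require-hashes`.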

File details

Details for the file libzim-3.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 a9eb28c2baf28cee7b416ccdb727af44ec07bfc40323adabb852e1b01385f4c5
MD5 61de99eb97be21795644d26eb2dd5016
BLAKE2b-256 2f5c0cbfadbdd5e3a052b40f42813a26817439c4373d020de8e591d46cd86578

File details

Details for the file libzim-3.3.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for libzim-3.3.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 f9209b5b2a876465565f1ea8303d702bc8a658eb4c9e1a1c3b9228b02442d814
MD5 74d16bb6631456e8f406de060ae7d37d
BLAKE2b-256 a2bd118a2e860142f3fb135ea92b9582e8d95407c479a0fa0dc26ba5957a9745

File details

Details for the file libzim-3.3.0-cp312-cp312-macosx_13_0_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp312-cp312-macosx_13_0_x86_64.whl
Algorithm Hash digest
SHA256 ad4e6951f61774501a055d65974b7c3ec00f39f5204118322f3ae5098b95627f
MD5 294f63fa02b05f63a5378f3799a75677
BLAKE2b-256 c6259542326e2ae94c538205e20c6e088daf262077deef2bf84457370aac9275

File details

Details for the file libzim-3.3.0-cp312-cp312-macosx_12_0_arm64.whl.

File hashes

Hashes for libzim-3.3.0-cp312-cp312-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 34e42e896307f5289e6d65e946c0cfe8094c74c15933c707e8935d9b903c7fbf
MD5 e7d6c958261439778fd6cc52bb7e2348
BLAKE2b-256 900ad2a28ef147179669b5dcaf35799993a15feb83275292e6d9533ba0988bef

File details

Details for the file libzim-3.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 c47922df765801068371b25954f752ff7243ba803c1e26141901fb4f033357ad
MD5 0a1d60d441ee46d1d512815aa0ffa2be
BLAKE2b-256 ebcecd3d0ef9ca2e70af775ec6a296794c01db8f9c57f4b2a8312acfd3a2ef96

File details

Details for the file libzim-3.3.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for libzim-3.3.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 151cc5bc71f38b97f54b3dfd951a93eb4a89ca7a374a1c0b296f2ba06346a987
MD5 cdca9f47f2982b70c23af8a5ff8a9572
BLAKE2b-256 bae11fdf2de582348da396d03954f9e7e0309157ae67e5df2b274974fbd3a42a

File details

Details for the file libzim-3.3.0-cp311-cp311-macosx_13_0_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp311-cp311-macosx_13_0_x86_64.whl
Algorithm Hash digest
SHA256 db9cfff3d8145d5361b241fdf2187f38152a2d1e45bbd9c6039170245c09c2f6
MD5 9dd3d60078a0d08eb96ec7a8a31cb371
BLAKE2b-256 84d50af7aea07c39055e2799ee7e4e7f5626111677c9a9306a73327a319943b2

File details

Details for the file libzim-3.3.0-cp311-cp311-macosx_12_0_arm64.whl.

File hashes

Hashes for libzim-3.3.0-cp311-cp311-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 5254422bc6664577a17a49833493318d6c9eee8b507d3d268b2d727d6b555ae9
MD5 753ff48d7e92ad100587f6138f7ff249
BLAKE2b-256 ae2789bfd31a38f7a1998183305507690d209d63286947935742c200c81163aa

File details

Details for the file libzim-3.3.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 a9b2f6809f6107672402e476a5ba4a96df4f413ee5441158e6b4d72baa63ab41
MD5 882bb8f6b6985328e155d60ec87b09d7
BLAKE2b-256 d394c57939d54c7b057cd46c1a7e1f4a17fb90a620e4472d46bd0115ecbb2cb0

File details

Details for the file libzim-3.3.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for libzim-3.3.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 81b00d9adb757e0e7b530a5de871432e905526870c13d6abad99dde599a41f3f
MD5 2fd3b725aea81bbfc530772334d63026
BLAKE2b-256 e0482f8b77e154da902d5ddf7bb95fd6059ee45d32ade67840fe584e0632f0cc

File details

Details for the file libzim-3.3.0-cp310-cp310-macosx_13_0_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp310-cp310-macosx_13_0_x86_64.whl
Algorithm Hash digest
SHA256 e7f0bb6b0dfc33fd8e42cc89bd067ff2010cf2d3b028cba2589c649b8ba79d9b
MD5 ca5b3961735a54721c175468dae17b90
BLAKE2b-256 7d55aee4bdd87f804e2c5070545051798decddef9643df1db3a69c89ef91bcc6

File details

Details for the file libzim-3.3.0-cp310-cp310-macosx_12_0_arm64.whl.

File hashes

Hashes for libzim-3.3.0-cp310-cp310-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 f7fe71bd1aaf4595635c9a806e838271cec4fbc7e3d6dce7eb5e87c189357a59
MD5 1f1b67d6305e1212c882f0239cda14ca
BLAKE2b-256 63078b0dbd7765c9839925aa29f0c9548126215c7348a23d07b91f45d2251e65

File details

Details for the file libzim-3.3.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 8a831b23d2701f9eb9fd5f1f4394e907947338f56caa5ceec76e6b720e812ac4
MD5 5e52c3260a4f5f3dd1c16812341166bb
BLAKE2b-256 1fea230a8f7da7138ca5d46cea04f9b5d4bffb003675b1bb544389cce25e1d86

File details

Details for the file libzim-3.3.0-cp39-cp39-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for libzim-3.3.0-cp39-cp39-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 12e80635950a9ccfa750e3d01e26ac973ea7b22573c4e396573d0dde421e4b90
MD5 113b811a22dae057fb2944b4f538ff56
BLAKE2b-256 2c1c771878916668ad0028c6a2ae035793dcaf5c69455d7847001e8eda12cf35

File details

Details for the file libzim-3.3.0-cp39-cp39-macosx_13_0_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp39-cp39-macosx_13_0_x86_64.whl
Algorithm Hash digest
SHA256 79308942cfd7990d971f24bb40ec957ccf4d52f0d088eb929121aa066787f6ce
MD5 7a47eb313679648e8ffe376fbec19172
BLAKE2b-256 a2f40abd3ced509efe5652e8b64a516455d3f0d053796df80d6c48fe5a4efef3

File details

Details for the file libzim-3.3.0-cp39-cp39-macosx_12_0_arm64.whl.

File hashes

Hashes for libzim-3.3.0-cp39-cp39-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 92b52e4b3fba1e9b9bd388a106fd4652593b48cba4d123f9882604151a86b934
MD5 eb6a646ee9a5153a47c61241ba8d10c8
BLAKE2b-256 3828300177787fc1a52d049451c28ef759ac37e41d8f649113b4896c72852775

File details

Details for the file libzim-3.3.0-cp38-cp38-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp38-cp38-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 0ec8332b265941469f2b3110d0acc2d73a6c8ee3c84937e34593ad45dcbcd2b8
MD5 819cc7a315adfd7b29d0b50b1d0ac50d
BLAKE2b-256 8490a02372648ca8f7dec706e4ddf6a535f54f50d4352aba6fc79bcf467424da

File details

Details for the file libzim-3.3.0-cp38-cp38-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for libzim-3.3.0-cp38-cp38-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 c65ab47a5777f1945a430556ce880e37136298201af2c74077009348cce959b9
MD5 211a09f9b731da8a8ff8b8dfde10eb4b
BLAKE2b-256 ff4843e816c0b669cb7601a92130693b46e01678b49e53f91893f584e4487f73

File details

Details for the file libzim-3.3.0-cp38-cp38-macosx_13_0_x86_64.whl.

File hashes

Hashes for libzim-3.3.0-cp38-cp38-macosx_13_0_x86_64.whl
Algorithm Hash digest
SHA256 7fd058ebb9a7e7890ede846cdc110700a161b13e7ff67053b4dac90ba709e7f0
MD5 d5e1502e8a4d265b946c77480512ca11
BLAKE2b-256 cb4af6710a540c5a738d0fc0e5fb91636ce232ec44c4ee34400cf31ffdfa0951

File details

Details for the file libzim-3.3.0-cp38-cp38-macosx_12_0_arm64.whl.

File hashes

Hashes for libzim-3.3.0-cp38-cp38-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 64bac298d40202dbc98220efa7c72734658e22d6ad822f31e4506eee49d27b8c
MD5 eed0f0ea6b7b836d61d7e4ed4ff7559d
BLAKE2b-256 a2aced1cd223f2a7759a735b20293ff6b479dccbe57ea2be84daaaddc6223d01
