
A Python-facing API for creating and interacting with ZIM files


python-libzim

The libzim module lets you read and write ZIM files in Python. It provides a thin Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.


Installation

pip install libzim

The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.

On other platforms, you first have to compile the C++ libzim from source, then build this package against it, adjusting LD_LIBRARY_PATH so that the shared library can be found.
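
After either route, a quick check that the module is visible to the interpreter can save some debugging. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of libzim.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be found by the import system."""
    # find_spec() returns None when no top-level module of that name exists
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("libzim"):
        print("libzim is importable")
    else:
        print("libzim not found: run `pip install libzim` or build it from source")
```

Note that this only tells you the package was found. A source build whose shared library is not on LD_LIBRARY_PATH may still fail when it is actually imported.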

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

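# open the archive (here, the file created in "Write a ZIM file" below)
zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
# item content is a buffer: copy it to bytes before decoding
print(bytes(entry.get_item().content).decode("UTF-8"))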

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
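
The snippet above fetches all results in one call. As a sketch of how the same calls could be wrapped for reuse, the two helpers below rely only on the methods shown above (`get_entry_by_path`, `get_item`, `content`, `getEstimatedMatches`, `getResults`). The names `read_text` and `iter_results` and the `page_size` parameter are hypothetical, not part of libzim.

```python
def read_text(archive, path, encoding="UTF-8"):
    """Return the content of the entry at `path`, decoded as text.

    `archive` is assumed to behave like libzim.reader.Archive.
    """
    item = archive.get_entry_by_path(path).get_item()
    # item.content is a buffer: copy it to bytes before decoding
    return bytes(item.content).decode(encoding)


def iter_results(search, page_size=20):
    """Yield the results of `search` page by page.

    `search` is assumed to behave like the object returned by
    Searcher.search() or SuggestionSearcher.suggest().
    """
    # the match count is an estimate, so treat it as an upper bound only
    total = search.getEstimatedMatches()
    for start in range(0, total, page_size):
        # getResults(start, count) yields at most `count` results
        yield from search.getResults(start, page_size)
```

With the objects from the example above, these would be called as `read_text(zim, "home/fr")` and `list(iter_results(search))`. Paging keeps each request small when an archive matches many entries.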

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        # serve content from a file on disk if one was given, else from the string
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

# enable full-text indexing (language "eng") and write to test.zim
with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
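
MyItem.get_mimetype() above always returns "text/html", which suits this example but not file-backed items of other types. One possible approach, sketched here with the standard library only, is to derive the type from the file name. The helper `guess_mimetype` and its `default` fallback are hypothetical, not part of libzim.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from a file name, falling back to `default`."""
    # guess_type() returns (type, encoding); type is None when unknown
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default
```

An Item subclass could then return `guess_mimetype(self.fpath)` from get_mimetype() whenever fpath is set, and keep "text/html" for string content.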

License

GPLv3 or later, see LICENSE for more details.
