A Python-facing API for creating and interacting with ZIM files

python-libzim

The libzim module lets you read and write ZIM files in Python. It provides a thin Python interface on top of the C++ libzim library.

It is primarily used in openZIM scrapers like sotoki or youtube2zim.

Installation

pip install libzim

The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.

On other platforms, you have to compile the C++ libzim from source first, then build this package against it, adjusting LD_LIBRARY_PATH so the compiled library can be found.
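As a rough illustration of the platform note above, the following sketch checks whether the current machine matches the platforms named for the PyPI package (x86_64 macOS and GNU/Linux). The function name `has_prebuilt_wheel` is hypothetical and not part of libzim; the check only mirrors the statement in this README and does not query PyPI.

```python
import platform


def has_prebuilt_wheel(system: str, machine: str) -> bool:
    """Return True if (system, machine) matches the platforms this README lists.

    Hypothetical helper: it encodes "x86_64 macOS and GNU/Linux only",
    nothing more.
    """
    return system in ("Linux", "Darwin") and machine == "x86_64"


if __name__ == "__main__":
    # platform.system() reports "Darwin" on macOS and "Linux" on GNU/Linux.
    if has_prebuilt_wheel(platform.system(), platform.machine()):
        print("pip install libzim should pick up a prebuilt package")
    else:
        print("expect to compile the C++ libzim and build from source")
```

If the check fails, the source build described above is the path to follow.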

Contributions

git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers

See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!

Usage

Read a ZIM file

from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher

# test.zim is the file produced in "Write a ZIM file" below
zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))

# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))

# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
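The reading example above fetches every result in one call and decodes content inline. The sketch below wraps those two steps in small helpers. The names `entry_text` and `iter_results` and the `page_size` parameter are hypothetical, not part of libzim. The helpers only call methods that appear in the example above (`get_entry_by_path`, `get_item`, `content`, `getEstimatedMatches`, `getResults`), and they assume the second argument of `getResults` is a count, as the example suggests.

```python
def entry_text(archive, path, encoding="UTF-8"):
    """Return the decoded content of the entry stored at `path`."""
    item = archive.get_entry_by_path(path).get_item()
    # The example above wraps content in bytes() before decoding; do the same.
    return bytes(item.content).decode(encoding)


def iter_results(search, page_size=20):
    """Yield results page by page instead of requesting them all at once.

    Works for any object exposing getEstimatedMatches() and
    getResults(start, count), as the search and suggestion objects above do.
    """
    total = search.getEstimatedMatches()
    for start in range(0, total, page_size):
        # The last page may be shorter than page_size.
        count = min(page_size, total - start)
        yield from search.getResults(start, count)
```

With the objects from the example above, usage would look like `print(entry_text(zim, "home/fr"))` and `for result in iter_results(search): print(result)`. Because the match count is an estimate, the number of results actually yielded may differ from it.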

Write a ZIM file

from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint


class MyItem(Item):
    """An HTML item whose content comes from a string or from a file."""

    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        # Read from the file when fpath is set, otherwise from the string
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        # Flag the item as a front article
        return {Hint.FRONT_ARTICLE: True}


content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

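# item takes its content from the string above
item = MyItem("Hello Kiwix", "home", content)
# item2 reads its content from home-fr.html, which must exist on disk
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")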

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
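The metadata loop above relies on `str.title()` to turn the lowercase dictionary keys into the names passed to `add_metadata`. The standalone sketch below shows only that transformation, using the same dictionary; it does not touch libzim. The variable names `metadata` and `titled` are chosen for illustration.

```python
# Same key/value pairs as in the writer example above.
metadata = {
    "creator": "python-libzim",
    "description": "Created in python",
    "name": "my-zim",
    "publisher": "You",
    "title": "Test ZIM",
}

# str.title() capitalizes the first letter of each word in the key,
# which is what the loop passes to creator.add_metadata().
titled = {name.title(): value for name, value in metadata.items()}

for name, value in titled.items():
    print(f"{name}: {value}")
```

Each single-word key simply gets an initial capital, so `"creator"` is stored as `"Creator"`.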

License

GPLv3 or later, see LICENSE for more details.
