A python-facing API for creating and interacting with ZIM files
python-libzim
The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.
It is primarily used in openZIM scrapers like sotoki or youtube2zim.
Installation
pip install libzim
The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.
On other platforms, you need to compile the C++ libzim from source first, then build this package, adjusting LD_LIBRARY_PATH as needed.
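To decide up front whether `pip install libzim` is likely to pull a binary wheel, a small check along these lines may help. It is only a sketch: the `PREBUILT_TARGETS` set and the `has_prebuilt_wheel` helper are hypothetical names, and the set simply restates the platform list given above rather than querying PyPI.

```python
import platform

# Assumption: this set mirrors the statement above that wheels exist only
# for x86_64 macOS and GNU/Linux. It is not read from PyPI, and it does not
# distinguish glibc from musl on Linux.
PREBUILT_TARGETS = {("Linux", "x86_64"), ("Darwin", "x86_64")}


def has_prebuilt_wheel(system=None, machine=None):
    """Return True if (system, machine) is listed in PREBUILT_TARGETS."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return (system, machine) in PREBUILT_TARGETS


if __name__ == "__main__":
    if has_prebuilt_wheel():
        print("pip install libzim should find a binary wheel")
    else:
        print("compile the C++ libzim from source first, then build python-libzim")
```

Passing explicit values, such as `has_prebuilt_wheel("Darwin", "arm64")`, lets you check a target other than the current machine.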
Contributions
git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers
See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!
Usage
Read a ZIM file
from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher
zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))
# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))
# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
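The example above fetches every match in a single `getResults(0, count)` call. For large archives it may be preferable to walk the results in pages. The sketch below assumes only what the calls above suggest, namely that `getResults` takes a start offset and a count. `iter_results` and `fake_get_results` are hypothetical helpers, and the fake stands in for a real search so the snippet runs without a ZIM file.

```python
def iter_results(get_results, total, page_size=20):
    """Yield results page by page from a getResults(start, count)-style callable."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, total, page_size):
        # The last page may be shorter than page_size.
        count = min(page_size, total - start)
        yield from get_results(start, count)


# Stand-in for search.getResults so the sketch runs without an archive.
fake_paths = [f"page/{i}" for i in range(5)]


def fake_get_results(start, count):
    return fake_paths[start:start + count]


for path in iter_results(fake_get_results, len(fake_paths), page_size=2):
    print(path)
```

With a real archive, the call would presumably look like `iter_results(search.getResults, search.getEstimatedMatches())`. As the method name indicates, the match count is an estimate, so the number of results actually yielded may differ.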
Write a ZIM file
from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint
class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        # serve content from a file on disk if one was given, else from the string
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}

content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""
item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")
with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    # metadata names are title-cased: "creator" becomes "Creator", etc.
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
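The metadata loop relies on `str.title()` to turn the lowercase dictionary keys into capitalized names. The following standalone sketch shows which names end up being passed to `add_metadata`. It uses plain Python only and does not touch libzim; the `normalized` dictionary is an illustrative name, not part of the library.

```python
metadata = {
    "creator": "python-libzim",
    "description": "Created in python",
    "name": "my-zim",
    "publisher": "You",
    "title": "Test ZIM",
}

# Same transformation as in the loop above: only the keys are changed.
normalized = {name.title(): value for name, value in metadata.items()}

for name, value in normalized.items():
    print(f"{name}: {value}")
```

Note that `str.title()` capitalizes the first letter of every word, so it only suits single-word lowercase keys like the ones here. Names with other casing would be better written out literally.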
License