A python-facing API for creating and interacting with ZIM files
python-libzim
The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.
It is primarily used in openZIM scrapers like sotoki or youtube2zim.
Installation
pip install libzim
The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.
On other platforms, you have to compile the C++ libzim from source first, then build this package, adjusting LD_LIBRARY_PATH.
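After installing (from PyPI or from source), it can be useful to confirm that the module actually loads, since a misconfigured LD_LIBRARY_PATH typically surfaces as an import error. The following is a minimal sketch using only the standard library; the helper name `can_import` is hypothetical and not part of libzim.

```python
import importlib


def can_import(module_name="libzim"):
    """Try to import a module; return (ok, error_message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Covers both a missing package and a compiled extension that
        # cannot find the shared library it was linked against.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    ok, error = can_import("libzim")
    print("libzim importable:", ok, error)
```

If the check fails on a source build, the error message usually names the shared library that could not be found.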
Contributions
git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers
See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!
Usage
Read a ZIM file
from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher
zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))
# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))
# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
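The reading example above fetches every result at once and assumes the requested path exists. As a rough sketch, the two hypothetical helpers below (not part of libzim) wrap the same calls shown above: `read_text` returns `None` for a missing path, and `iter_results` walks results page by page. It is an assumption here that `get_entry_by_path` raises `KeyError` for an unknown path; adjust the exception if your version behaves differently.

```python
def read_text(archive, path, encoding="UTF-8"):
    """Return the decoded content of the entry at `path`, or None if absent."""
    try:
        entry = archive.get_entry_by_path(path)
    except KeyError:  # assumption: a missing path raises KeyError
        return None
    return bytes(entry.get_item().content).decode(encoding)


def iter_results(search, page_size=20):
    """Yield results page by page instead of requesting them all at once."""
    total = search.getEstimatedMatches()
    for start in range(0, total, page_size):
        page = list(search.getResults(start, page_size))
        if not page:  # the match count is only an estimate
            break
        yield from page
```

With the objects from the example above, `read_text(zim, "home/fr")` would return the page as a string, and `for result in iter_results(search): print(result)` would print each result in turn. The same `iter_results` helper should also work with the object returned by `suggestion_searcher.suggest(...)`, since it exposes the same two methods.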
Write a ZIM file
from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint
class MyItem(Item):
    def __init__(self, title, path, content="", fpath=None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}

content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""
item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")
with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
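In the writing example, `get_mimetype` always returns "text/html", which only suits HTML items. One possible way to handle other file types is to derive the MIME type from the item's path with the standard `mimetypes` module. The helper `guess_mimetype` below is a hypothetical illustration, not part of libzim.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from a path's extension, falling back to `default`."""
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default
```

Inside an `Item` subclass such as `MyItem`, `get_mimetype` could then return `guess_mimetype(self.fpath or self.path, default="text/html")`, so that extension-less paths like "home" still default to HTML.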
License