A python-facing API for creating and interacting with ZIM files
python-libzim
The libzim module allows you to read and write ZIM files in Python. It provides a shallow Python interface on top of the C++ libzim library.
It is primarily used in openZIM scrapers like sotoki or youtube2zim.
Installation
pip install libzim
The PyPI package is available for x86_64 macOS and GNU/Linux only. It bundles a recent release of the C++ libzim.
On other platforms, you have to compile the C++ libzim from source first, then build this package, adjusting LD_LIBRARY_PATH.
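To decide up front whether `pip install libzim` is likely to find a prebuilt wheel, the platform constraint stated above can be checked with the standard library. This is only a rough sketch: the helper name `prebuilt_wheel_expected` is hypothetical, and it does not detect details such as a non-GNU libc on Linux or an unsupported Python version.

```python
import platform


def prebuilt_wheel_expected(system=None, machine=None):
    """Return True when the platform matches x86_64 macOS or Linux.

    Hypothetical helper: it only mirrors the platform statement from the
    installation notes and does not query PyPI.
    """
    system = system or platform.system()      # "Linux", "Darwin", "Windows", ...
    machine = machine or platform.machine()   # "x86_64", "arm64", "AMD64", ...
    return system in {"Linux", "Darwin"} and machine == "x86_64"


if __name__ == "__main__":
    if prebuilt_wheel_expected():
        print("A prebuilt wheel is likely available: pip install libzim")
    else:
        print("Expect to build the C++ libzim from source first")
```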
Contributions
git clone git@github.com:openzim/python-libzim.git && cd python-libzim
# python -m venv env && source env/bin/activate
pip install -U setuptools invoke
invoke download-libzim install-dev build-ext test
# invoke --list for available development helpers
See CONTRIBUTING.md for additional details, then open a ticket or submit a pull request on GitHub 🤗!
Usage
Read a ZIM file
from libzim.reader import Archive
from libzim.search import Query, Searcher
from libzim.suggestion import SuggestionSearcher
zim = Archive("test.zim")
print(f"Main entry is at {zim.main_entry.get_item().path}")
entry = zim.get_entry_by_path("home/fr")
print(f"Entry {entry.title} at {entry.path} is {entry.get_item().size}b.")
print(bytes(entry.get_item().content).decode("UTF-8"))
# searching using full-text index
search_string = "Welcome"
query = Query().set_query(search_string)
searcher = Searcher(zim)
search = searcher.search(query)
search_count = search.getEstimatedMatches()
print(f"there are {search_count} matches for {search_string}")
print(list(search.getResults(0, search_count)))
# accessing suggestions
search_string = "kiwix"
suggestion_searcher = SuggestionSearcher(zim)
suggestion = suggestion_searcher.suggest(search_string)
suggestion_count = suggestion.getEstimatedMatches()
print(f"there are {suggestion_count} matches for {search_string}")
print(list(suggestion.getResults(0, suggestion_count)))
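The example above fetches every match in a single `getResults(0, count)` call. For large archives it may be preferable to walk the results in pages. The sketch below assumes only the two methods used above (`getEstimatedMatches` and `getResults(start, count)`); the helper name `iter_result_pages` and the `page_size` parameter are hypothetical and not part of libzim.

```python
from typing import Iterator, List


def iter_result_pages(search, page_size: int = 50) -> Iterator[List[str]]:
    """Yield lists of results, at most page_size items at a time.

    `search` is expected to be the object returned by Searcher.search()
    or SuggestionSearcher.suggest(); any object exposing the same two
    methods works.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    total = search.getEstimatedMatches()
    start = 0
    while start < total:
        page = list(search.getResults(start, page_size))
        if not page:
            # The match count is an estimate, so stop when results run out.
            break
        yield page
        start += page_size
```

With the `search` object from the reading example, `for page in iter_result_pages(search, 20): print(page)` would print the results twenty at a time.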
Write a ZIM file
from libzim.writer import Creator, Item, StringProvider, FileProvider, Hint

class MyItem(Item):
    def __init__(self, title, path, content = "", fpath = None):
        super().__init__()
        self.path = path
        self.title = title
        self.content = content
        self.fpath = fpath

    def get_path(self):
        return self.path

    def get_title(self):
        return self.title

    def get_mimetype(self):
        return "text/html"

    def get_contentprovider(self):
        if self.fpath is not None:
            return FileProvider(self.fpath)
        return StringProvider(self.content)

    def get_hints(self):
        return {Hint.FRONT_ARTICLE: True}

content = """<html><head><meta charset="UTF-8"><title>Web Page Title</title></head>
<body><h1>Welcome to this ZIM</h1><p>Kiwix</p></body></html>"""

item = MyItem("Hello Kiwix", "home", content)
item2 = MyItem("Bonjour Kiwix", "home/fr", None, "home-fr.html")

with Creator("test.zim").config_indexing(True, "eng") as creator:
    creator.set_mainpath("home")
    creator.add_item(item)
    creator.add_item(item2)
    for name, value in {
        "creator": "python-libzim",
        "description": "Created in python",
        "name": "my-zim",
        "publisher": "You",
        "title": "Test ZIM",
    }.items():
        creator.add_metadata(name.title(), value)
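The `MyItem` class above always reports "text/html" from `get_mimetype`, which is only right for HTML content. When items are backed by files of different types, one option is to derive the MIME type from the file name with the standard library. This is a minimal sketch: the helper name `guess_mimetype` and the fallback value are assumptions, not part of libzim.

```python
import mimetypes


def guess_mimetype(path, default="application/octet-stream"):
    """Guess a MIME type from a file name, falling back to `default`.

    Hypothetical helper: it only inspects the file extension and never
    reads the file content.
    """
    mimetype, _encoding = mimetypes.guess_type(path)
    return mimetype or default


print(guess_mimetype("home-fr.html"))
print(guess_mimetype("file-without-extension"))
```

An `Item` subclass could then return `guess_mimetype(self.fpath)` from `get_mimetype` when `fpath` is set, and keep "text/html" for string content.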
License