A Python-facing API for creating and interacting with ZIM files
python-libzim
The Python bindings for libzim.
This library allows you to interact with .zim files via Python.
It just provides a shallow Python interface on top of the libzim C++ library (maintained by OpenZIM).
It is primarily used by sotoki.
Installation
# Install from PyPI: https://pypi.org/project/libzim/
pip3 install libzim
Quickstart
Reader API
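from libzim.reader import File
# Open the ZIM file and look up an article by its URL
f = File("test.zim")
article = f.get_article("article/url.html")
print(article.url, article.title)
# Only print the content of non-redirect articles
if not article.is_redirect():
    print(article.content)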
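When handling many articles, it can be convenient to wrap the attribute access above in a small helper. The sketch below is not part of libzim: `summarize_article` and the `SimpleNamespace` stand-in are hypothetical names used for illustration, and the helper only assumes an object exposing `url`, `title`, `is_redirect()` and `content` as in the snippet above.

```python
from types import SimpleNamespace


def summarize_article(article, preview_bytes=40):
    """Describe an article-like object (hypothetical helper, not a libzim API)."""
    summary = {
        "url": article.url,
        "title": article.title,
        "is_redirect": article.is_redirect(),
    }
    # Mirror the quickstart: only touch the content of non-redirect articles.
    if not summary["is_redirect"]:
        summary["preview"] = bytes(article.content)[:preview_bytes]
    return summary


# Stand-in for an object returned by f.get_article(...), so the sketch runs
# without a ZIM file; a real article is assumed to expose the same attributes.
fake_article = SimpleNamespace(
    url="article/url.html",
    title="Example",
    is_redirect=lambda: False,
    content=b"<html><body>Hello</body></html>",
)
print(summarize_article(fake_article))
```

With a real file, the same helper could be called as `summarize_article(f.get_article("article/url.html"))`, assuming the article object behaves as shown in the quickstart.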
Write API
See the example for basic usage of the writer API.
User Documentation
Setup: Ubuntu/Debian and macOS x86_64 (Recommended)
Install the Python libzim package from PyPI.
pip3 install libzim
The x86_64 Linux and macOS wheels automatically include the libzim shared library (libzim.so or libzim.dylib) and headers, but other platforms may need to install libzim and its headers manually.
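After installing, one quick way to confirm that the package can be located is to ask Python's import machinery. This is a minimal sketch using only the standard library; the function name `libzim_is_installed` is made up for this example.

```python
import importlib.util


def libzim_is_installed():
    """Return True if a top-level `libzim` package can be located."""
    # find_spec only locates the package; it does not load the compiled
    # extension, so a missing libzim shared library may still fail on import.
    return importlib.util.find_spec("libzim") is not None


if libzim_is_installed():
    print("libzim package found")
else:
    print("libzim package not found; try: pip3 install libzim")
```

Because the check does not load the extension itself, running `python -c "import libzim"` afterwards remains the more complete test.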
Installing the libzim dylib and headers manually
If you are not on a Linux or macOS x86_64 platform, you will have to install libzim manually, either by getting a prebuilt binary from https://download.openzim.org/release/libzim or by compiling libzim from source.
If you have not installed libzim in a standard directory, you will have to set LD_LIBRARY_PATH to allow Python to find the library.
Assuming you have extracted (or installed) the library in LIBZIM_DIR:
export LD_LIBRARY_PATH="${LIBZIM_DIR}/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
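Before exporting the variable, it may help to confirm that the directory really contains a libzim shared library. The sketch below assumes the `lib/x86_64-linux-gnu` layout used in the export above and file names of the form `libzim.so*` or `libzim*.dylib`; `find_libzim_libraries` is a hypothetical helper, and the demo runs against a throwaway directory with a placeholder file rather than a real installation.

```python
import tempfile
from pathlib import Path


def find_libzim_libraries(libzim_dir, subdir="lib/x86_64-linux-gnu"):
    """List libzim shared libraries under libzim_dir/subdir (hypothetical helper)."""
    lib_dir = Path(libzim_dir) / subdir
    if not lib_dir.is_dir():
        return []
    # Match libzim.so, versioned names such as libzim.so.<N>, and libzim*.dylib
    candidates = list(lib_dir.glob("libzim.so*")) + list(lib_dir.glob("libzim*.dylib"))
    return sorted(p.name for p in candidates)


# Demo on a throwaway directory that imitates the assumed layout
with tempfile.TemporaryDirectory() as tmp:
    lib_dir = Path(tmp) / "lib" / "x86_64-linux-gnu"
    lib_dir.mkdir(parents=True)
    (lib_dir / "libzim.so.6").touch()  # placeholder file, not a real library
    (lib_dir / "libother.so").touch()  # unrelated file that should be ignored
    found = find_libzim_libraries(tmp)
    missing = find_libzim_libraries(Path(tmp) / "does-not-exist")

print(found)
print(missing)
```

In practice you would pass your own LIBZIM_DIR; an empty result suggests that the path exported in LD_LIBRARY_PATH would not help Python find the library.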
Setup: Docker (Optional)
docker build . --tag openzim:python-libzim
# Run a custom script inside the container
docker run -it openzim:python-libzim ./some_example_script.py
# Or use the python repl interactively
docker run -it openzim:python-libzim
>>> import libzim
Developer Documentation
These instructions are for developers working on the python-libzim source code itself. If you are simply a user of the library and you don't intend to change its internal source code, follow the User Documentation instructions above instead.
Setup: Ubuntu/Debian
Note: Make sure you've installed the libzim dylib + headers first (see above).
apt install coreutils wget git ca-certificates \
g++ pkg-config libtool automake autoconf make meson ninja-build \
liblzma-dev zlib1g-dev libicu-dev libgumbo-dev libmagic-dev
pip3 install --upgrade pip pipenv
export CFLAGS="-I${LIBZIM_DIR}/include"
export LDFLAGS="-L${LIBZIM_DIR}/lib/x86_64-linux-gnu"
git clone https://github.com/openzim/python-libzim
cd python-libzim
python setup.py build_ext
pipenv install --dev
pipenv run pip install -e .
Setup: Docker
docker build . -f Dockerfile.dev --tag openzim:python-libzim-dev
docker run -it openzim:python-libzim-dev ./some_example_script.py
docker run -it openzim:python-libzim-dev
$ black . && flake8 . && pytest .
$ pipenv install --dev <newpackagehere>
$ python setup.py build_ext
$ python setup.py sdist bdist_wheel
$ python setup.py install
$ python -c "import libzim"
Common Tasks
Run Linters & Tests
# Autoformat code with black
black --exclude=setup.py .
# Lint and check for errors with flake8
flake8 --exclude=setup.py .
# Typecheck with mypy (optional)
mypy .
# Run tests
pytest .
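The commands above can also be chained from a small Python script that stops at the first failure. This is only a sketch: `run_checks`, `CHECKS` and `fake_runner` are names invented for the example, the optional mypy step is left out, and the demo uses a stand-in runner so that it executes without the tools installed.

```python
import subprocess

# Commands taken from the list above; mypy is optional there and omitted here.
CHECKS = [
    ["black", "--exclude=setup.py", "."],
    ["flake8", "--exclude=setup.py", "."],
    ["pytest", "."],
]


def run_checks(checks, runner=subprocess.call):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for command in checks:
        print("$", " ".join(command))
        code = runner(command)
        if code != 0:
            return code
    return 0


# Dry run with a stand-in runner that pretends flake8 fails
executed = []


def fake_runner(command):
    executed.append(command[0])
    return 1 if command[0] == "flake8" else 0


exit_code = run_checks(CHECKS, runner=fake_runner)
print(exit_code, executed)
```

Calling `run_checks(CHECKS)` with the default runner executes the real tools through `subprocess.call`, which raises `FileNotFoundError` if one of them is not installed.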
Rebuild Cython extension during development
rm libzim/libzim.cpp
rm -Rf build
rm -Rf *.so
python setup.py build_ext
python setup.py install
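The cleanup part of the steps above (the three `rm` commands) can be sketched in Python as well, which may be useful on systems without a POSIX shell. `clean_build_artifacts` is a hypothetical helper, and the demo runs on a throwaway directory with placeholder files rather than a real checkout.

```python
import shutil
import tempfile
from pathlib import Path


def clean_build_artifacts(project_root):
    """Delete generated build outputs under project_root and report what was removed."""
    root = Path(project_root)
    removed = []
    targets = [root / "libzim" / "libzim.cpp", root / "build", *sorted(root.glob("*.so"))]
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
        elif target.exists():
            target.unlink()
        else:
            continue  # nothing to remove; unlike plain `rm`, this is not an error
        removed.append(target.relative_to(root).as_posix())
    return removed


# Demo on a throwaway directory containing placeholder build outputs
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "libzim").mkdir()
    (root / "libzim" / "libzim.cpp").touch()
    (root / "build" / "temp").mkdir(parents=True)
    (root / "example.so").touch()  # placeholder name for a compiled extension
    removed = clean_build_artifacts(root)
    leftovers = sorted(p.name for p in root.iterdir())

print(removed)
print(leftovers)
```

After the cleanup, the `python setup.py build_ext` and `python setup.py install` steps are still run as shown above.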
Build package sdist and bdist_wheels for PyPI
python setup.py build_ext
python setup.py sdist bdist_wheel
# Upload to PyPI (caution: this is done automatically via GitHub Actions)
twine upload dist/*
Use a specific libzim dylib and headers when compiling python-libzim
export CFLAGS="-I${LIBZIM_DIR}/include"
export LDFLAGS="-L${LIBZIM_DIR}/lib/x86_64-linux-gnu"
export LD_LIBRARY_PATH="${LIBZIM_DIR}/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
python setup.py build_ext
python setup.py install
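When the build is driven from a script, the three exports above can be assembled into an environment dictionary instead. The following is a minimal sketch: `libzim_build_env` is a hypothetical helper, `/opt/libzim` is a made-up install location, and the `lib/x86_64-linux-gnu` subdirectory is taken from the exports above.

```python
import os


def libzim_build_env(libzim_dir, base_env=None, lib_subdir="lib/x86_64-linux-gnu"):
    """Return a copy of the environment with CFLAGS, LDFLAGS and LD_LIBRARY_PATH set."""
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = f"{libzim_dir}/{lib_subdir}"
    env["CFLAGS"] = f"-I{libzim_dir}/include"
    env["LDFLAGS"] = f"-L{lib_dir}"
    # Prepend, keeping any existing search path and avoiding a trailing colon
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


# /opt/libzim is a placeholder path used only for this demo
env = libzim_build_env("/opt/libzim", base_env={"LD_LIBRARY_PATH": "/usr/local/lib"})
print(env["CFLAGS"])
print(env["LDFLAGS"])
print(env["LD_LIBRARY_PATH"])
```

The resulting dictionary could then be passed along to the build, for example with `subprocess.run([sys.executable, "setup.py", "build_ext"], env=env, check=True)`.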
Further Reading
Related Projects
- https://github.com/openzim/sotoki
- https://framagit.org/mgautierfr/pyzim
- https://github.com/pediapress/pyzim
- https://github.com/jarondl/pyzimmer/blob/master/pyzimmer/zim_writer.py
Research
- https://github.com/cython/cython/wiki/AutoPxd
- https://www.youtube.com/watch?v=YReJ3pSnNDo
- https://github.com/openzim/zim-tools/blob/master/src/zimrecreate.cpp
- https://github.com/cython/cython/wiki/enchancements-inherit_CPP_classes
- https://groups.google.com/forum/#!topic/cython-users/vAB9hbLMxRg
Debugging
- https://cython.readthedocs.io/en/latest/src/userguide/debugging.html
- https://github.com/cython/cython/wiki/DebuggingTechniques
- https://stackoverflow.com/questions/2663841/python-tracing-a-segmentation-fault
- https://cython-devel.python.narkive.com/cW3Cn1th/debugging-a-segfault-in-a-cython-generated-module
- https://groups.google.com/forum/#!topic/cython-users/B_Sxj2NV1PE
Packaging
- https://download.openzim.org/release/libzim/
- https://cibuildwheel.readthedocs.io/en/stable/faq/
- https://github.com/pypa/manylinux
- https://github.com/RalfG/python-wheels-manylinux-build/blob/master/full_workflow_example.yml
- https://packaging.python.org/guides/packaging-binary-extensions/#publishing-binary-extensions
License