A Python-facing API for creating and interacting with ZIM files
python-libzim
The Python bindings for libzim.
This library allows you to interact with .zim files via Python.
It provides a thin Python interface on top of the libzim C++ library (maintained by OpenZIM).
It is primarily used by sotoki.
Installation
# Install from PyPI: https://pypi.org/project/libzim/
pip3 install libzim
Quickstart
Reader API
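from libzim.reader import File
# Open an existing ZIM file
f = File("test.zim")
# Look up an article by its URL
article = f.get_article("article/url.html")
print(article.url, article.title)
# Only print the content of non-redirect articles
if not article.is_redirect():
    print(article.content)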
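The lookup above can be wrapped in a small helper. The following is a minimal sketch, not part of libzim: article_text is a hypothetical name, and it relies only on the get_article, is_redirect, and content members used in the quickstart. It makes no assumption about what get_article does for a URL that is missing from the file.

```python
def article_text(zim_file, url):
    """Return the content of the article at `url`, or None for a redirect.

    `zim_file` is expected to be a libzim.reader.File, or any object that
    exposes the same get_article() method used in the quickstart.
    This helper is hypothetical and not part of libzim.
    """
    article = zim_file.get_article(url)
    if article.is_redirect():
        # Redirects are skipped, as in the quickstart
        return None
    return article.content
```

With libzim installed, it would be called as article_text(File("test.zim"), "article/url.html").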
Write API
See the example for basic usage of the writer API.
User Documentation
Setup: Ubuntu/Debian and macOS x86_64 (Recommended)
Install the python libzim package from PyPI.
pip3 install libzim
The x86_64 Linux and macOS wheels automatically include the libzim dylib (libzim.so or libzim.dylib) and headers, but on other platforms you may need to install libzim and its headers manually.
Installing the libzim dylib and headers manually
If you are not on a Linux or macOS x86_64 platform, you will have to install libzim manually, either by getting a prebuilt binary from https://download.openzim.org/release/libzim or by compiling libzim from source.
If you have not installed libzim in a standard directory, you will have to set LD_LIBRARY_PATH so that Python can find the library.
Assuming you have extracted (or installed) the library in LIBZIM_DIR:
export LD_LIBRARY_PATH="${LIBZIM_DIR}/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
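Before exporting the variable, it can help to confirm which directory under LIBZIM_DIR actually holds the shared library. The sketch below uses only the standard library; find_libzim_files is a hypothetical helper, and the file name patterns (libzim.so* and libzim*.dylib) are an assumption based on the libzim.(so|dylib) naming mentioned above.

```python
from pathlib import Path


def find_libzim_files(libzim_dir):
    """Return sorted paths under <libzim_dir>/lib that look like the libzim dylib.

    Hypothetical helper: the name patterns are an assumption, not something
    libzim itself defines.
    """
    lib_root = Path(libzim_dir) / "lib"
    if not lib_root.is_dir():
        return []
    matches = [
        path
        for path in lib_root.rglob("libzim*")
        if path.is_file() and (".so" in path.name or path.name.endswith(".dylib"))
    ]
    return sorted(matches)
```

The parent directory of each returned path is a candidate for LD_LIBRARY_PATH. An empty list suggests that LIBZIM_DIR does not point at the extracted library.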
Setup: Docker (Optional)
docker build . --tag openzim:python-libzim
# Run a custom script inside the container
docker run -it openzim:python-libzim ./some_example_script.py
# Or use the python repl interactively
docker run -it openzim:python-libzim
>>> import libzim
Developer Documentation
These instructions are for developers working on the python-libzim source code itself. If you are simply a user of the library and you don't intend to change its internal source code, follow the User Documentation instructions above instead.
Setup: Ubuntu/Debian
Note: Make sure you've installed the libzim dylib and headers first (see above).
apt install coreutils wget git ca-certificates \
g++ pkg-config libtool automake autoconf make meson ninja-build \
liblzma-dev zlib1g-dev libicu-dev libgumbo-dev libmagic-dev
pip3 install --upgrade pip pipenv
export CFLAGS="-I${LIBZIM_DIR}/include"
export LDFLAGS="-L${LIBZIM_DIR}/lib/x86_64-linux-gnu"
git clone https://github.com/openzim/python-libzim
cd python-libzim
python setup.py build_ext
pipenv install --dev
pipenv run pip install -e .
Setup: Docker
docker build . -f Dockerfile.dev --tag openzim:python-libzim-dev
docker run -it openzim:python-libzim-dev ./some_example_script.py
docker run -it openzim:python-libzim-dev
$ black . && flake8 . && pytest .
$ pipenv install --dev <newpackagehere>
$ python setup.py build_ext
$ python setup.py sdist bdist_wheel
$ python setup.py install
$ python -c "import libzim"
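The final import command above acts as a smoke test for the build. A slightly more explicit check is sketched below using only the standard library; is_installed is a hypothetical helper name. It only locates the module on the import path and does not load the compiled extension, so a missing libzim dylib would still surface only when the module is actually imported.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located on the current import path.

    Hypothetical helper: it finds the module without importing it.
    """
    return importlib.util.find_spec(module_name) is not None
```

Calling is_installed("libzim") after the install step should return True.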
Common Tasks
Run Linters & Tests
# Autoformat code with black
black --exclude=setup.py .
# Lint and check for errors with flake8
flake8 --exclude=setup.py .
# Typecheck with mypy (optional)
mypy .
# Run tests
pytest .
Rebuild Cython extension during development
rm libzim/libzim.cpp
rm -Rf build
rm -Rf *.so
python setup.py build_ext
python setup.py install
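The cleanup part of the rebuild can also be scripted in Python, which avoids relying on shell globbing. This is a minimal sketch that mirrors the three rm commands above; clean_build_artifacts is a hypothetical helper, not something shipped with python-libzim, and it should be followed by the same build_ext and install commands.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root="."):
    """Remove generated Cython output and return the list of removed paths.

    Hypothetical helper mirroring the rm commands above.
    """
    root = Path(project_root)
    removed = []

    # Generated C++ source: libzim/libzim.cpp
    generated_cpp = root / "libzim" / "libzim.cpp"
    if generated_cpp.is_file():
        generated_cpp.unlink()
        removed.append(generated_cpp)

    # The build/ directory created by setup.py
    build_dir = root / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        removed.append(build_dir)

    # Compiled extensions in the project root only (files, not directories)
    for shared_object in sorted(root.glob("*.so")):
        if shared_object.is_file():
            shared_object.unlink()
            removed.append(shared_object)

    return removed
```

Paths that do not exist are skipped silently, like rm -f.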
Build package sdist and bdist_wheels for PyPI
python setup.py build_ext
python setup.py sdist bdist_wheel
# Upload to PyPI (caution: this is done automatically via GitHub Actions)
twine upload dist/*
Use a specific libzim dylib and headers when compiling python-libzim
export CFLAGS="-I${LIBZIM_DIR}/include"
export LDFLAGS="-L${LIBZIM_DIR}/lib/x86_64-linux-gnu"
export LD_LIBRARY_PATH="${LIBZIM_DIR}/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
python setup.py build_ext
python setup.py install
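When the build is driven from a Python script rather than an interactive shell, the same three variables can be assembled into an environment dictionary. The sketch below assumes the lib/x86_64-linux-gnu layout used in the exports above; libzim_build_env and its parameters are hypothetical names.

```python
import os


def libzim_build_env(libzim_dir, base_env=None, lib_subdir="lib/x86_64-linux-gnu"):
    """Return a copy of the environment with CFLAGS, LDFLAGS and
    LD_LIBRARY_PATH pointing at a specific libzim installation.

    Hypothetical helper; `lib_subdir` follows the layout used in the exports above.
    """
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = f"{libzim_dir}/{lib_subdir}"
    env["CFLAGS"] = f"-I{libzim_dir}/include"
    env["LDFLAGS"] = f"-L{lib_dir}"
    # Prepend to any existing value. Unlike the shell export, no trailing
    # colon is left behind when LD_LIBRARY_PATH was previously unset.
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env
```

The result can be passed as the env argument of subprocess.run when invoking python setup.py build_ext, for example with a libzim_dir such as /opt/libzim (a made-up path).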
Further Reading
Related Projects
- https://github.com/openzim/sotoki
- https://framagit.org/mgautierfr/pyzim
- https://github.com/pediapress/pyzim
- https://github.com/jarondl/pyzimmer/blob/master/pyzimmer/zim_writer.py
Research
- https://github.com/cython/cython/wiki/AutoPxd
- https://www.youtube.com/watch?v=YReJ3pSnNDo
- https://github.com/openzim/zim-tools/blob/master/src/zimrecreate.cpp
- https://github.com/cython/cython/wiki/enchancements-inherit_CPP_classes
- https://groups.google.com/forum/#!topic/cython-users/vAB9hbLMxRg
Debugging
- https://cython.readthedocs.io/en/latest/src/userguide/debugging.html
- https://github.com/cython/cython/wiki/DebuggingTechniques
- https://stackoverflow.com/questions/2663841/python-tracing-a-segmentation-fault
- https://cython-devel.python.narkive.com/cW3Cn1th/debugging-a-segfault-in-a-cython-generated-module
- https://groups.google.com/forum/#!topic/cython-users/B_Sxj2NV1PE
Packaging
- https://download.openzim.org/release/libzim/
- https://cibuildwheel.readthedocs.io/en/stable/faq/
- https://github.com/pypa/manylinux
- https://github.com/RalfG/python-wheels-manylinux-build/blob/master/full_workflow_example.yml
- https://packaging.python.org/guides/packaging-binary-extensions/#publishing-binary-extensions
License