A simple tool to extract articles from a .zim file into .txt files.
Project description
zim2txt is a Python module that reads through a .zim file and writes each article it contains to its own .txt file. It is designed for Linux and also works under WSL (Windows Subsystem for Linux). The module depends on zim-tools, which must be installed beforehand (sudo apt-get install zim-tools). Here is how to use the module:
import zim2txt
zim2txt.ZimTools.Export("path to the .zim file", "path to a temporary folder", "path where the .txt files are saved", "encoding")
The arguments, in order:
- Path to the .zim file.
- Path to a temporary folder that is deleted afterwards. Under WSL the author used /usr/games/newfolder, because folders outside the root directory did not work. If other folders work for you, any of them can be used.
- Path where the .txt files are saved. Do not use the same path as the temporary folder.
- Encoding method. Defaults to utf8.
# Example
import zim2txt
# The output path /data/articles_txt is illustrative; the encoding argument is omitted, so the default utf8 is used.
zim2txt.ZimTools.Export("/data/articles.zim", "/usr/games/newfolder", "/data/articles_txt")
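The requirements described above (zim-tools installed, separate temporary and output folders) can be checked before calling the module. The following is a minimal sketch, not part of zim2txt: the helper names preflight and export_checked are hypothetical, and it assumes the module relies on the zimdump binary that ships with zim-tools, which the description does not confirm.

```python
import shutil
from pathlib import Path


def preflight(zim_path, temp_dir, txt_dir, tool="zimdump"):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # Assumption: zim2txt needs a zim-tools binary such as zimdump on PATH.
    if shutil.which(tool) is None:
        problems.append(f"{tool} not found on PATH (install zim-tools)")
    if not Path(zim_path).is_file():
        problems.append(f"no such .zim file: {zim_path}")
    # The description asks for different temporary and output folders.
    if Path(temp_dir).resolve() == Path(txt_dir).resolve():
        problems.append("temporary and output folders must differ")
    return problems


def export_checked(zim_path, temp_dir, txt_dir, encoding="utf8"):
    """Run the checks, then call ZimTools.Export with the documented arguments."""
    problems = preflight(zim_path, temp_dir, txt_dir)
    if problems:
        raise RuntimeError("; ".join(problems))
    import zim2txt  # imported lazily so preflight() works without the package
    zim2txt.ZimTools.Export(zim_path, temp_dir, txt_dir, encoding)
```

Calling export_checked with the three paths fails early with a readable message instead of partway through an export.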
Download files
Source Distribution
zim2txt-1.0.0.tar.gz (2.2 kB)
File details
Details for the file zim2txt-1.0.0.tar.gz.
File metadata
- Download URL: zim2txt-1.0.0.tar.gz
- Upload date:
- Size: 2.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 923c0b6edd045434e0e3982cba897e6039e5303dbdb3f0466f89d88cfee7dd0c |
MD5 | c4eb8fe4be35ae7144558cd8ba686733 |
BLAKE2b-256 | c3893ecc59073dfb3877110cb51b4343178d72f2e3ee5ed29459e24e33b86d02 |
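A downloaded copy of zim2txt-1.0.0.tar.gz can be compared against the SHA256 digest listed above. This is a small sketch using only the Python standard library; the function names sha256_of and verify are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest of zim2txt-1.0.0.tar.gz as listed in the table above.
EXPECTED_SHA256 = "923c0b6edd045434e0e3982cba897e6039e5303dbdb3f0466f89d88cfee7dd0c"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file at path matches the expected digest."""
    return sha256_of(path) == expected.lower()
```

For example, verify("zim2txt-1.0.0.tar.gz") returns True only when the file in the current directory matches the published digest.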