Use Tesseract OCR to convert Hebrew PDFs to text files and then translate them.
Project description
Transklate - A small tool to convert Hebrew PDFs to TXT and translate them
The purpose of this small tool is to automate the following two processes:
- Transcribe a PDF written in Hebrew into a TXT file using Tesseract.
  - The PDF is converted to PNGs in a temporary folder.
  - Each image is then converted to a string by Tesseract.
- Translate the TXT file into the language of your choice using Google Translate.
At the moment the translation is not very good (and not as good as you'd expect from Google Translate online).
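The two steps described above can be sketched roughly as follows. This is an illustration under assumptions, not the package's actual code: the OCR step assumes the third-party `pdf2image` and `pytesseract` packages (plus the Poppler and Tesseract binaries with the Hebrew `heb` language data), and the translation step is left as a caller-supplied function because this description does not say which Google Translate client is used. The names `pdf_to_text`, `translate_text`, `render_pages`, `ocr_page`, and `translate_fn` are hypothetical.

```python
import tempfile
from typing import Callable, Iterable, Optional


def pdf_to_text(pdf_path: str,
                render_pages: Optional[Callable[[str, str], Iterable]] = None,
                ocr_page: Optional[Callable[[object], str]] = None) -> str:
    """OCR a PDF: render its pages to PNGs in a temp folder, then read each one."""
    if render_pages is None:
        # Assumption: pdf2image (which needs Poppler) renders the pages.
        from pdf2image import convert_from_path

        def render_pages(path, folder):
            return convert_from_path(path, dpi=300, output_folder=folder, fmt="png")

    if ocr_page is None:
        # Assumption: pytesseract with the "heb" language data installed.
        import pytesseract

        def ocr_page(image):
            return pytesseract.image_to_string(image, lang="heb")

    with tempfile.TemporaryDirectory() as tmp_dir:
        pages = render_pages(pdf_path, tmp_dir)
        # Run OCR inside the block: the PNGs are deleted when it exits.
        return "\n".join(ocr_page(page) for page in pages)


def translate_text(text: str, translate_fn: Callable[[str], str]) -> str:
    """Translate line by line, leaving blank lines untouched."""
    lines = text.split("\n")
    return "\n".join(translate_fn(line) if line.strip() else line for line in lines)
```

With a real translator, `translate_fn` would wrap a single call to whichever client you choose; passing the two OCR callables explicitly also makes the pipeline easy to exercise without Tesseract installed.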
Installation
pip install transklate
Basic CLI use
transklate <file_name.pdf> --lang en
For the output language, use a code from the Google Translate language code list.
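For readers who want to see how a command line of this shape can be parsed, here is a minimal `argparse` sketch. It is not taken from the package: the function name `build_parser`, the default value for `--lang`, and the `.txt` output naming are all assumptions made for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring `transklate <file_name.pdf> --lang en`."""
    parser = argparse.ArgumentParser(prog="transklate")
    parser.add_argument("pdf", type=Path, help="Hebrew PDF to transcribe")
    # Assumption: "en" as the default is illustrative only.
    parser.add_argument("--lang", default="en",
                        help="target language code, e.g. en or fr")
    return parser


args = build_parser().parse_args(["file_name.pdf", "--lang", "en"])
# Assumption: the transcription is written next to the PDF with a .txt suffix.
txt_path = args.pdf.with_suffix(".txt")
print(args.lang, txt_path)
```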
Download files
File details
Details for the file transklate-0.1.0.tar.gz.
File metadata
- Download URL: transklate-0.1.0.tar.gz
- Upload date:
- Size: 10.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3520fdd46881575643726c4900e56376bd952ad3195db226f1444074217ba966 |
| MD5 | 486a9293c7a58aebd35481cecf86a01d |
| BLAKE2b-256 | b5bdf1a4df21cd491ecf6d73572fad4426d86178c8c78e313ffb3d9fe778ac9a |
File details
Details for the file transklate-0.1.0-py3-none-any.whl.
File metadata
- Download URL: transklate-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ef7372d1da230c975a78f2805d03a914c1622e907c565237f96c276bf3a15c1e |
| MD5 | d7ac5e26042341efeedc6e1a86e02f95 |
| BLAKE2b-256 | ae57f9b01e8cb93f4a56317fd2d61a433059f9fbbd8e4576fe49ab863ffa5ded |
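A downloaded file can be checked against the SHA256 digests listed above with a few lines of standard-library Python. This is a minimal sketch; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the constant is the digest published for transklate-0.1.0.tar.gz.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 published for transklate-0.1.0.tar.gz
EXPECTED_SDIST_SHA256 = "3520fdd46881575643726c4900e56376bd952ad3195db226f1444074217ba966"


def matches_published_hash(path: str, expected: str = EXPECTED_SDIST_SHA256) -> bool:
    """True if the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

For the wheel, pass its own SHA256 digest as `expected` instead of relying on the default.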