A tool to translate scanned books
polybiblioglot
An OCR tool that converts book scans into text and automatically translates them.
Installation / Setup
Requirements
Tesseract
polybiblioglot uses Tesseract for OCR. You will need to follow the steps described here to install Tesseract.
On macOS, you may find this gist useful.
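Before launching the app, it can help to confirm that the tesseract executable is reachable from your shell. The following is a minimal sketch using only the Python standard library; find_tool is a hypothetical helper name and is not part of polybiblioglot.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("tesseract")
    if path is None:
        print("tesseract was not found on PATH; install it before running OCR.")
    else:
        print(f"tesseract found at {path}")
```

If the check reports that tesseract is missing right after an install, opening a new terminal usually refreshes PATH.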
Poppler
Poppler is a PDF renderer. Here it is used to convert PDFs to images for processing. If you are only converting images, it isn't needed. Please note that the program may crash if you attempt to convert a PDF without Poppler installed.
The pdf2image GitHub page explains how to install Poppler on your platform.
If you are on macOS and have Homebrew installed, it's as simple as brew install poppler
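Because a missing Poppler install only matters for PDF input, a small guard can decide up front whether a given file can be processed. This is a sketch under assumptions: it assumes that Poppler's pdftoppm command-line utility is what needs to be on PATH, and is_pdf and can_process are hypothetical helpers, not functions from polybiblioglot.

```python
import shutil
from pathlib import Path


def is_pdf(path):
    """Return True when the file name ends in .pdf (case-insensitive)."""
    return Path(path).suffix.lower() == ".pdf"


def can_process(path, which=shutil.which):
    """Images need nothing extra; PDFs need Poppler's pdftoppm on PATH (assumption)."""
    if not is_pdf(path):
        return True
    return which("pdftoppm") is not None


if __name__ == "__main__":
    for name in ("scan.png", "book.pdf"):
        status = "ok" if can_process(name) else "needs Poppler"
        print(f"{name}: {status}")
```

The which parameter exists only so the lookup can be swapped out when testing on a machine without Poppler.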
Installation
Clone the repository:
git clone https://github.com/bruno-robert/polybiblioglot.git
Change into the project directory:
cd polybiblioglot
(Optional) Create a Python virtual environment:
python -m venv env
Then activate it:
source ./env/bin/activate
Install the Python dependencies:
pip install -r requirements.txt
Running polybiblioglot
To run polybiblioglot, execute the main.py file:
python main.py
Notes and limitations (for now)
Limitations
- The OCR method used is optimized for high accuracy rather than speed. I might add an option to change this in the future.
Notes
- All computationally expensive or I/O-intensive tasks run asynchronously, which keeps the UI snappy. I'm currently using the DearPyGUI asynchronous call method, which will be deprecated in the next version. A migration to Python's built-in threading library will be needed at that point.
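To illustrate the threading-based approach mentioned in the note, here is a minimal sketch of running an expensive task on a worker thread and handing its result to a callback. It is not the project's code: run_in_background and slow_ocr are hypothetical names, and slow_ocr only simulates a long OCR call with a short sleep.

```python
import threading
import time


def run_in_background(task, on_done, *args):
    """Run task(*args) on a worker thread and pass its result to on_done."""

    def worker():
        on_done(task(*args))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def slow_ocr(page_name):
    # Hypothetical stand-in for an expensive OCR call.
    time.sleep(0.1)
    return f"text of {page_name}"


if __name__ == "__main__":
    results = []
    thread = run_in_background(slow_ocr, results.append, "page-1.png")
    # The caller (for example a UI loop) stays free while the worker runs.
    thread.join()
    print(results)
```

Note that the callback runs on the worker thread, so in a real GUI any widget updates may need to be handed back to the main thread, depending on what the toolkit allows.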
Project details
Download files
File details
Details for the file polybiblioglot-0.1.1.tar.gz.
File metadata
- Download URL: polybiblioglot-0.1.1.tar.gz
- Upload date:
- Size: 21.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.5 CPython/3.8.5 Darwin/20.3.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | bff02ccd668e9f65fc247e0095b4ba35f4f90dfa764bd17aab5ef81405d53bbc
MD5 | 2c34114bd576440e5dbf37666a6b8ee7
BLAKE2b-256 | caa42b88b731ee4e2fb3ad74163a42ba09aac9d0dbea2f8ae12bf37068788d55
File details
Details for the file polybiblioglot-0.1.1-py3-none-any.whl.
File metadata
- Download URL: polybiblioglot-0.1.1-py3-none-any.whl
- Upload date:
- Size: 22.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.5 CPython/3.8.5 Darwin/20.3.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | af8c9706fd868d0b618cf36b5dd245eddba3666b614423a4b73ec39bb45b3c08
MD5 | ba1403a55565dbbc523e123e39008397
BLAKE2b-256 | fcc816d842609dc18b3ce8f67ef3b4f7a1d30708b36044954469477bedfeea5f