A tool to translate scanned books

polybiblioglot

An OCR tool that converts book scans into text and automatically translates them.

Installation / Setup

Requirements

Tesseract

polybiblioglot uses Tesseract for OCR. You will need to follow the steps described here to install Tesseract.

On macOS, you may find this gist useful.
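Before launching the application, it can help to confirm that the Tesseract binary is reachable. The following is a minimal sketch, not part of polybiblioglot itself: the helper name tesseract_version is hypothetical, and it assumes the executable is called tesseract and is on your PATH.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if it is not found.

    Hypothetical helper for illustration; polybiblioglot may check differently.
    """
    exe = shutil.which("tesseract")  # look the binary up on PATH
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some Tesseract builds print the version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version if version else "Tesseract not found on PATH")
```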

Poppler

Poppler is a PDF renderer. Here it is used to convert PDFs to images for processing. If you are only converting images, it isn't needed. Please note that the program may crash if you attempt to convert a PDF without Poppler installed.

The pdf2image GitHub page explains how to install Poppler for your platform. If you are on macOS and have Homebrew installed, it's as simple as: brew install poppler
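The crash described above can be avoided by checking for Poppler before attempting a PDF conversion. The sketch below is an illustration under assumptions, not the project's actual code: needs_poppler and load_pages are hypothetical names, and the check relies on Poppler's pdftoppm tool being on PATH (pdf2image also accepts an explicit poppler_path argument, which this check does not cover).

```python
import shutil
from pathlib import Path


def needs_poppler(path):
    """Only PDF inputs require Poppler; plain images do not."""
    return Path(path).suffix.lower() == ".pdf"


def load_pages(path):
    """Return a list of PIL images for a PDF or a single image file.

    Hypothetical helper: raises a readable error instead of crashing
    when a PDF is given but Poppler is missing.
    """
    if not needs_poppler(path):
        from PIL import Image  # Pillow, assumed installed alongside pdf2image
        return [Image.open(path)]
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("Poppler not found on PATH; install it to convert PDFs.")
    from pdf2image import convert_from_path
    return convert_from_path(path)  # one PIL image per PDF page
```

The third-party imports are deferred to the point of use, so image-only workflows never touch pdf2image.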

Installation

Clone the repository: git clone https://github.com/bruno-robert/polybiblioglot.git

cd into it: cd polybiblioglot

(Optional) Create a Python virtual environment: python -m venv env, then activate it: source ./env/bin/activate

Install the Python dependencies: pip install -r requirements.txt

Running polybiblioglot

To run polybiblioglot, execute the main.py file: python main.py

Notes and limitations (for now)

Limitations

  • The OCR method used is optimized for high accuracy rather than speed. I might add an option to change this in the future.
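One way such a speed/accuracy switch could look is sketched below. This is speculative: it assumes OCR goes through pytesseract, that separate "fast" and "best" Tesseract model directories have been downloaded, and the directory names, tesseract_config, and ocr_page are all hypothetical. Only the --tessdata-dir option and pytesseract.image_to_string are existing interfaces.

```python
def tesseract_config(mode, fast_dir="tessdata_fast", best_dir="tessdata_best"):
    """Build a Tesseract config string selecting a model directory.

    mode is "fast" or "accurate"; the directory names are placeholders.
    """
    dirs = {"fast": fast_dir, "accurate": best_dir}
    if mode not in dirs:
        raise ValueError(f"unknown mode: {mode!r}")
    return f'--tessdata-dir "{dirs[mode]}"'


def ocr_page(image_path, mode="accurate", lang="eng"):
    """Run OCR on one image using the model directory chosen by mode."""
    from PIL import Image
    import pytesseract  # assumed OCR wrapper; not confirmed by this README
    return pytesseract.image_to_string(
        Image.open(image_path), lang=lang, config=tesseract_config(mode)
    )
```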

Notes

  • All computationally expensive or I/O-intensive tasks are run asynchronously. This keeps the UI snappy. I'm currently using the DearPyGUI asynchronous call method, which will be deprecated in the next version. A migration to Python's built-in threading library will be needed at that point.
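The migration mentioned above could follow a pattern like this one. It is a minimal sketch using only the standard library, with hypothetical names (run_async, slow_ocr) and a stand-in task; it does not reflect how polybiblioglot or DearPyGUI actually schedule work.

```python
import queue
import threading


def run_async(task, data, results):
    """Run task(data) on a background thread and put its result on a queue."""
    def worker():
        results.put(task(data))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def slow_ocr(page):
    """Stand-in for an expensive OCR or translation call."""
    return f"text of {page}"


if __name__ == "__main__":
    results = queue.Queue()
    thread = run_async(slow_ocr, "page-1", results)
    thread.join()  # a real UI would poll the queue each frame instead of blocking
    print(results.get())
```

In a UI loop, the main thread would call results.get_nowait() on each frame and handle queue.Empty, so rendering never blocks on the worker.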

Source Distribution

polybiblioglot-0.2.0.tar.gz (23.4 kB)

Uploaded Source

Built Distribution

polybiblioglot-0.2.0-py3-none-any.whl (36.4 kB)

Uploaded Python 3

File details

Details for the file polybiblioglot-0.2.0.tar.gz.

File metadata

  • Download URL: polybiblioglot-0.2.0.tar.gz
  • Upload date:
  • Size: 23.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.5 CPython/3.8.5 Darwin/20.3.0

File hashes

Hashes for polybiblioglot-0.2.0.tar.gz
Algorithm Hash digest
SHA256 bd3521c2c5c2d403955af9a76ca84f40cb07d8ef4a2f2e669890a8bd71de4d19
MD5 0cbcf171164ae858c3d9e40b59d8061b
BLAKE2b-256 9223496c350edf8c8def7adfdaad049d000c41f45b6167e44dad45fa92069d81

File details

Details for the file polybiblioglot-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: polybiblioglot-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 36.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.5 CPython/3.8.5 Darwin/20.3.0

File hashes

Hashes for polybiblioglot-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0f901d59d210a56bf984d3339d4f856dbe90367fe8c5bc207bce683cc0b5d72d
MD5 f5b5321de51fc10ea28d54794a60b668
BLAKE2b-256 b6cba543d76ea3f02d14ad7ed8acddbc1da34239bd188c5d578bec4b176ff0ec
