A tool to translate scanned books

polybiblioglot

An OCR tool that converts book scans into text and automatically translates them.

Installation / Setup

Requirements

Tesseract

polybiblioglot uses Tesseract for OCR. You will need to follow the steps described here to install Tesseract.

On macOS, you may find this gist useful.

Poppler

Poppler is a PDF renderer. polybiblioglot uses it to convert PDFs to images for processing. If you are only converting images, it isn't needed. Please note that the program may crash if you attempt to convert a PDF without Poppler installed.

The pdf2image GitHub page explains how to install Poppler on each platform. If you are on macOS and have Homebrew installed, it's as simple as: brew install poppler
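Because a missing Poppler install may crash the program, it can help to check for the external tools before starting a conversion. The following is a minimal sketch, not part of polybiblioglot: the helper name missing_tools is hypothetical, and the executable names are assumptions (tesseract for the Tesseract command-line tool, pdftoppm for one of the Poppler utilities that pdf2image calls).

```python
import shutil

# Assumed executable names: "tesseract" is the Tesseract CLI and
# "pdftoppm" is one of the Poppler utilities used by pdf2image.
REQUIRED_TOOLS = {
    "tesseract": "OCR",
    "pdftoppm": "PDF conversion (Poppler)",
}


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of required executables that are not on PATH."""
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    for name in missing:
        print(f"Missing '{name}', needed for {REQUIRED_TOOLS[name]}")
    if not missing:
        print("All external tools found.")
```

If only images are being converted, a missing pdftoppm can be ignored, since Poppler is needed only for PDF input.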

Installation

Clone the repository: git clone https://github.com/bruno-robert/polybiblioglot.git

Change into it: cd polybiblioglot

(Optional) Create a Python virtual environment: python -m venv env, then source ./env/bin/activate

Install the Python dependencies: pip install -r requirements.txt

Running polybiblioglot

To run polybiblioglot, execute the main.py file: python main.py

Notes and limitations (for now)

Limitations

  • The OCR method used is optimized for high accuracy, not speed. I might add an option to change this in the future.

Notes

  • All computationally expensive or I/O-intensive tasks are run asynchronously, which keeps the UI responsive. I'm currently using the DearPyGui asynchronous call method, which will be deprecated in the next version. A migration to Python's built-in threading library will be needed at that point.
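The migration to Python's threading library mentioned above could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: run_in_background and slow_ocr are hypothetical names, and slow_ocr stands in for an expensive OCR or translation call. The worker thread posts its outcome to a queue, which the UI loop could poll without blocking.

```python
import queue
import threading


def run_in_background(task, *args, results):
    """Run task(*args) on a worker thread and post the outcome to `results`."""

    def worker():
        try:
            results.put(("ok", task(*args)))
        except Exception as exc:  # report failures instead of losing them
            results.put(("error", exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def slow_ocr(page_name):
    # Hypothetical stand-in for an expensive OCR call.
    return f"text of {page_name}"


if __name__ == "__main__":
    results = queue.Queue()
    run_in_background(slow_ocr, "page-1.png", results=results)
    # A UI loop would call results.get_nowait() once per frame instead of blocking.
    status, value = results.get(timeout=5)
    print(status, value)
```

Passing results back through a queue keeps all UI updates on the main thread, which is the usual safe pattern for GUI toolkits.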
