
Image cleaning and OCR improvement package in Python using OpenCV.


# Athento-imaging

Athento-Imaging is a package developed using Python and OpenCV to improve OCR in documents. Among the documents tested using this package are: passports, bills, delivery notes, budgets, and other common documents.

This package includes several functions to transform images:

  • Remove coloured background.

  • Remove “salt and pepper” noise.

  • Line detection in documents (two approaches).

  • Remove lines in documents.

  • Simple line analysis (which lines are horizontal and vertical, distance between lines, etc.).

  • Template matching improved using pyramid transformations.
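As an illustration of the "salt and pepper" item above, here is a minimal sketch of a median filter, the classic remedy for that kind of noise. It is written in plain Python so it runs without dependencies and is not this package's API; the name `median_filter` and its `ksize` parameter are hypothetical. With OpenCV, the equivalent operation would normally be done with `cv2.medianBlur`.

```python
from statistics import median_low


def median_filter(image, ksize=3):
    """Return a filtered copy of a 2-D grayscale image (a list of rows).

    Each pixel is replaced by the median of its ksize x ksize neighbourhood.
    At the borders only the neighbours that exist are used.
    """
    if ksize < 1 or ksize % 2 == 0:
        raise ValueError("ksize must be a positive odd number")

    radius = ksize // 2
    height = len(image)
    width = len(image[0]) if height else 0

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped to the image bounds.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns a value that is present in the window.
            row.append(median_low(window))
        result.append(row)
    return result


# A light page with one "pepper" (0) and one "salt" (255) pixel.
noisy = [[200] * 5 for _ in range(5)]
noisy[2][2] = 0
noisy[1][3] = 255
cleaned = median_filter(noisy)
```

Isolated outliers disappear because a single extreme value never reaches the middle of the sorted neighbourhood. Larger kernels remove bigger specks but also erode thin strokes, so the kernel size is a trade-off.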

You can check everything out here: [Athento-Imaging Summary](<docs/SUMMARY.md>)

The quality of the output and its OCR performance will depend on:

  • The quality of the source document: the higher the quality, the better the OCR results.

  • The amount of noise in the document and where it’s located.

  • The location of the document’s watermarks (if any).

  • The colour of the document. Light colours are easier to remove than dark ones, because the pixel values of a dark background are close to those of the text.

  • Your personal experience in image transformation, as you might need to combine several operations or change the parameter values significantly.
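The point about background colour can be illustrated with a simple global threshold. This is a sketch under stated assumptions, not the package's implementation: the function name `remove_background`, the gray levels, and the threshold are all made up for the example. In OpenCV a similar operation would typically rely on `cv2.threshold`.

```python
WHITE = 255


def remove_background(image, threshold):
    """Paint every pixel brighter than `threshold` white.

    Darker pixels are assumed to be text and are left untouched.
    """
    return [[WHITE if pixel > threshold else pixel for pixel in row]
            for row in image]


TEXT = 30  # assumed gray level of the printed text

# Light background (210): far from the text value, so a mid threshold works.
light_page = [[210, TEXT, 210],
              [210, 210, TEXT]]
light_result = remove_background(light_page, threshold=120)

# Dark background (70): close to the text value, so the same threshold
# treats it as text and leaves the page unchanged.
dark_page = [[70, TEXT, 70],
             [70, 70, TEXT]]
dark_result = remove_background(dark_page, threshold=120)
```

With the light page the background turns white and the text survives. With the dark page the threshold would have to sit in the narrow gap between 30 and 70, where a little noise or uneven lighting is enough to misclassify pixels, which is why parameter tuning matters more for dark backgrounds.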


This version

0.1


Source Distribution

athentoimaging-0.1.tar.gz (400.4 kB)


Hashes for athentoimaging-0.1.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 4ff1b2c1ffca0a351658b2c126a88c55c1ad7b55afa9035da0256de833f3b4b1 |
| MD5 | df33b6cdd7390ce009e560adcdf0269b |
| BLAKE2b-256 | 6e738a37065419b3611e339fa9f9af61c210f63dac90b26ceb97eec5ec11e3c0 |
