Perform OCR using Google's Drive API v3
Free software: GNU General Public License v3
Documentation: https://google-drive-ocr.readthedocs.io.
Features
Perform OCR using Google’s Drive API v3
Class GoogleOCRApplication() for use in projects
Highly configurable CLI
Run OCR on a single image file
Run OCR on multiple image files
Run OCR on all images in directory
Use multiple workers (multiprocessing)
Work on a PDF document directly
Usage
Using in a Project
Create a GoogleOCRApplication instance:
from google_drive_ocr import GoogleOCRApplication
app = GoogleOCRApplication('client_secret.json')
Perform OCR on a single image:
app.perform_ocr('image.png')
Perform OCR on multiple images:
app.perform_batch_ocr(['image_1.png', 'image_2.png', 'image_3.png'])
Perform OCR on multiple images using multiple workers (multiprocessing):
app.perform_batch_ocr(['image_1.png', 'image_3.png', 'image_2.png'], workers=2)
Using Command Line Interface
Typical usage with several options:
google-ocr --client-secret client_secret.json \
--upload-folder-id <google-drive-folder-id> \
--image-dir images/ --extension .jpg \
--workers 4 --no-keep
Show help message with the full set of options:
google-ocr --help
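When the CLI is driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of google_drive_ocr: the helper name build_command and its parameters are hypothetical, and it only uses the flags shown in the typical usage above.

```python
from typing import List, Optional


def build_command(
    client_secret: str,
    image_dir: str,
    extension: str,
    workers: int = 1,
    upload_folder_id: Optional[str] = None,
    keep: bool = True,
) -> List[str]:
    """Assemble a google-ocr argument list (hypothetical helper).

    Only flags that appear in the usage example are emitted.
    """
    cmd = ["google-ocr", "--client-secret", client_secret]
    if upload_folder_id:
        cmd += ["--upload-folder-id", upload_folder_id]
    cmd += ["--image-dir", image_dir, "--extension", extension]
    cmd += ["--workers", str(workers)]
    if not keep:
        # Mirrors the --no-keep flag from the usage example.
        cmd.append("--no-keep")
    return cmd


# Possible usage (requires google-ocr to be installed and on PATH):
#   import subprocess
#   subprocess.run(build_command("client_secret.json", "images/", ".jpg",
#                                workers=4, keep=False), check=True)
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.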
Configuration
The default location for configuration is ~/.gdo.cfg.
If a set of options is written to this location,
those options do not have to be specified again on subsequent runs.
Save configuration and exit:
google-ocr --client-secret client_secret.json --write-config ~/.gdo.cfg
Read configuration from a custom location (if it was written to a custom location):
google-ocr --config ~/.my_config_file ..
Performing OCR
Note: It is assumed that the client-secret option is saved in the configuration file.
Single image file:
google-ocr -i image.png
Multiple image files:
google-ocr -b image_1.png image_2.png image_3.png
All image files from a directory with a specific extension:
google-ocr --image-dir images/ --extension .png
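A similar directory scan can be done from Python before calling perform_batch_ocr. This is a minimal sketch under stated assumptions: list_images is a hypothetical helper, not part of google_drive_ocr, and matching the extension case-insensitively is a choice made here that may differ from how the CLI filters files.

```python
from pathlib import Path
from typing import List


def list_images(image_dir: str, extension: str = ".png") -> List[str]:
    """Return sorted paths of files in image_dir with the given extension.

    Hypothetical helper: accepts the extension with or without a leading
    dot and compares it case-insensitively. Subdirectories are not scanned.
    """
    suffix = extension if extension.startswith(".") else "." + extension
    return sorted(
        str(path)
        for path in Path(image_dir).iterdir()
        if path.is_file() and path.suffix.lower() == suffix.lower()
    )


# Possible usage with the class shown earlier in this document:
#   from google_drive_ocr import GoogleOCRApplication
#   app = GoogleOCRApplication('client_secret.json')
#   app.perform_batch_ocr(list_images('images/', '.png'))
```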
Multiple workers (multiprocessing):
google-ocr -b image_1.png image_2.png image_3.png --workers 2
PDF files:
google-ocr --pdf document.pdf --pages 1-3 5 7-10 13
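The --pages argument above mixes ranges and single page numbers. One way to read such a specification is sketched below. This is an illustration only: expand_pages is a hypothetical function, and it assumes that ranges are inclusive on both ends, which may not match how google-ocr parses them internally.

```python
from typing import List


def expand_pages(tokens: List[str]) -> List[int]:
    """Expand tokens such as ['1-3', '5'] into individual page numbers.

    Assumption: 'a-b' means every page from a to b, both included.
    """
    pages: List[int] = []
    for token in tokens:
        if "-" in token:
            start_text, end_text = token.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid page range: {token!r}")
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(token))
    return pages


print(expand_pages("1-3 5 7-10 13".split()))
# → [1, 2, 3, 5, 7, 8, 9, 10, 13]
```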
Note: You must set up a Google application and download the client_secret.json file before using google_drive_ocr.
Setup Instructions
Create a project on Google Cloud Platform
Wizard: https://console.developers.google.com/start/api?id=drive
Instructions: https://cloud.google.com/genomics/downloading-credentials-for-api-access
Select application type as “Installed Application”
Create credentials OAuth consent screen –> OAuth client ID
Save client_secret.json
History
0.2.0 (2021-06-29)
PDF file support
0.1.0 (2021-06-14)
First release on PyPI.