Perform OCR using Google's Drive API v3
Google OCR (Drive API v3)
Free software: GNU General Public License v3
Documentation: https://google-drive-ocr.readthedocs.io.
Usage
To use google_drive_ocr in a project:
from google_drive_ocr.application import GoogleOCRApplication
app = GoogleOCRApplication('client_secret.json')
# Single image
app.perform_ocr('image.png')
# Multiple images
app.perform_batch_ocr(['image_1.png', 'image_2.png', 'image_3.png'])
# Multiple images using multiprocessing
app.perform_batch_ocr(['image_1.png', 'image_3.png', 'image_2.png'], workers=2)
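The batch call above takes an explicit list of file names. As a minimal sketch, the list could be built from a directory, similar in spirit to the CLI's --image-dir and --extension options. The helper names collect_images and ocr_directory are hypothetical and not part of google_drive_ocr; only GoogleOCRApplication and perform_batch_ocr come from the package.

```python
from pathlib import Path


def collect_images(image_dir, extension=".png"):
    """Return sorted paths (as strings) of files in image_dir with the given extension."""
    suffix = extension.lower()
    return sorted(
        str(path)
        for path in Path(image_dir).iterdir()
        if path.is_file() and path.suffix.lower() == suffix
    )


def ocr_directory(image_dir, client_secret="client_secret.json",
                  extension=".png", workers=2):
    """Hypothetical wrapper: run batch OCR on every matching file in image_dir."""
    # Imported lazily so collect_images() stays usable without the package installed.
    from google_drive_ocr.application import GoogleOCRApplication

    images = collect_images(image_dir, extension)
    if not images:
        return images  # nothing to upload
    app = GoogleOCRApplication(client_secret)
    app.perform_batch_ocr(images, workers=workers)
    return images
```

Calling ocr_directory('images/') would then process every .png file in that folder with two workers. Sorting the paths keeps the processing order predictable between runs.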
To use google_drive_ocr from command line:
google-ocr --client-secret client_secret.json \
    --upload-folder-id <google-drive-folder-id> \
    --image-dir images/ --extension .jpg \
    --workers 4 --no-keep

# Save configuration and exit
# If configuration is written to ~/.gdo.cfg, we don't have to specify those
# options again on the subsequent runs
google-ocr --client-secret client_secret.json --write-config ~/.gdo.cfg

# Read configuration from a custom location (if it was written to a custom location)
google-ocr --config ~/.my_config_file ..

# Examples (assuming client-secret is saved in configuration file)

# Single image
google-ocr -i image.png

# Multiple images
google-ocr -b image_1.png image_2.png image_3.png

# All files from a directory
google-ocr --image-dir images/ --extension .png

# Multiple images using multiprocessing
google-ocr -b image_1.png image_2.png image_3.png --workers 2

# PDF files
google-ocr --pdf document.pdf --pages 1-3 5 7-10 13

# For more detailed Usage
google-ocr --help
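The --pages option in the PDF example mixes single pages and ranges. The sketch below shows one way such a specification could expand into individual page numbers. It is an illustration only, not the package's own parser: expand_pages is a hypothetical name, and treating ranges as inclusive and pages as 1-based is an assumption.

```python
def expand_pages(specs):
    """Expand items such as "1-3" or "5" into a sorted list of unique page numbers."""
    pages = set()
    for spec in specs:
        # "7-10" -> ("7", "-", "10"); "5" -> ("5", "", "")
        start, sep, end = spec.partition("-")
        first = int(start)
        last = int(end) if sep else first
        if first < 1 or last < first:
            raise ValueError(f"invalid page range: {spec!r}")
        # Assumption: ranges include both endpoints.
        pages.update(range(first, last + 1))
    return sorted(pages)
```

Under these assumptions, expand_pages("1-3 5 7-10 13".split()) yields pages 1, 2, 3, 5, 7, 8, 9, 10 and 13.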
Note:
You must set up a Google application and download the client_secret.json file before using google_drive_ocr.
Setup Instructions
Create a project on Google Cloud Platform
Wizard: https://console.developers.google.com/start/api?id=drive
Instructions: https://cloud.google.com/genomics/downloading-credentials-for-api-access
Select application type as “Installed Application”
Create credentials: OAuth consent screen -> OAuth client ID
Save client_secret.json
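Before the first run, it can help to confirm that the saved file really is an installed-application credential. The following is a minimal sketch, assuming the downloaded JSON has a top-level "installed" object containing "client_id" and "client_secret", which is how Google's OAuth client files for installed apps are typically laid out. check_client_secret is a hypothetical helper, not part of google_drive_ocr.

```python
import json


def check_client_secret(path="client_secret.json"):
    """Return the OAuth client ID if the file looks like an installed-app secret."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumption: installed-application credentials live under the "installed" key.
    section = data.get("installed")
    if section is None:
        raise ValueError(
            "no 'installed' section; was the client created as an installed application?"
        )
    missing = [key for key in ("client_id", "client_secret") if key not in section]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    return section["client_id"]
```

A ValueError here usually points to a credential of a different application type, in which case the OAuth client ID can be recreated with the type named in the steps above.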
Features
Perform OCR using Google’s Drive API v3
Single, Batch and Parallel OCR
Work on a PDF document directly
Highly configurable CLI
GoogleOCRApplication class usable in a project
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.2.0 (2021-06-29)
PDF file support
0.1.0 (2021-06-14)
First release on PyPI.