Japanese OCR

OwOCR

Command line client for several Japanese OCR providers derived from Manga OCR.

Installation

OwOCR has been tested with Python 3.11; newer or older versions might also work. Install it with pip install owocr

Supported providers

Local providers

  • Manga OCR: refer to the readme for installation ("m" key)
  • EasyOCR: refer to the readme for installation ("e" key)
  • RapidOCR: refer to the readme for installation ("r" key)
  • Apple Vision framework: works on macOS Ventura or later. In my experience, it is the best of the local providers for horizontal text ("a" key)
  • WinRT OCR: works on Windows 10 or later if winocr is installed (pip install winocr). It can also be used from another machine: install winocr on a Windows machine or virtual machine and run the server (winocr_serve), then install requests (pip install requests) and specify the IP address of the Windows VM/machine in the config file (see below) ("w" key)

Cloud providers

  • Google Lens: Google Vision in disguise (no need for API keys!). However, it needs to download a couple of megabytes of data for each request. You need to install pyjson5 and requests (pip install pyjson5 requests) ("l" key)
  • Google Vision: you need a service account .json file named google_vision.json in user directory/.config/ and you need to install google-cloud-vision (pip install google-cloud-vision) ("g" key)
  • Azure Image Analysis: you need to specify an API key and an endpoint in the config file (see below) and to install azure-ai-vision-imageanalysis (pip install azure-ai-vision-imageanalysis) ("v" key)
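
The provider keys and extra pip packages listed above can be collected into a small lookup table. The following is only an illustrative sketch: the PROVIDERS dict and the install_hint helper are hypothetical names and are not part of owocr. Providers whose installation is described in their own readme are listed with no extra packages.

```python
# Provider keys and extra pip packages, transcribed from the lists above.
# PROVIDERS and install_hint are illustrative names, not part of owocr.
PROVIDERS = {
    "m": ("Manga OCR", "local", []),  # see its readme for installation
    "e": ("EasyOCR", "local", []),    # see its readme for installation
    "r": ("RapidOCR", "local", []),   # see its readme for installation
    "a": ("Apple Vision framework", "local", []),
    "w": ("WinRT OCR", "local", ["winocr"]),  # plus requests for the remote server setup
    "l": ("Google Lens", "cloud", ["pyjson5", "requests"]),
    "g": ("Google Vision", "cloud", ["google-cloud-vision"]),
    "v": ("Azure Image Analysis", "cloud", ["azure-ai-vision-imageanalysis"]),
}


def install_hint(key: str) -> str:
    """Return a one-line install hint for the provider bound to `key`."""
    name, kind, packages = PROVIDERS[key]
    if not packages:
        return f"{name} ({kind}): no extra pip packages listed above"
    return f"{name} ({kind}): pip install {' '.join(packages)}"


if __name__ == "__main__":
    for key in PROVIDERS:
        print(f'"{key}" -> {install_hint(key)}')
```

Keeping the key, the provider name and the packages in one place makes it easy to see which keyboard key selects which provider and what has to be installed for it.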

Usage

It mostly functions like Manga OCR (see https://github.com/kha-white/manga-ocr?tab=readme-ov-file#running-in-the-background). However:

  • it supports reading images and/or writing text to a websocket when the -r=websocket and/or -w=websocket parameters are specified (port 7331 by default, configurable in the config file)
  • it supports capturing the screen directly with -r screencapture. It will default to the entire first screen every 3 seconds, but a different screen/coordinates/window/delay can be specified in the config file. Instead of using a delay it's also possible to specify a keyboard combo (refer to the config file or the help page)
  • you can pause/unpause the image processing by pressing "p" or terminate the script with "t" or "q" in the terminal window
  • you can switch OCR provider by pressing its corresponding keyboard key in the terminal window (refer to the list of keys above). You can also start the script paused with the -p option or with a specific provider with the -e option (refer to owocr -h for the list)
  • holding ctrl or cmd at any time will pause image processing temporarily, or you can specify keyboard combos in the config file to pause/unpause and switch the OCR provider (refer to the config file or the help page)
  • for systems where text can be copied to the clipboard at the same time as images, if *ocr_ignore* is copied with an image, the image will be ignored
  • optionally, notifications can be enabled in the config file to show the text with a native OS notification
  • optionally, you can speed up the online providers by installing fpng-py: pip install fpng-py (requires a developer environment on some operating systems/Python versions)
  • optionally, you can improve filtering of non-Japanese text for screen capture by installing transformers: pip install transformers
  • idle resource usage on macOS and Windows when reading from the clipboard has been eliminated using native OS polling
  • a config file (to be created in user directory/.config/owocr_config.ini; on Windows, the user directory is the C:\Users\yourusername folder) can be used to configure the script, for example to limit the loaded providers (reducing clutter and memory usage) and to specify provider settings such as API keys. A sample config file is provided here
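
As a rough sketch of where the config file lives and how an .ini file of this kind can be read, the snippet below resolves the path described above and parses a sample with Python's standard configparser. The section and option names in SAMPLE ([general], [azure], websocket_port, api_key, endpoint) are hypothetical placeholders chosen for illustration; the real names are defined by the sample config file that ships with the project.

```python
import configparser
from pathlib import Path


def config_path() -> Path:
    """Location given in the text: <user directory>/.config/owocr_config.ini."""
    return Path.home() / ".config" / "owocr_config.ini"


# Hypothetical contents: these section and option names are placeholders,
# not the names owocr actually uses.
SAMPLE = """
[general]
websocket_port = 7331

[azure]
api_key = YOUR_KEY
endpoint = https://example.invalid/
"""


def load_config(text=None):
    """Parse `text` if given, otherwise the file at config_path()."""
    parser = configparser.ConfigParser()
    if text is not None:
        parser.read_string(text)
    else:
        parser.read(config_path())  # a missing file is skipped without error
    return parser


if __name__ == "__main__":
    cfg = load_config(SAMPLE)
    print(config_path())
    print(cfg.getint("general", "websocket_port"))
```

Path.home() resolves to the C:\Users\yourusername folder on Windows, so the same code covers both cases mentioned above.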

Acknowledgments

This uses code from/references these projects:

Thanks to viola for working on the Google Lens implementation!
