OCR-powered screen-capture tool to capture information instead of images
normcap
OCR-powered screen-capture tool to capture information instead of images.
Links: Releases | Changelog | Roadmap | Repo
Content: Introduction | Installation | Usage | Contribute | Credits
Introduction
Basic usage:
- Launch normcap
- Select region on screen
- Retrieve recognized text in clipboard
Installation
On Linux
1. Install dependencies:
# on Ubuntu/Debian:
sudo apt-get install tesseract-ocr xclip python3-tk python3-pil.imagetk
# on Arch:
sudo pacman -S tesseract tesseract-data-eng xclip tk python-pillow
# on Fedora:
sudo dnf install tesseract xclip python3-tkinter
2. Install normcap:
# on Ubuntu/Debian:
pip3 install normcap
# on Arch:
pip install normcap
(OR download and extract binary package from the latest release)
3. Execute ./normcap
On Windows (recommended method)
1. Download and extract binary package from the latest release (no installation required)
2. Execute normcap.exe
On Windows (alternative method)
1. Install "Tesseract", e.g. by using the installer provided by UB Mannheim
2. Set the environment variable TESSDATA_PREFIX to Tesseract's data folder, e.g.:
setx TESSDATA_PREFIX "C:\Program Files\Tesseract-OCR\tessdata"
3. Install tesserocr, e.g. by using the Windows specific wheel:
pip install https://github.com/simonflueckiger/tesserocr-windows_build/releases/download/tesserocr-v2.4.0-tesseract-4.0.0/tesserocr-2.4.0-cp37-cp37m-win_amd64.whl
4. Run
pip install normcap
5. Execute normcap
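If normcap cannot find its language data after these steps, a quick check of the environment may help. The following is a minimal sketch and not part of normcap: the helper name `missing_languages` is hypothetical, and it assumes only that Tesseract stores each language as `<lang>.traineddata` inside the folder that `TESSDATA_PREFIX` points to.

```python
import os
from pathlib import Path


def missing_languages(tessdata_dir, lang_spec="eng"):
    """Return the languages from lang_spec (e.g. "eng+deu") that have
    no <lang>.traineddata file in tessdata_dir.

    Hypothetical helper for illustration; not part of normcap.
    """
    folder = Path(tessdata_dir)
    return [
        lang
        for lang in lang_spec.split("+")
        if lang and not (folder / f"{lang}.traineddata").is_file()
    ]


if __name__ == "__main__":
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix is None:
        print("TESSDATA_PREFIX is not set")
    else:
        print("Missing language files:", missing_languages(prefix, "eng+deu"))
```

An empty list means every requested language file was found in the folder.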
On Mac
Attention! On Mac not everything works. Help needed!
1. Install dependencies:
brew install tesseract tesseract-lang
2. Install normcap:
pip install normcap
(OR download and extract binary package from the latest release)
3. Execute normcap.app
Usage
General
- After launching normcap, press <esc> to abort and quit.
- Before letting go of the mouse button, press the <space> key to switch mode, as indicated by a symbol:
  - ☰ (raw): Copy detected text line by line, without further modification
  - ☶ (parse): Try to auto-detect the type of text using magics and format the text accordingly, then copy
- To download additional languages for Mac and Linux, check the official repository of your distribution for tesseract language packages. Package names might vary.
- The Windows release of normcap supports English and German out of the box. If you need additional languages, download the appropriate files from the tesseract repo and place them into the /normcap/tessdata/ folder.
- normcap is intended to be executed on demand via keybinding or desktop shortcut. Therefore it doesn't occupy resources by running in the background, but its startup is a bit slower.
- By default normcap is "stateless": it copies recognized text to the system's clipboard, but doesn't save images or text to disk. However, you can use the --path switch to store the images in any folder.
Command line options
normcap has no settings, just a set of command line arguments:
(normcap)dynobo@cioran:~$ normcap --help
usage: normcap [-h] [-v] [-m MODE] [-l LANG] [-c COLOR] [-p PATH]

OCR-powered screen-capture tool to capture information instead of images.

optional arguments:
  -h, --help               show this help message and exit
  -v, --verbose            print debug information to console (default: False)
  -m MODE, --mode MODE     startup mode [raw,parse] (default: parse)
  -l LANG, --lang LANG     languages for ocr, e.g. eng+deu (default: eng)
  -c COLOR, --color COLOR  set primary color for UI (default: #FF0000)
  -p PATH, --path PATH     set a path for storing images (default: None)
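A command line interface like the one above could be declared with Python's standard `argparse` module. The sketch below is an assumption about how such a parser might look, not normcap's actual source; the function name `build_parser` is hypothetical, while the flags and defaults are copied from the help output.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="normcap",
        description="OCR-powered screen-capture tool to capture "
        "information instead of images.",
        # Appends "(default: ...)" to every help text
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print debug information to console")
    parser.add_argument("-m", "--mode", default="parse", metavar="MODE",
                        choices=["raw", "parse"],
                        help="startup mode [raw,parse]")
    parser.add_argument("-l", "--lang", default="eng",
                        help="languages for ocr, e.g. eng+deu")
    parser.add_argument("-c", "--color", default="#FF0000",
                        help="set primary color for UI")
    parser.add_argument("-p", "--path", default=None,
                        help="set a path for storing images")
    return parser


if __name__ == "__main__":
    # Example: start in raw mode with English and German
    args = build_parser().parse_args(["-m", "raw", "-l", "eng+deu"])
    print(args)
```

Keeping every setting as a command line argument fits the stated goal of having no configuration file: a keybinding or desktop shortcut can carry the full configuration in its command.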
Magics
"Magics" are like addons providing automated functionality to intelligently detect and format the captured input.
First, every "magic" calculates a "score" that estimates how likely it is to be responsible for this type of text. Second, the "magic" that achieved the highest "score" takes the necessary actions to "transform" the input text according to its type.
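The two-step procedure can be sketched as follows. This is a simplified illustration under assumptions, not normcap's implementation: the class names, the `score` and `transform` method names, and the score values are all hypothetical.

```python
class SingleLineMagic:
    """Hypothetical magic: responsible when the text is a single line."""

    def score(self, text):
        return 1.0 if len(text.strip().splitlines()) == 1 else 0.0

    def transform(self, text):
        # Collapse runs of whitespace and trim both ends
        return " ".join(text.split())


class MultiLineMagic:
    """Hypothetical magic: responsible when the text spans several lines."""

    def score(self, text):
        return 1.0 if len(text.strip().splitlines()) > 1 else 0.0

    def transform(self, text):
        # Trim each line and keep the lines separated by line breaks
        return "\n".join(line.strip() for line in text.strip().splitlines())


def apply_best_magic(text, magics):
    """Step 1: every magic scores the text. Step 2: the winner transforms it."""
    best = max(magics, key=lambda magic: magic.score(text))
    return best.transform(text)
```

Because each magic only exposes a score and a transform, a new text type can be supported by adding one more object to the list without touching the selection logic.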
Currently implemented Magics:
Magic | Score | Transform |
---|---|---|
Single line | Only single line is detected | Trim unnecessary whitespace |
Multi line | Multiple lines, but single paragraph | Separate lines by line break and trim each line |
Paragraph | Multiple blocks of lines or multiple paragraphs | Join every paragraph into single line, separate different paragraphs by empty line |
E-Mail | Number of chars in email addresses vs. overall chars | Transform to comma-separated list of email addresses |
URL | Number of chars in URLs vs. overall chars | Transform to line-break separated URLs |
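The ratio-based scores in the table can be illustrated for email addresses. The sketch below is an assumption about how such a score might be computed, not normcap's code: the regular expression is deliberately simple, the function names are hypothetical, and ignoring whitespace when counting "overall chars" is a choice made here for illustration.

```python
import re

# Deliberately simple pattern for illustration; robust email matching is more involved
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def email_score(text):
    """Share of non-whitespace characters that belong to email addresses (0.0 to 1.0)."""
    total = len("".join(text.split()))
    if total == 0:
        return 0.0
    matched = sum(len(match) for match in EMAIL_PATTERN.findall(text))
    return matched / total


def email_transform(text):
    """Return the detected addresses as a comma-separated list."""
    return ", ".join(EMAIL_PATTERN.findall(text))
```

A capture consisting only of addresses scores close to 1.0, while ordinary prose that merely mentions one address scores much lower, so another magic would win.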
Why "normcap"?
See XKCD:
Contribute
Setup Environment
Prerequisites are Python, Tesseract (incl. language data) and on Linux also XClip.
# Clone repository
git clone https://github.com/dynobo/normcap.git
# Change into project directory
cd normcap
# Install pipenv (if not already installed)
pip install pipenv
# Install project incl. development dependencies
pipenv install --dev
# Register pre-commit hook
pipenv run pre-commit install -t pre-commit
# Run normcap in pipenv environment
pipenv run python -m normcap
Design Principles
- Multi-Platform
  Should work on Linux, Mac & Windows.
- Don't run as service
  As normcap is (hopefully) not used too often, it shouldn't consume resources in the background, even if it leads to a slower start-up time.
- No network connection
  Everything should run locally without any network communication.
- Avoid text in UI
  This just avoids translations ;-) And I think it is feasible in such a simple application.
- Avoid configuration file or settings UI
  Focus on simplicity and core functionality.
- Dependencies
  The fewer dependencies, the better. Of course I have to compromise, but I'm always open to suggestions on how to further reduce dependencies.
- Chain of Responsibility as main design pattern
  See description on refactoring.guru
- Multi-Monitors
  Supports setups with two or more displays.
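As an illustration of the Chain of Responsibility pattern named above, the sketch below passes a request through a series of handlers, each doing its part before delegating to the next. All class names and the dictionary-based request are hypothetical and do not reflect normcap's actual handlers; the OCR and clipboard steps are replaced by stand-ins so the example runs without Tesseract.

```python
class Handler:
    """Base handler: process the request, then pass it to the successor."""

    def __init__(self, successor=None):
        self.successor = successor

    def handle(self, request):
        request = self.process(request)
        if self.successor is not None:
            return self.successor.handle(request)
        return request

    def process(self, request):
        return request


class FakeOcrHandler(Handler):
    def process(self, request):
        # Stand-in for OCR: the "image" here is already a string
        request["text"] = request["image"]
        return request


class TrimHandler(Handler):
    def process(self, request):
        request["text"] = request["text"].strip()
        return request


class FakeClipboardHandler(Handler):
    def process(self, request):
        # Stand-in for copying the text to the system clipboard
        request["clipboard"] = request["text"]
        return request


# Build the chain: OCR -> trim -> clipboard
chain = FakeOcrHandler(TrimHandler(FakeClipboardHandler()))
result = chain.handle({"image": "  some text  "})
```

Each step knows only its successor, so steps can be reordered, removed, or added without changing the others.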
Credits
This project uses the following non-standard libraries:
- mss - taking screenshots
- pillow - manipulating images
- tesserocr - wrapper for tesseract's API
- pyperclip - accessing clipboard
- pyinstaller - packaging for platforms
It also depends on external software:
- tesseract - OCR engine
Thanks to the maintainers of those nice libraries!