A decoupled OCR helper library using PaddleOCR (via remote server) and SQLite caching.
Vibe-OCR
An intelligent, decoupled OCR helper library designed for automation tasks. It leverages a remote PaddleOCR server for text recognition and includes a robust local caching system using SQLite to optimize performance and reduce repeated network requests.
Installation
pip install vibe-ocr
Features
- Decoupled Architecture: Uses a remote OCR server (PaddleOCR) to offload heavy computation.
- Smart Caching: Local SQLite database caches OCR results for identical images/regions, significantly speeding up repeated checks.
- Snapshot Integration: Built-in support for taking screenshots (via airtest by default) or custom snapshot functions.
- Retry Logic: Automatic retry without cache if text is not found initially.
- Debug Friendly: Options to save debug images with detected regions.
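The retry behavior listed above can be pictured with a small sketch. This is not the library's internal code: `find_with_retry` and the `finder` callable are hypothetical names, and the only assumptions taken from this README are that a lookup accepts `use_cache` and returns a dict with a `found` key.

```python
from typing import Any, Callable, Dict, Optional

Result = Optional[Dict[str, Any]]


def find_with_retry(finder: Callable[..., Result], text: str) -> Result:
    """Try a cached lookup first, then retry once with the cache bypassed.

    `finder` is a hypothetical stand-in for a call such as
    ocr.capture_and_find_text; it must accept `use_cache` as a keyword.
    """
    result = finder(text, use_cache=True)
    if result and result.get("found"):
        return result
    # A stale cache entry may hide text that has since appeared on screen,
    # so the second attempt skips the cache.
    return finder(text, use_cache=False)
```

The second call returns whatever the uncached lookup produces, so callers still need to check `found` on the final result.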
Usage Guide
1. Initialization
Initialize the OCRHelper. You can customize the output directory for logs/cache and inject a custom snapshot function (useful for testing or non-Airtest environments).
from vibe_ocr import OCRHelper
# Basic initialization
ocr = OCRHelper(output_dir="output")
# Custom snapshot function (e.g., for testing or different frameworks)
def my_snapshot_func(filename):
    # logic to save screenshot to filename
    pass
ocr = OCRHelper(output_dir="output", snapshot_func=my_snapshot_func)
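For tests that run without a device, a snapshot function can simply write a placeholder image. The sketch below uses only the standard library to write a blank grayscale PNG. `fake_snapshot` is a hypothetical name, and it assumes the snapshot callable receives the target filename, as in the example above. Whether the library uses the return value is not stated here.

```python
import struct
import zlib
from pathlib import Path


def _png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    body = kind + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)


def fake_snapshot(filename: str, width: int = 4, height: int = 4) -> str:
    """Write a blank white 8-bit grayscale PNG to `filename` and return the path."""
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # then compression, filter, and interlace methods all set to 0.
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline starts with filter byte 0, followed by white pixels.
    rows = b"".join(b"\x00" + b"\xff" * width for _ in range(height))
    png = (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", header)
        + _png_chunk(b"IDAT", zlib.compress(rows))
        + _png_chunk(b"IEND", b"")
    )
    Path(filename).write_bytes(png)
    return filename
```

A function like this could be passed as `snapshot_func=fake_snapshot` so that the rest of the pipeline has a real file to read.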
2. Finding Text
The most common operation is to capture a screen and find specific text.
# Capture screen and find text "Login"
result = ocr.capture_and_find_text(
    "Login",
    confidence_threshold=0.7,
    occurrence=1,    # 1st occurrence
    use_cache=True,  # Use cache if screen hasn't changed
)
if result and result.get("found"):
    print(f"Found 'Login' at: {result['center']}")
    print(f"Bounding Box: {result['bbox']}")
else:
    print("Text not found.")
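Automation scripts often need to wait until text appears rather than check once. The helper below is a sketch of one way to poll for a result. `wait_for_text` is a hypothetical name and not part of vibe-ocr. It relies only on the result shape shown above (a dict with a `found` key, or a falsy value).

```python
import time
from typing import Any, Callable, Dict, Optional

Result = Optional[Dict[str, Any]]


def wait_for_text(
    finder: Callable[[str], Result],
    text: str,
    timeout: float = 10.0,
    interval: float = 0.5,
) -> Result:
    """Call `finder(text)` repeatedly until it reports a match or time runs out.

    Returns the first result whose "found" value is truthy, or None on timeout.
    The finder is always called at least once, even with timeout=0.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = finder(text)
        if result and result.get("found"):
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

In practice the `finder` argument would wrap a call such as `ocr.capture_and_find_text`, for example through a lambda that fixes the confidence threshold.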
3. Finding and Clicking
A convenience method that finds text and simulates a touch/click action on it (requires airtest or a compatible environment).
# Find "Confirm" and click it if found
clicked = ocr.find_and_click_text(
    "Confirm",
    confidence_threshold=0.6,
)
4. Advanced: Batch OCR & Regions
You can optimize performance by searching only within specific regions.
# Search only in the top-left region [x1, y1, x2, y2]
ocr.capture_and_find_text("Player Name", regions=[0, 0, 200, 100])
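Hard-coded pixel boxes break when the screen resolution changes. As a sketch, a region can be derived from screen-relative fractions instead. `region_from_fractions` is a hypothetical helper, and it assumes the `[x1, y1, x2, y2]` pixel format shown in the comment above.

```python
from typing import List


def region_from_fractions(
    screen_width: int,
    screen_height: int,
    left: float,
    top: float,
    right: float,
    bottom: float,
) -> List[int]:
    """Convert fractions of the screen (0.0 to 1.0) into [x1, y1, x2, y2] pixels."""
    if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
        raise ValueError(
            "fractions must satisfy 0 <= left < right <= 1 and 0 <= top < bottom <= 1"
        )
    return [
        int(screen_width * left),
        int(screen_height * top),
        int(screen_width * right),
        int(screen_height * bottom),
    ]


# Top-left quarter-height, half-width area of a 1920x1080 screen.
print(region_from_fractions(1920, 1080, 0.0, 0.0, 0.5, 0.25))  # [0, 0, 960, 270]
```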
Configuration
Environment Variables
- OCR_SERVER_URL: The URL of the PaddleOCR server. Defaults to http://localhost:8080/ocr.
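The lookup order described here (environment variable first, then the documented default) could be sketched as follows. `resolve_server_url` is a hypothetical helper, not a vibe-ocr function, and treating an empty value as unset is an assumption of this sketch.

```python
import os
from typing import Mapping, Optional

DEFAULT_OCR_SERVER_URL = "http://localhost:8080/ocr"


def resolve_server_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return OCR_SERVER_URL from `env`, falling back to the documented default.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    An empty string is treated the same as a missing variable.
    """
    env = os.environ if env is None else env
    return env.get("OCR_SERVER_URL") or DEFAULT_OCR_SERVER_URL
```

Setting the variable before starting the script, for example `OCR_SERVER_URL=http://10.0.0.5:8080/ocr python my_script.py`, points the client at a different server without code changes.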
Constructor Parameters
- output_dir: Directory to store the cache (SQLite database) and debug images.
- snapshot_func: Callable to take screenshots. Defaults to airtest.core.api.snapshot.
- delete_temp_screenshots: Whether to delete temporary screenshot files after processing (Default: True).
- resize_image: Resize large images before sending them to the OCR server to improve speed (Default: True).
Caching Mechanism
vibe-ocr calculates a perceptual hash (dhash) of the screenshot. If a similar image already exists in the SQLite cache, it returns the stored OCR result instead of calling the server. This matters most in high-frequency automation loops, where the same screen is checked repeatedly.
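To make the idea concrete, here is a minimal sketch of a difference hash plus a SQLite lookup by Hamming distance. It is an illustration under stated assumptions, not vibe-ocr's actual implementation: the `HashCache` class, the `ocr_cache` table, and the `max_distance` threshold are all hypothetical, and the input is assumed to be a grayscale pixel grid that has already been downscaled (typically 9 columns by 8 rows).

```python
import sqlite3
from typing import Optional, Sequence


def dhash(pixels: Sequence[Sequence[int]]) -> int:
    """Difference hash of an already downscaled grayscale grid.

    Each row contributes len(row) - 1 bits: 1 if a pixel is brighter than
    its right-hand neighbor, else 0. A 9x8 grid yields a 64-bit hash.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


class HashCache:
    """Hypothetical cache keyed by perceptual hash, stored as hex text."""

    def __init__(self, path: str = ":memory:", max_distance: int = 4) -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS ocr_cache (hash TEXT PRIMARY KEY, result TEXT)"
        )
        self.max_distance = max_distance

    def put(self, image_hash: int, result: str) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO ocr_cache VALUES (?, ?)",
            (format(image_hash, "x"), result),
        )
        self.db.commit()

    def get(self, image_hash: int) -> Optional[str]:
        # Linear scan: acceptable for a small cache, slow for a large one.
        for stored, result in self.db.execute("SELECT hash, result FROM ocr_cache"):
            if hamming(int(stored, 16), image_hash) <= self.max_distance:
                return result
        return None
```

Two screenshots that differ by a few pixels produce hashes a few bits apart, so a small `max_distance` lets near-identical frames share one cached result while visibly different screens still trigger a server call.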