
A decoupled OCR helper library using PaddleOCR (via remote server) and SQLite caching.


Vibe-OCR

An intelligent, decoupled OCR helper library designed for automation tasks. It leverages a remote PaddleOCR server for text recognition and includes a robust local caching system using SQLite to optimize performance and reduce repeated network requests.

Installation

pip install vibe-ocr

Features

  • Decoupled Architecture: Uses a remote OCR server (PaddleOCR) to offload heavy computation.
  • Smart Caching: Local SQLite database caches OCR results for identical images/regions, significantly speeding up repeated checks.
  • Snapshot Integration: Built-in support for taking screenshots (via Airtest by default) or through a custom snapshot function.
  • Retry Logic: If the text is not found on the first attempt, the search is retried automatically with the cache bypassed.
  • Debug Friendly: Options to save debug images with detected regions.

Usage Guide

1. Initialization

Initialize the OCRHelper. You can customize the output directory for logs/cache and inject a custom snapshot function (useful for testing or non-Airtest environments).

from vibe_ocr import OCRHelper

# Basic initialization
ocr = OCRHelper(output_dir="output")

# Custom snapshot function (e.g., for testing or different frameworks)
def my_snapshot_func(filename):
    # logic to save screenshot to filename
    pass

ocr = OCRHelper(output_dir="output", snapshot_func=my_snapshot_func)
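For tests, one possible snapshot function simply copies a known fixture image to the requested path. This is a minimal sketch: it assumes the callable receives the destination filename and is expected to leave an image file there, as the placeholder comment above suggests. The name make_fixture_snapshot and the fixture path are hypothetical and not part of vibe-ocr.

```python
import shutil
from pathlib import Path


def make_fixture_snapshot(fixture_path):
    """Return a snapshot function that copies a fixed image to the target path.

    Assumption: the snapshot callable is invoked with the destination
    filename and must leave an image file at that location.
    """
    fixture = Path(fixture_path)

    def snapshot(filename):
        target = Path(filename)
        # Create the output directory if it does not exist yet.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(fixture, target)
        return str(target)

    return snapshot
```

It could then be passed in as snapshot_func=make_fixture_snapshot("tests/login_screen.png"), so that every capture sees the same image.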

2. Finding Text

The most common operation is to capture a screen and find specific text.

# Capture screen and find text "Login"
result = ocr.capture_and_find_text(
    "Login",
    confidence_threshold=0.7,
    occurrence=1,   # 1st occurrence
    use_cache=True  # Use cache if screen hasn't changed
)

if result and result.get("found"):
    print(f"Found 'Login' at: {result['center']}")
    print(f"Bounding Box: {result['bbox']}")
else:
    print("Text not found.")
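Automation scripts often need to wait until some text appears rather than check once. The helper below is a hedged sketch of such a polling loop. It relies only on the result shape shown above (a dict with a "found" entry, or None). The name wait_for_text, the timeout, and the interval are illustrative assumptions, not part of the library.

```python
import time


def wait_for_text(find, timeout=10.0, interval=0.5):
    """Call find() repeatedly until it reports a match or the timeout expires.

    find is any zero-argument callable returning a result dict (or None),
    for example: lambda: ocr.capture_and_find_text("Login").
    Returns the first result whose "found" entry is truthy, else None.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result and result.get("found"):
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

Passing the lookup as a callable keeps the loop independent of OCRHelper, so it can be exercised with a stub in tests.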

3. Finding and Clicking

A convenience method that finds text and simulates a touch/click on it (requires Airtest or a compatible environment).

# Find "Confirm" and click it if found
clicked = ocr.find_and_click_text(
    "Confirm",
    confidence_threshold=0.6
)

4. Advanced: Batch OCR & Regions

You can optimize performance by searching only within specific regions.

# Search only in the top-left region [x1, y1, x2, y2]
ocr.capture_and_find_text("Player Name", regions=[0, 0, 200, 100])
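Hard-coded pixel boxes break when the screen resolution changes. One way around this is to derive the box from fractions of the screen size, as sketched below. The [x1, y1, x2, y2] layout is taken from the comment above. The helper name region_from_fractions is hypothetical, and the caller must supply the screen dimensions.

```python
def region_from_fractions(width, height, left, top, right, bottom):
    """Convert fractional bounds (0.0 to 1.0) into a pixel box [x1, y1, x2, y2]."""
    for value in (left, top, right, bottom):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0.0 and 1.0")
    if left >= right or top >= bottom:
        raise ValueError("region must have positive width and height")
    return [
        int(width * left),
        int(height * top),
        int(width * right),
        int(height * bottom),
    ]
```

For a 1920x1080 screen, region_from_fractions(1920, 1080, 0.0, 0.0, 0.5, 0.25) gives the top-left box [0, 0, 960, 270].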

Configuration

Environment Variables

  • OCR_SERVER_URL: The URL of the PaddleOCR server. Defaults to http://localhost:8080/ocr.
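The lookup described above can be pictured as an environment read with a fallback. The sketch below illustrates that pattern only and is not vibe-ocr's actual code. The function name resolve_ocr_server_url is hypothetical. The default URL is the one documented above.

```python
import os

# Default taken from the documentation above.
DEFAULT_OCR_SERVER_URL = "http://localhost:8080/ocr"


def resolve_ocr_server_url(environ=None):
    """Return the configured OCR server URL, or the documented default."""
    env = os.environ if environ is None else environ
    # Treat an empty or whitespace-only value as "not set".
    url = env.get("OCR_SERVER_URL", "").strip()
    return url or DEFAULT_OCR_SERVER_URL
```

This page does not say when the library reads the variable, so the cautious choice is to set OCR_SERVER_URL in the shell or in os.environ before importing vibe_ocr.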

Constructor Parameters

  • output_dir: Directory in which the cache (SQLite database) and debug images are stored.
  • snapshot_func: Callable to take screenshots. Defaults to airtest.core.api.snapshot.
  • delete_temp_screenshots: Whether to delete temporary screenshot files after processing (Default: True).
  • resize_image: Resize large images before sending to OCR server to improve speed (Default: True).

Caching Mechanism

vibe-ocr computes a perceptual hash (dHash) of each screenshot. If a sufficiently similar image already exists in the SQLite cache, the stored OCR result is returned locally instead of calling the server. This matters most in automation scripts that check the screen in high-frequency loops, where the screen often has not changed between checks.
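To make the idea concrete, here is a minimal, dependency-free sketch of a difference hash and a similarity lookup. It is not the library's implementation. It assumes the screenshot has already been reduced to a small grayscale grid (a real pipeline would do that with an imaging library). A plain dict stands in for the SQLite table, and the distance threshold of 4 is an arbitrary illustrative value. All function names are hypothetical.

```python
def dhash_bits(gray_rows):
    """Compute a difference hash from a grid of grayscale values.

    gray_rows is a list of rows, each holding one more value than the
    number of hash bits per row (for a 64-bit hash: 8 rows of 9 values).
    Each bit records whether a pixel is brighter than its right neighbour.
    """
    value = 0
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            value = (value << 1) | (1 if left > right else 0)
    return value


def hamming_distance(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def find_similar(cache, image_hash, max_distance=4):
    """Return the cached value with the nearest hash, if within max_distance."""
    best = None
    for cached_hash, value in cache.items():
        distance = hamming_distance(cached_hash, image_hash)
        if distance <= max_distance and (best is None or distance < best[0]):
            best = (distance, value)
    return None if best is None else best[1]
```

Because small visual changes flip only a few bits, comparing hashes by Hamming distance lets near-identical screenshots share one cached result, while a clearly different screen falls outside the threshold and triggers a fresh OCR request.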
