Vision extension for nano-wait automation

Nano-Wait-Vision — Visual Execution Extension

nano-wait-vision is the official computer vision extension for nano-wait. It integrates visual awareness (OCR, icon detection, screen states) into the adaptive waiting engine, enabling deterministic, screen-driven automations.

[!IMPORTANT] Critical Dependency: This package DEPENDS on nano-wait. It does not replace nano-wait — it extends it.


🧠 What is Nano-Wait-Vision?

Nano-Wait-Vision is a deterministic vision engine for Python automation. Instead of waiting blindly with sleep(), it allows your code to wait for real visual conditions:

  • Text appearing on screen
  • Icons becoming visible
  • UI states changing

It is designed to work in strict cooperation with nano-wait:

| Component | Responsibility |
| --- | --- |
| ⏱️ nano-wait | When to check (adaptive pacing & CPU-aware waiting) |
| 👁️ nano-wait-vision | What to check (screen, OCR, icons) |

🧩 Key Features

nano-wait-vision extends nano-wait with:

  • 👁️ OCR (Optical Character Recognition): Read real text directly from the screen.
  • 🖼️ Icon Detection: Template matching via OpenCV.
  • 🧠 Explicit Visual States: Each operation returns a structured VisionState.
  • 📚 Persistent & Explainable Diagnostics: No black-box ML models.
  • 🖥️ Screen-Based Automation: Ideal for RPA and GUI testing.
  • ⚡ Selenium / Pytest Adapters: Immediate adoption in corporate or academic QA workflows.

[!TIP] All waiting logic is delegated to nano-wait.wait() — never time.sleep().
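
To make the icon-detection idea concrete, here is a minimal, dependency-free sketch of template matching: slide a small template over a larger grayscale image and keep the position with the lowest sum of squared differences. It is an illustration only. The library is described as using OpenCV for this, and the function name `find_template`, the `max_ssd` threshold, and the list-of-lists image format are assumptions made for this example.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixels, row-major (hypothetical format)


def find_template(
    screen: Image, template: Image, max_ssd: int = 0
) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the best match, or None if no window scores <= max_ssd."""
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    best_pos, best_ssd = None, None
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            # Sum of squared differences between the template and this window.
            ssd = sum(
                (screen[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            if best_ssd is None or ssd < best_ssd:
                best_pos, best_ssd = (x, y), ssd
    if best_ssd is not None and best_ssd <= max_ssd:
        return best_pos
    return None


screen = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
]
icon = [[9, 8], [7, 6]]
print(find_template(screen, icon))  # → (1, 1)
```

A production matcher would normally use a normalized score so that the threshold does not depend on image brightness or template size.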


🚀 Quick Start

Installation

pip install nano-wait
pip install nano-wait-vision

Simple Visual Observation

from nano_wait_vision import VisionMode

vision = VisionMode()
state = vision.observe()

print(f"Detected: {state.detected}")
print(f"Text: {state.text}")

Wait for Text to Appear

from nano_wait_vision import VisionMode

vision = VisionMode(verbose=True)

# Wait up to 10 seconds for the word "Welcome"
state = vision.wait_text("Welcome", timeout=10)

if state.detected:
    print("Text detected!")

Wait for an Icon

from nano_wait_vision import VisionMode

vision = VisionMode()

# Wait up to 10 seconds for an icon image
state = vision.wait_icon("ok.png", timeout=10)

if state.detected:
    print("Icon found on screen.")

⚠️ Installation & Dependencies

This library interacts directly with your operating system screen and OCR engine.

Python Dependencies (auto-installed)

  • opencv-python
  • pytesseract
  • pyautogui
  • numpy

🧠 Mandatory External Dependency — Tesseract OCR

OCR will not work unless Tesseract is installed and available in your PATH.

| OS | Command / Action |
| --- | --- |
| macOS | brew install tesseract |
| Ubuntu / Debian | sudo apt install tesseract-ocr |
| Windows | Download from the official Tesseract repo and add to PATH |

[!WARNING] If Tesseract is missing, OCR calls will silently fail or return empty text.
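
Because a missing Tesseract binary can go unnoticed, it may help to check for it before running any OCR-based wait. The sketch below uses only the standard library. The helper name `find_tesseract` is hypothetical and is not part of nano-wait-vision.

```python
import shutil
from typing import Optional


def find_tesseract(binary: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary)


path = find_tesseract()
if path is None:
    print("Tesseract not found on PATH; OCR results would be empty or missing.")
else:
    print(f"Tesseract found at {path}")
```

Running this once at startup, for example in a test session setup, turns a silent OCR failure into an explicit message.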


🧠 Mental Model — How It Works

Nano-Wait-Vision follows this loop: observe → evaluate → wait → observe.

Two engines cooperate:

| 👁️ Vision Engine | ⏱️ nano-wait |
| --- | --- |
| OCR / Icons | Adaptive timing |
| Screen capture | CPU-aware waits |
| Visual states | Smart pacing |

Vision never sleeps. All delays are handled by nano-wait.
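
The observe → evaluate → wait → observe loop can be sketched in a few lines. This is not the library's implementation: `wait_until` and its parameters are hypothetical, and the `pause` callable defaults to `time.sleep` purely as a stand-in for the pacing that nano-wait would provide.

```python
import time
from typing import Callable, Tuple


def wait_until(
    observe: Callable[[], str],
    evaluate: Callable[[str], bool],
    pause: Callable[[float], None] = time.sleep,  # stand-in for nano-wait's wait()
    timeout: float = 10.0,
    interval: float = 0.2,
) -> Tuple[bool, int, float]:
    """Poll until evaluate(observe()) is true; return (detected, attempts, elapsed)."""
    start = time.monotonic()
    attempts = 0
    while True:
        attempts += 1
        if evaluate(observe()):  # observe, then evaluate
            return True, attempts, time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return False, attempts, time.monotonic() - start
        pause(interval)  # wait, then observe again


# Usage with a fake screen that shows "Welcome" on the third capture.
frames = iter(["Loading", "Loading", "Welcome back"])
detected, attempts, _ = wait_until(
    observe=lambda: next(frames),
    evaluate=lambda text: "Welcome" in text,
    interval=0.01,
)
print(detected, attempts)  # → True 3
```

Keeping `pause` injectable mirrors the split described above: the vision side decides what counts as a match, while the timing side decides how long to wait between checks.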


📦 VisionState — Return Object

Every visual operation returns a VisionState object:

VisionState(
    name: str,
    detected: bool,
    confidence: float,
    attempts: int,
    elapsed: float,
    text: Optional[str],
    icon: Optional[str],
    diagnostics: dict
)

Always check detected before acting on the result.
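
One way to enforce that rule is a small guard that raises when a result is not usable. The dataclass below is a local stand-in that mirrors the fields listed above so the example runs on its own. The helper `require_detected`, the `min_confidence` parameter, and the assumption that confidence lies between 0 and 1 are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VisionState:  # local stand-in, not the library's class
    name: str
    detected: bool
    confidence: float
    attempts: int
    elapsed: float
    text: Optional[str] = None
    icon: Optional[str] = None
    diagnostics: dict = field(default_factory=dict)


def require_detected(state: VisionState, min_confidence: float = 0.0) -> VisionState:
    """Return the state unchanged if it is a usable detection, else raise."""
    if not state.detected:
        raise TimeoutError(
            f"{state.name}: not detected after {state.attempts} attempts"
        )
    if state.confidence < min_confidence:
        raise ValueError(
            f"{state.name}: confidence {state.confidence:.2f} "
            f"below {min_confidence:.2f}"
        )
    return state


ok = VisionState("wait_text", True, 0.93, 2, 0.41, text="Welcome")
print(require_detected(ok, min_confidence=0.8).text)  # → Welcome
```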


🧪 Diagnostics & Debugging

Nano-Wait-Vision supports verbose diagnostics:

vision = VisionMode(verbose=True)
state = vision.wait_text("Terminal")

Diagnostics include:

  • Attempts per phase
  • Confidence scores
  • Elapsed time
  • Reason for failure
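
For logging, these fields can be condensed into a single line. The sketch below works on any object with VisionState-like attributes. The function name `summarize` and the `"reason"` key inside `diagnostics` are assumptions, since the exact layout of the diagnostics dictionary is not documented here.

```python
from types import SimpleNamespace


def summarize(state) -> str:
    """Build a one-line summary from a VisionState-like object."""
    status = "detected" if state.detected else "not detected"
    line = (
        f"{state.name}: {status} "
        f"(attempts={state.attempts}, confidence={state.confidence:.2f}, "
        f"elapsed={state.elapsed:.2f}s)"
    )
    reason = state.diagnostics.get("reason")  # hypothetical key
    if not state.detected and reason:
        line += f" reason={reason}"
    return line


failed = SimpleNamespace(
    name="wait_text", detected=False, confidence=0.12,
    attempts=8, elapsed=10.02, diagnostics={"reason": "timeout"},
)
print(summarize(failed))
# → wait_text: not detected (attempts=8, confidence=0.12, elapsed=10.02s) reason=timeout
```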

A full macOS diagnostic test is provided in test_screen.py, generating debug screenshots for inspection.


🖥️ Platform Notes

macOS (Important)

  • Screen capture requires Screen Recording permission.
  • OCR requires RGB images; Nano-Wait-Vision handles this internally.
  • Fully tested on macOS Retina displays.

Windows & Linux

  • Works out of the box.
  • Ensure correct DPI scaling on Windows for accurate coordinate mapping.
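
On scaled displays (Retina, or Windows at 150% scaling), a screenshot can contain more pixels than the logical coordinate space used for mouse input, so a match position found in the image may need rescaling before it is clicked. The sketch below shows the arithmetic only. The helper `to_logical` is hypothetical, and whether nano-wait-vision already performs this mapping is not stated here.

```python
from typing import Tuple


def to_logical(
    point: Tuple[int, int],
    screenshot_size: Tuple[int, int],
    screen_size: Tuple[int, int],
) -> Tuple[int, int]:
    """Map a pixel position in a screenshot to logical screen coordinates."""
    scale_x = screenshot_size[0] / screen_size[0]
    scale_y = screenshot_size[1] / screen_size[1]
    return round(point[0] / scale_x), round(point[1] / scale_y)


# A 2880x1800 capture of a 1440x900 logical screen has a scale factor of 2.
print(to_logical((1000, 600), (2880, 1800), (1440, 900)))  # → (500, 300)
```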

🧪 Ideal Use Cases

Use Nano-Wait-Vision when dealing with:

  • RPA (Robotic Process Automation)
  • GUI automation and testing
  • OCR-driven workflows
  • Visual regression tests
  • Applications without APIs
  • Screen-based alternatives to Selenium

🧩 Design Philosophy

  • Deterministic: Predictable behavior based on visual truth.
  • Explainable: Clear diagnostics for every action.
  • No opaque ML: Uses reliable computer vision techniques.
  • System-aware: Respects system resources via nano-wait.
  • Debuggable by design: Built-in tools for troubleshooting.

🧪 Selenium / Pytest Integration

Selenium-style Visual Waits

from nano_wait_vision.selenium import VisionWait

wait = VisionWait(timeout=15)
wait.until_text("Dashboard")
wait.until_icon("ok.png")

Pytest Fixture

def test_homepage(vision):
    # wait_text returns a VisionState; assert on its detected flag
    state = vision.wait_text("Welcome")
    assert state.detected

The vision fixture is available at nano_wait_vision.pytest_fixture.vision.


📄 License

This project is licensed under the MIT License.
