Vision extension for nano-wait automation

👁️ Nano-Wait-Vision — Visual Execution Extension

nano-wait-vision is the official computer vision extension for nano-wait. It integrates visual awareness capabilities (such as OCR, icon detection, and screen states) into the adaptive waiting engine, enabling more robust and deterministic automations.

📦 Critical Dependency: This package DEPENDS on nano-wait. It does not replace nano-wait, but rather extends it.


🧭 Table of Contents

  1. What is Nano-Wait-Vision?
  2. Added Features
  3. Quick Start
  4. Installation & Dependencies (READ THIS)
  5. Mental Model and Operation
  6. VisionState — Visual Operation Return
  7. Ideal Use Cases
  8. Design Philosophy
  9. Relationship with nano-wait

🧠 What is Nano-Wait-Vision?

Nano-Wait-Vision is a deterministic vision engine designed for Python automation. Its main function is to allow scripts to wait for real visual conditions on the screen, instead of relying solely on time-based waits.

It was developed to operate in conjunction with nano-wait, establishing a clear division of responsibilities:

| Component | Function |
|---|---|
| ⏱️ nano-wait | When to check (manages the pace and adaptive waiting) |
| 👁️ nano-wait-vision | What to check (provides visual awareness) |

🧩 Added Features

nano-wait-vision extends the nano-wait waiting engine with the following visual capabilities:

  • 👁️ OCR (Optical Character Recognition): Reading text directly from the screen.
  • 🖼️ Icon Detection: Locating visual elements (template matching) with high precision.
  • 🧠 Explicit Visual States: Defining and waiting for specific graphical interface states.
  • 📚 Persistent Visual Memory: Uses efficient computer vision techniques, without the need for heavy Machine Learning models.
  • 🖥️ Screen-Based Automation: Ideal for RPA (Robotic Process Automation) tasks and GUI testing.

👉 All these features use the adaptive waiting engine of nano-wait as their foundation.


🚀 Quick Start

Installation

First, install the main package (nano-wait), and then the vision extension:

pip install nano-wait
pip install nano-wait-vision

Usage Examples

Simple Visual Observation

from nano_wait_vision import VisionMode

vision = VisionMode()
state = vision.observe()

print(state.detected, state.text)

Wait for Text to Appear on Screen

from nano_wait_vision import VisionMode

vision = VisionMode()
state = vision.wait_text("Welcome", timeout=10)

if state.detected:
    print("Text 'Welcome' detected on screen.")

Wait for Icon to Appear

from nano_wait_vision import VisionMode

vision = VisionMode()
# Assumes 'ok.png' is an image file of the icon to be searched
state = vision.wait_icon("ok.png", timeout=10)

if state.detected:
    print("Icon 'ok.png' detected on screen.")

⚠️ Installation & Dependencies (READ THIS)

Nano-Wait-Vision is not a lightweight library, as it depends on graphical automation and OCR at the operating system level.

Python Dependencies (via pip)

The following dependencies are automatically installed:

  • opencv-python
  • pytesseract
  • pyautogui
  • numpy

🧠 Mandatory External Dependency (OCR)

👉 Tesseract OCR must be installed and accessible in the operating system's PATH for the OCR functionality to work.

| Operating System | Installation Command |
|---|---|
| macOS | brew install tesseract |
| Ubuntu / Debian | sudo apt install tesseract-ocr |
| Windows | Download from the official Tesseract website and add to PATH. |

⚠️ Warning: Without Tesseract installed, any OCR functionality will fail immediately.
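
Because a missing Tesseract binary only surfaces when the first OCR call runs, it can help to check for it at startup. The sketch below uses only the standard library; `find_tesseract` is a hypothetical helper written for illustration and is not part of nano-wait-vision.

```python
import shutil
from typing import Optional


def find_tesseract(binary: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract executable, or None if it is not on PATH.

    Hypothetical helper for illustration; not part of nano-wait-vision.
    """
    return shutil.which(binary)


def ensure_tesseract() -> str:
    """Raise a clear error early instead of failing on the first OCR call."""
    path = find_tesseract()
    if path is None:
        raise RuntimeError(
            "Tesseract OCR was not found on PATH. Install it before using OCR features."
        )
    return path
```

Calling `ensure_tesseract()` once before creating any automation gives a readable error message on machines where the external dependency is missing.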


🧠 Mental Model — How nano-wait-vision Works

nano-wait-vision does not execute in isolation. It operates within the conceptual cycle of nano-wait:

observe → reason → wait → observe

It integrates two cooperative engines:

| 👁️ Vision Engine (What is happening?) | ⏱️ nano-wait (When to check?) |
|---|---|
| OCR (text) | CPU |
| Icons (template matching) | Memory |
| Explicit visual states | Smart Mode |
| | Adaptive pace |

👉 The vision engine never executes time.sleep() directly. It always delegates the pace and waiting to nano-wait.
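
The cycle described in this section can be sketched as a small loop in which the waiting step is injected rather than hard-coded. This is a conceptual illustration only: `wait_until` and its parameters are hypothetical names, and the actual internals of nano-wait-vision and nano-wait may differ.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    observe: Callable[[], T],
    is_done: Callable[[T], bool],
    wait: Callable[[], None],
    max_checks: int = 50,
) -> Optional[T]:
    """Repeat observe -> reason -> wait until the condition holds or checks run out.

    Hypothetical sketch: the loop never sleeps on its own; pacing is whatever
    the injected `wait` callable decides (in the real system, nano-wait).
    """
    for _ in range(max_checks):
        state = observe()        # observe: capture the current screen state
        if is_done(state):       # reason: has the visual condition been met?
            return state
        wait()                   # wait: delegated to the pacing engine
    return None


# Simulated screen contents stand in for real OCR output.
frames = iter(["Loading", "Loading", "Welcome"])
waits = []
result = wait_until(
    observe=lambda: next(frames),
    is_done=lambda text: "Welcome" in text,
    wait=lambda: waits.append(1),  # records each pause instead of sleeping
)
```

Keeping the pause behind a callable is what lets the same visual check run under different pacing strategies without changes to the vision code.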


VisionState — Visual Operation Return

All visual operations return a VisionState object with the observation information:

VisionState(
    name: str,
    detected: bool,
    confidence: float,
    text: Optional[str],
    icon: Optional[str]
)

👉 Always validate the detected field in critical automations to ensure the visual condition has been met.
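
One way to apply this advice is a small guard that turns an unmet condition into an exception. The sketch below defines a stand-in `VisionState` dataclass that mirrors the fields listed above so the example runs on its own; the stand-in class, `VisualConditionNotMet`, and `require_detected` are hypothetical and not part of the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VisionState:
    """Stand-in mirroring the documented fields; not the library's own class."""
    name: str
    detected: bool
    confidence: float
    text: Optional[str] = None
    icon: Optional[str] = None


class VisualConditionNotMet(RuntimeError):
    """Raised when a required visual condition was not detected (hypothetical)."""


def require_detected(state: VisionState) -> VisionState:
    """Return the state unchanged if detected, otherwise stop the automation."""
    if not state.detected:
        raise VisualConditionNotMet(
            f"Visual condition {state.name!r} was not detected"
        )
    return state
```

In a real script the same guard could wrap the result of a call such as `vision.wait_text("Welcome", timeout=10)`, so that later steps never run against a screen that is not in the expected state.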


🧪 Ideal Use Cases

Use nano-wait-vision if your project involves:

  • RPA (Robotic Process Automation)
  • GUI (Graphical User Interface) Automation
  • Visual Testing (ensuring elements appear correctly)
  • OCR-Driven Workflows
  • Systems without an API (where the only interface is the screen)
  • Screen-Based Alternatives to Selenium

If you only need smart waiting based on time and system resources, use nano-wait alone.


🧩 Design Philosophy

The project adheres to the following principles:

  • Deterministic: Predictable and consistent results.
  • No Opaque ML: Avoids complex and hard-to-debug Machine Learning models.
  • Reproducible: Automations must be repeatable across different environments.
  • Explainable: The vision state must be easy to understand and track.
  • Based on Real Screen State: Operates directly on what the user sees.
  • Integrated into System Context: Works in harmony with the adaptive waiting engine.

📌 Relationship with nano-wait

| Project | Description |
|---|---|
| nano-wait | Main product, adaptive waiting engine. |
| nano-wait-vision | Official vision extension, adds visual capabilities. |

Both are published separately on PyPI but are designed to work as a unified system.
