A Python package for detecting and interacting with screen elements using computer vision and OCR.

These details have not been verified by PyPI

Project links

Homepage

Intended Audience
- Developers
Natural Language
- English
Programming Language
- Python :: 3
- Python :: 3.9

Project description

Screenwise Framework
====================

A Python framework for screen element detection and interaction using computer vision and machine learning.

Overview
--------

Screenwise provides automated detection of, and interaction with, UI elements through:

* Screenshot capture and analysis
* ML-based element detection
* Coordinate-based interaction
* OCR capabilities
* Debug and capture modes
* Cross-platform support

Installation
------------

.. code-block:: bash

    pip install t-screenwise

Basic Usage
-----------

Initialize Framework
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    from t_screenwise.screenwise import Framework

    # Initialize with default settings
    framework = Framework()

    # Initialize with custom settings
    framework = Framework(
        mode="CAPTURE",
        model_path="path/to/model.pth",
        labels="path/to/labels.json",
        device="cpu"
    )

Detect Elements
~~~~~~~~~~~~~~~

.. code-block:: python

    # Get all detected elements
    elements = framework.get()

    # Filter for specific element types
    buttons = framework.get(filter=["button"])
    text = framework.get(filter=["text"])

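Interact with Elements
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Pick an element to work with (assumes at least one button was detected)
    element = framework.get(filter=["button"])[0]

    # Click the element
    element.click()

    # Click at a specific position within the element
    element.click(coords="up_right")

    # Type text
    element.send_keys("Hello World")

    # Click, then type
    element.click_and_send_keys("Hello World")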

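Process OCR Elements
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # OCRElement must also be imported from the package; its module path is not shown here
    framework = Framework()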
    results = framework.get(image="path/to/image.png", process_ocr=True)

    # Work with both types of elements
    for element in results:
        if isinstance(element, OCRElement):
            print(f"OCR Text: {element.text} (Confidence: {element.confidence})")
        else:
            print(f"Box Label: {element.label}")

OCR Elements
~~~~~~~~~~~~
* Text content extraction
* Confidence scoring
* Spatial relationship analysis
* Text-based element search

OCR Spatial Analysis
~~~~~~~~~~~~~~~~~~~~

The ``OCRElement`` class provides spatial analysis through the ``get_nearest_boxes`` method:

.. code-block:: python

    # Get OCR elements from an image
    ocr_elements = framework.get(image="screenshot.png", process_ocr=True)

    # Choose the element to analyze; this assumes the first result is an OCRElement
    ocr_element = ocr_elements[0]

    # For a specific OCR element, find the nearest elements in all directions
    nearest = ocr_element.get_nearest_boxes(ocr_elements, n=1)

    # Access nearest elements by direction
    right_element = nearest["right"][0]  # Nearest element to the right
    left_element = nearest["left"][0]    # Nearest element to the left
    above_element = nearest["above"][0]  # Nearest element above
    below_element = nearest["below"][0]  # Nearest element below

Features:

* Finds the ``n`` nearest elements in each direction (right, left, above, below)
* Considers spatial overlap when determining the nearest elements
* Returns elements sorted by distance
* Helps in understanding the layout and the relationships between text elements
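
The behaviour listed above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the ``Box`` dataclass, the ``nearest_boxes`` function, and the overlap rule (a candidate must share some vertical extent to count as left or right, and some horizontal extent to count as above or below) are assumptions made purely for illustration.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class Box:
        """Hypothetical axis-aligned box: (x1, y1) top-left, (x2, y2) bottom-right."""
        x1: float
        y1: float
        x2: float
        y2: float


    def _overlap(a_min, a_max, b_min, b_max):
        """Return True if the 1-D intervals (a_min, a_max) and (b_min, b_max) intersect."""
        return min(a_max, b_max) > max(a_min, b_min)


    def nearest_boxes(target, others, n=1):
        """Return up to n nearest boxes per direction, sorted by edge-to-edge gap."""
        found = {"right": [], "left": [], "above": [], "below": []}
        for box in others:
            if box is target:
                continue  # never report the target as its own neighbour
            rows_overlap = _overlap(target.y1, target.y2, box.y1, box.y2)
            cols_overlap = _overlap(target.x1, target.x2, box.x1, box.x2)
            if rows_overlap and box.x1 >= target.x2:
                found["right"].append((box.x1 - target.x2, box))
            elif rows_overlap and box.x2 <= target.x1:
                found["left"].append((target.x1 - box.x2, box))
            elif cols_overlap and box.y2 <= target.y1:
                found["above"].append((target.y1 - box.y2, box))
            elif cols_overlap and box.y1 >= target.y2:
                found["below"].append((box.y1 - target.y2, box))
        return {
            direction: [box for _, box in sorted(pairs, key=lambda p: p[0])[:n]]
            for direction, pairs in found.items()
        }


    label = Box(10, 10, 60, 30)
    field = Box(80, 12, 200, 28)        # same row, to the right
    far_field = Box(300, 10, 400, 30)   # same row, further to the right
    caption = Box(10, 50, 60, 70)       # same column, below

    result = nearest_boxes(label, [label, field, far_field, caption], n=1)

In this sketch a direction with no match yields an empty list, so indexing with ``[0]`` should be guarded when a neighbour may be missing.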

Features
--------

Screen Elements
~~~~~~~~~~~~~~~

* Coordinate-based positioning
* Margin calculations
* Drawing capabilities
* Mouse and keyboard interaction
* Debug visualization
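
Coordinate-based positioning with a margin, as listed above, might look like the following sketch. Only the name ``up_right`` appears in the usage examples; the function ``anchor_point``, the other position names, and the way the margin is applied are hypothetical and are not taken from the package.

.. code-block:: python

    def anchor_point(x1, y1, x2, y2, position="center", margin=0):
        """Return the (x, y) pixel for a named position inside a bounding box.

        Screen coordinates are assumed: x grows rightwards, y grows downwards.
        The margin moves corner positions inwards so clicks stay inside the box.
        """
        left, right = x1 + margin, x2 - margin
        top, bottom = y1 + margin, y2 - margin
        points = {
            "center": ((x1 + x2) // 2, (y1 + y2) // 2),
            "up_left": (left, top),
            "up_right": (right, top),
            "down_left": (left, bottom),
            "down_right": (right, bottom),
        }
        if position not in points:
            raise ValueError(f"unknown position: {position!r}")
        return points[position]


    center = anchor_point(100, 200, 300, 260)                        # (200, 230)
    corner = anchor_point(100, 200, 300, 260, "up_right", margin=5)  # (295, 205)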

Operating Modes
~~~~~~~~~~~~~~~

* CAPTURE: Live interaction with screen elements
* DEBUG: Visualization and testing without actual interaction

Configuration
-------------

Labels
~~~~~~
Labels are defined in a JSON file that maps element types to numeric IDs. Further element types follow the same pattern:

.. code-block:: json

    {
        "button": 1,
        "text": 2,
        "input": 3
    }
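
A labels mapping of this shape can be read with the standard ``json`` module and inverted, so that numeric class IDs can be turned back into names. This is a minimal sketch using only the standard library; the variable names are illustrative, and how Screenwise itself consumes the file is not shown here.

.. code-block:: python

    import json

    # Same shape as the labels file above, inlined so the example is self-contained.
    labels_json = '{"button": 1, "text": 2, "input": 3}'

    name_to_id = json.loads(labels_json)

    # Detection models usually emit numeric class IDs, so build the reverse lookup.
    id_to_name = {class_id: name for name, class_id in name_to_id.items()}

    # Two names sharing one ID would silently collapse in the reverse lookup.
    if len(id_to_name) != len(name_to_id):
        raise ValueError("labels file maps two names to the same ID")

    first_label = id_to_name[1]  # "button"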

Model
~~~~~
Screenwise supports custom-trained object detection models:

* Default model trained for common UI elements
* Configurable confidence thresholds
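
A confidence threshold typically discards detections whose score falls below a cut-off. The sketch below illustrates the idea on plain dictionaries. The ``label``/``score`` keys, the ``filter_by_confidence`` helper, and the default of ``0.5`` are assumptions for illustration, not the package's API or defaults.

.. code-block:: python

    def filter_by_confidence(detections, threshold=0.5):
        """Keep only the detections whose score is at least the threshold."""
        return [d for d in detections if d["score"] >= threshold]


    # Hypothetical raw detections, as a model might produce them.
    detections = [
        {"label": "button", "score": 0.92},
        {"label": "text", "score": 0.41},
        {"label": "input", "score": 0.77},
    ]

    kept = filter_by_confidence(detections, threshold=0.5)
    kept_labels = [d["label"] for d in kept]  # ["button", "input"]

Raising the threshold trades missed elements for fewer false positives. With ``threshold=0.8`` only the button would remain in this example.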

Contributing
------------

1. Clone the repository
2. Create a feature branch
3. Commit changes
4. Push to branch
5. Create Pull Request

Release history

* 1.0.3 (this version): Jan 15, 2025
* 1.0.2: Jan 14, 2025
* 1.0.1: Jan 13, 2025
* 1.0.0: Jan 13, 2025

Download files

Source Distribution

t_screenwise-1.0.3.tar.gz (16.4 kB)

Uploaded Jan 15, 2025 Source

File details

Details for the file t_screenwise-1.0.3.tar.gz.

File metadata

Download URL: t_screenwise-1.0.3.tar.gz
Upload date: Jan 15, 2025
Size: 16.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.9.0

File hashes

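Hashes for t_screenwise-1.0.3.tar.gz:

===========  ====================================================================
Algorithm    Hash digest
===========  ====================================================================
SHA256       ``94347fb85673ab7a02990c2c4a91c7092da196793daf4cf0bab18c18f9cdd321``
MD5          ``247249d4010c0de4e5764a23c4cb80a9``
BLAKE2b-256  ``c3192bd507486dc28fde70fa4d3cb6d69f092c3f6e68ccb11d9d072ac3c468e9``
===========  ====================================================================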

t-screenwise 1.0.3
