A Python library for GUI automation with visual feedback

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent

VisionClick

A powerful Python library for GUI automation with image recognition and visual feedback capabilities.

Features

- Image Recognition: Find and interact with UI elements using screenshots
- Visual Feedback: Highlight elements before interaction for better visibility
- Mouse & Keyboard Control: Full control over mouse and keyboard actions
- Flexible API: Simple and intuitive method chaining
- Cross-Platform: Works on Windows, macOS, and Linux

🔧 Installation

Clone the repository:

```
git clone https://github.com/yourusername/VisionClick.git
cd VisionClick
```

Install dependencies:
```
pip install -r requirements.txt
```

🚀 Quick Start

from visionclick import VisionClick

# Initialize with visual feedback enabled
with VisionClick(highlight_enabled=True) as vc:
    # Click on an image with visual highlight
    vc.click("button.png")
    
    # Type some text
    vc.type("Hello, VisionClick!")

✨ Visual Highlighting

VisionClick can highlight UI elements before interacting with them, making automation more visible and debuggable.

Basic Usage

# Enable highlighting for all clicks
vc = VisionClick(highlight_enabled=True, highlight_duration=2.0, highlight_color="yellow")

# Or enable/disable it later
vc.enable_highlight(True)  # Enable
vc.enable_highlight(False)  # Disable

# Change highlight settings
vc.enable_highlight(
    enabled=True,
    duration=1.5,  # seconds
    color="#FF5733"  # any valid color
)

Highlight Modes

- Image Highlighting: When clicking on an image, VisionClick draws a rectangle around the matched region
- Coordinate Highlighting: When clicking on coordinates, VisionClick shows a small marker at the target position
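
The two modes can be pictured as two ways of choosing the rectangle to draw. The following is only a sketch of that idea, not VisionClick's internal code: `highlight_rect` and `marker_radius` are hypothetical names introduced here for illustration.

```python
from typing import Tuple, Union

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (left, top, width, height)


def highlight_rect(target: Union[Point, Box], marker_radius: int = 10) -> Box:
    """Return the rectangle to draw for a target (hypothetical helper).

    A 4-tuple is treated as a matched image region and returned unchanged;
    a 2-tuple is treated as a click point and wrapped in a small square marker.
    """
    if len(target) == 4:
        return tuple(target)
    if len(target) == 2:
        x, y = target
        side = 2 * marker_radius
        return (x - marker_radius, y - marker_radius, side, side)
    raise ValueError("target must be (x, y) or (left, top, width, height)")


print(highlight_rect((40, 60, 120, 30)))  # image match: rectangle around the region
print(highlight_rect((200, 150)))         # coordinates: small marker around the point
```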

📚 Core Features

Mouse Control

# Click at coordinates or on an image
vc.click((x, y))  # Coordinates
vc.click("button.png")  # Image

# Right click
vc.click("button.png", button="right")

# Double click
vc.double_click("icon.png")

# Drag and drop
vc.drag("item.png", (x, y))

Keyboard Control

# Type text
vc.type("Hello, World!")

# Press keys
vc.press("enter")
vc.hotkey("ctrl", "c")  # Copy

Image Recognition

# Check if image exists on screen
if vc.exists("button.png"):
    print("Button found!")

# Wait for image to appear
vc.wait_image("loading.png", timeout=10)

# Get image position (center coordinates of the match)
position = vc.locate("icon.png")
if position:
    x, y = position
    print(f"Found at: {x}, {y}")

🛠 Advanced Usage

Conditional Actions

# Click only if image exists
vc.if_exists("popup.png", lambda: vc.click("close.png"))

# Repeat until image appears
vc.repeat_until(
    "last_page.png",                     # condition image to wait for
    lambda: vc.click("next_page.png"),   # action repeated until it appears
    max_attempts=5
)

Debugging

# Enable debug logging
vc.enable_logs(True, "DEBUG")

# Take a screenshot (filename passed by keyword, since region is the first parameter)
vc.screenshot(filename="debug.png")

# Get the last screenshot
last_screenshot = vc.get_last_screenshot()

📝 Examples

Basic Automation

from visionclick import VisionClick

def test_automation():
    with VisionClick(highlight_enabled=True) as vc:
        # Open application (example: Notepad)
        vc.press("win")
        vc.type("notepad")
        vc.press("enter")
        
        # Type some text
        vc.type("Hello from VisionClick!")
        
        # Save the file
        vc.hotkey("ctrl", "s")
        vc.type("test.txt")
        vc.press("enter")

Web Automation

from visionclick import VisionClick
import time

def test_web():
    with VisionClick(highlight_duration=1.5) as vc:
        # Open browser and navigate to a website
        vc.press("win")
        vc.type("chrome")
        vc.press("enter")
        time.sleep(2)  # Wait for browser to open
        
        # Search in Google
        vc.type("VisionClick GitHub")
        vc.press("enter")
        
        # Click on first result (simplified example)
        vc.wait_image("search_result.png", timeout=10)
        vc.click("search_result.png")

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Requirements

- Python 3.6+
- OpenCV
- PyAutoGUI
- Pynput
- Pillow
- NumPy
- Colorama (for colored output)

Basic Usage

from visionclick import VisionClick

# Initialize with context manager (recommended)
with VisionClick() as vc:
    vc.enable_logs(True, "INFO")  # Enable logging
    
    # Your automation code here
    vc.click((100, 100))
    vc.type("Hello, World!")

Core Functionality

Mouse Control

`click(target, button="left", clicks=1)`

Click at specified coordinates or on an image.

# Click at coordinates (x=100, y=200)
vc.click((100, 200))

# Right click on an image
vc.click("button.png", button="right")

# Double click
vc.click("icon.png", clicks=2)

`double_click(target, button="left")`

Double click at specified coordinates or on an image.

vc.double_click("file_icon.png")

`long_click(target, duration=1.0)`

Click and hold for a specified duration.

# Click and hold for 2 seconds
vc.long_click("slider_handle.png", duration=2.0)

`drag(start, end, duration=1.0, button="left")`

Drag from start to end position.

# Drag from (100,200) to (300,400) over 1 second
vc.drag((100, 200), (300, 400))

# Drag from one image to another
vc.drag("file.png", "trash_icon.png")
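
A drag that takes `duration` seconds is usually carried out as a series of small mouse moves between the two endpoints. The sketch below shows one plausible way to compute those intermediate points with linear interpolation. `drag_path` and `steps` are hypothetical names, and this is not necessarily how VisionClick moves the pointer.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def drag_path(start: Point, end: Point, steps: int = 4) -> List[Point]:
    """Return `steps + 1` evenly spaced points from start to end, inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(steps + 1)
    ]


path = drag_path((100, 200), (300, 400), steps=4)
print(path)
```

Sleeping for `duration / steps` between consecutive points would spread the movement over the requested time.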

Keyboard Control

`type(text, interval=0.1)`

Type text with specified interval between keystrokes.

vc.type("Hello, World!")
vc.type("password123", interval=0.05)  # Faster typing

`press(key, presses=1, interval=0.1)`

Press a key one or more times.

# Press Enter
vc.press("enter")

# Press Tab 3 times quickly
vc.press("tab", presses=3, interval=0.05)

Image Recognition

`exists(image, confidence=0.8, log_not_found=True)`

Check if an image exists on screen.

if vc.exists("login_button.png"):
    vc.click("login_button.png")
else:
    print("Login button not found!")
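
To build intuition for the `confidence` parameter, here is a toy matcher in pure Python. It is a simplified stand-in and an assumption on our part: the library lists OpenCV as a requirement and presumably relies on it for matching, whereas this sketch scores a candidate position by the fraction of identical pixels. The names `match_score` and `find_template` are hypothetical.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # tiny grayscale "image" as rows of pixel values


def match_score(screen: Grid, template: Grid, top: int, left: int) -> float:
    """Fraction of template pixels equal to the screen pixels at (top, left)."""
    total = len(template) * len(template[0])
    same = sum(
        1
        for r, row in enumerate(template)
        for c, value in enumerate(row)
        if screen[top + r][left + c] == value
    )
    return same / total


def find_template(screen: Grid, template: Grid,
                  confidence: float = 0.8) -> Optional[Tuple[int, int]]:
    """Return (left, top) of the best match scoring >= confidence, else None."""
    best_score, best_pos = 0.0, None
    for top in range(len(screen) - len(template) + 1):
        for left in range(len(screen[0]) - len(template[0]) + 1):
            score = match_score(screen, template, top, left)
            if score > best_score:
                best_score, best_pos = score, (left, top)
    return best_pos if best_score >= confidence else None


screen = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 9, 0],
]
template = [[1, 2], [3, 4]]  # differs from the screen in one of four pixels

print(find_template(screen, template, confidence=0.8))  # best score 0.75 is rejected
print(find_template(screen, template, confidence=0.7))  # best score 0.75 is accepted
```

Lowering the threshold accepts imperfect matches, at the cost of more false positives.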

`locate(image, confidence=0.8)`

Find the center coordinates of an image on screen.

position = vc.locate("icon.png")
if position:
    x, y = position
    print(f"Found icon at {x}, {y}")
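
Image matchers typically report a bounding box, and the center is derived from it. A minimal sketch of that arithmetic, assuming a `(left, top, width, height)` box; `box_center` is a hypothetical helper, not part of VisionClick:

```python
from typing import Tuple


def box_center(left: int, top: int, width: int, height: int) -> Tuple[int, int]:
    """Return the integer center point of a (left, top, width, height) box."""
    return (left + width // 2, top + height // 2)


print(box_center(100, 200, 50, 20))
```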

`wait_image(image, timeout=5.0, interval=0.5, confidence=0.8)`

Wait for an image to appear on screen.

# Wait for loading to complete (max 10 seconds)
if vc.wait_image("loading_complete.png", timeout=10):
    print("Loading complete!")
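
The `timeout` and `interval` parameters suggest a polling loop. The sketch below shows that pattern in isolation, with a plain callable standing in for the image check. `wait_until` is a hypothetical name, and the real method may differ in detail.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.5) -> bool:
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "is the image on screen?": becomes True on the third check.
checks = iter([False, False, True])
print(wait_until(lambda: next(checks), timeout=1.0, interval=0.01))

# A condition that never holds makes the call return False after the timeout.
print(wait_until(lambda: False, timeout=0.05, interval=0.01))
```

Waiting for an image to vanish is the same loop with the predicate negated.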

`wait_image_vanish(image, timeout=5.0, interval=0.5, confidence=0.8)`

Wait for an image to disappear from the screen.

# Wait for loading spinner to disappear
if vc.wait_image_vanish("loading_spinner.png"):
    print("Loading finished!")

Conditional Actions

`if_exists(image, action, *args, confidence=0.8, **kwargs)`

Execute an action if the specified image exists.

def click_ok():
    vc.click("ok_button.png")

vc.if_exists("error_message.png", click_ok)
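
The `*args` and `**kwargs` in the signature suggest that extra arguments are forwarded to the action. A minimal sketch of that forwarding pattern, using a callable condition instead of an image so it runs without a screen; `run_if` and the stand-in `click` are hypothetical:

```python
from typing import Any, Callable, List, Tuple


def run_if(condition: Callable[[], bool], action: Callable[..., Any],
           *args: Any, **kwargs: Any) -> Any:
    """Call action(*args, **kwargs) only when condition() is true; else return None."""
    if condition():
        return action(*args, **kwargs)
    return None


clicked: List[Tuple[str, str]] = []


def click(target: str, button: str = "left") -> str:
    """Stand-in for a click; records the call instead of moving the mouse."""
    clicked.append((target, button))
    return target


run_if(lambda: True, click, "ok_button.png", button="right")   # condition holds: runs
run_if(lambda: False, click, "cancel_button.png")              # condition fails: skipped
print(clicked)
```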

`repeat_until(condition_image, action, max_attempts=10, interval=1.0, confidence=0.8, *args, **kwargs)`

Repeat an action until a condition is met.

import time

def click_next():
    vc.click("next_button.png")
    time.sleep(1)  # Wait for page to load

# Keep clicking next until we see the last page
vc.repeat_until("last_page.png", click_next, max_attempts=5)
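
The control flow behind a repeat-until helper can be sketched without any GUI. This is an illustration under assumptions, not the library's code: the condition is a callable rather than an image, and whether the real method checks the condition before the first action is not stated in this README.

```python
import time
from typing import Callable


def repeat_until(condition: Callable[[], bool], action: Callable[[], None],
                 max_attempts: int = 10, interval: float = 1.0) -> bool:
    """Run action until condition() holds; give up after max_attempts actions."""
    for _ in range(max_attempts):
        if condition():
            return True
        action()
        time.sleep(interval)
    return condition()  # final check after the last action


page = {"number": 1}


def next_page() -> None:
    """Stand-in for clicking a 'next' button."""
    page["number"] += 1


reached = repeat_until(lambda: page["number"] >= 3, next_page,
                       max_attempts=5, interval=0)
print(reached, page["number"])
```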

Utility Methods

`move_to(target, duration=0.5)`

Move mouse to coordinates or image.

vc.move_to("menu_item.png")
vc.move_to((500, 300), duration=1.0)  # Smooth movement over 1 second

`scroll(clicks, direction="down")`

Scroll the mouse wheel.

# Scroll down 5 clicks
vc.scroll(5, "down")

# Scroll up 3 clicks
vc.scroll(3, "up")
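
PyAutoGUI, listed in the requirements, uses a signed amount for scrolling: positive values scroll up and negative values scroll down. A wrapper with a `direction` argument could translate to that convention roughly as follows. `signed_scroll` is a hypothetical helper, and this README does not say how VisionClick does it.

```python
def signed_scroll(clicks: int, direction: str = "down") -> int:
    """Convert (clicks, direction) to a signed amount: up is positive, down is negative."""
    if clicks < 0:
        raise ValueError("clicks must be non-negative")
    if direction == "up":
        return clicks
    if direction == "down":
        return -clicks
    raise ValueError("direction must be 'up' or 'down'")


print(signed_scroll(5, "down"))
print(signed_scroll(3, "up"))
```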

`screenshot(region=None, filename=None, log=True)`

Take a screenshot.

# Full screen screenshot (filename passed by keyword, since region is the first parameter)
vc.screenshot(filename="fullscreen.png")

# Region screenshot (left, top, width, height)
vc.screenshot(region=(100, 100, 300, 200), filename="region.png")
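
A region given as `(left, top, width, height)` often has to be converted to a `(left, top, right, bottom)` box, which is the form Pillow's `Image.crop` expects, and clamped to the screen. A minimal sketch, assuming a known screen size; `region_to_box` is a hypothetical helper:

```python
from typing import Tuple

Box = Tuple[int, int, int, int]


def region_to_box(region: Box, screen_size: Tuple[int, int]) -> Box:
    """Convert (left, top, width, height) to (left, top, right, bottom), clamped to the screen."""
    left, top, width, height = region
    screen_w, screen_h = screen_size
    right = min(left + width, screen_w)
    bottom = min(top + height, screen_h)
    left, top = max(left, 0), max(top, 0)
    if right <= left or bottom <= top:
        raise ValueError("region lies outside the screen")
    return (left, top, right, bottom)


print(region_to_box((100, 100, 300, 200), (1920, 1080)))
print(region_to_box((1800, 1000, 300, 200), (1920, 1080)))  # clipped at the screen edge
```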

Advanced Features

Debugging

# Enable debug logging
vc.enable_logs(True, "DEBUG")

# Take a screenshot of the current state (filename passed by keyword)
vc.screenshot(filename="debug_state.png")

# Get the last taken screenshot path
print(f"Last screenshot: {vc._last_screenshot}")

Error Handling

try:
    vc.click("unreliable_button.png")
except Exception as e:
    print(f"Error: {e}")
    # Screenshot is automatically taken on error when debug is enabled
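
For flaky targets, catching the error once is often not enough, and a small retry wrapper can help. The sketch below is generic Python, not a VisionClick feature. `retry` and `flaky_click` are hypothetical, and the exception type raised by the library is not documented here, so the example catches `Exception` by default.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3, delay: float = 0.5,
          exceptions: Tuple[Type[BaseException], ...] = (Exception,)) -> T:
    """Call action up to `attempts` times, re-raising the last error if all fail."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: propagate the last error
            time.sleep(delay)
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky_click() -> str:
    """Stand-in for a click that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("image not found")
    return "clicked"


print(retry(flaky_click, attempts=5, delay=0))
```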

Best Practices

Use a context manager to ensure proper cleanup:

with VisionClick() as vc:
    pass  # Your code here

Enable logging during development:
```
vc.enable_logs(True, "DEBUG")
```
Use relative paths for image files:
```
vc.click("images/button.png")
```

Handle timeouts appropriately:

if not vc.wait_image("loading.png", timeout=10):
    print("Operation timed out!")

Troubleshooting

Common Issues

Image not found
- Check the image path is correct
- Ensure the image is visible on screen
- Try adjusting the confidence level: `vc.exists("image.png", confidence=0.7)`
Slow performance
- Reduce screenshot resolution if possible
- Increase intervals between actions
- Use smaller image templates
Incorrect clicks
- Add small delays between actions
- Verify screen resolution matches your coordinates
- Check for multiple matches of the same image
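
When coordinates were recorded on a screen with a different resolution, they can be rescaled before clicking. A minimal sketch of that conversion; `scale_point` is a hypothetical helper, and it assumes the UI scales proportionally with the resolution, which is not always the case:

```python
from typing import Tuple

Size = Tuple[int, int]


def scale_point(point: Tuple[int, int], recorded_size: Size,
                current_size: Size) -> Tuple[int, int]:
    """Map a point recorded at one screen size to the equivalent point at another."""
    x, y = point
    recorded_w, recorded_h = recorded_size
    current_w, current_h = current_size
    return (round(x * current_w / recorded_w), round(y * current_h / recorded_h))


print(scale_point((960, 540), (1920, 1080), (2560, 1440)))
```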

License

MIT License - Feel free to use and modify as needed.


Version 0.1.0, released Sep 23, 2025

