Like Selenium, but for your entire screen. Find and interact with text anywhere using OCR.

🎯 Screenium

Screenium is like Selenium after discovering there's a whole world outside the browser. Find and click any text on your screen using OCR. That "Accept Cookies" popup for the millionth time? Banish it. That stubborn system dialog that won't go away? Click it. That game that needs grinding? Automate it.

🔍 Use Cases

  • Auto-Accept Cookies: Run a background task that automatically clicks "Accept Cookies" popups
  • Meeting Assistant: Auto-join video calls by detecting "Join Meeting" buttons, and enable or disable your camera and mic based on the meeting name
  • Game Automation: Automate repetitive game tasks by detecting and clicking on in-game text (e.g., "Collect Rewards", "Start Battle")
  • System Dialog Monitor: Handle system or application dialogs, such as "Are you sure?" or "Run command" in Cursor, according to rules you define
  • Multi-Browser Testing: Run UI tests on browsers that lack automation APIs or web drivers; no separate web driver installation is needed

🔍 Finding Text

The most basic way to find text is:

from screenium import Text

# Simple text matching
text = Text("Login")  # Finds "Login" anywhere on screen
text = Text("LOG")    # Matches "LOGIN", "LOGOUT", etc.

# Exact matching
text = Text("Login", exact_match=True)  # Only matches "Login" exactly
text = Text("login", case_sensitive=True)  # Case-sensitive matching
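The matching rules above can be pictured with a small stand-alone sketch. It does not use Screenium and is not its implementation: `text_matches` is a hypothetical helper, and the defaults (substring matching, case-insensitive unless `case_sensitive=True`) are assumptions inferred from the comments in the example.

```python
def text_matches(candidate: str, query: str, *, exact_match: bool = False,
                 case_sensitive: bool = False) -> bool:
    """Return True if OCR text `candidate` satisfies `query` (assumed semantics)."""
    if not case_sensitive:
        # Assumed default: ignore case unless the caller asks otherwise.
        candidate, query = candidate.casefold(), query.casefold()
    if exact_match:
        return candidate == query
    # Assumed default: a query matches any text that contains it.
    return query in candidate


print(text_matches("LOGOUT", "LOG"))
print(text_matches("Login page", "Login", exact_match=True))
print(text_matches("Login", "login", case_sensitive=True))
```

If your matches look too broad, combining `exact_match` and `case_sensitive` is the first thing to try.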

📍 Spatial Relationships

You can find text based on its position relative to other text:

# Basic relationships
Text("Password").below("Username")  # Find "Password" below "Username"
Text("Cancel").left_of("Submit")    # Find "Cancel" left of "Submit"
Text("Help").right_of("Back")       # Find "Help" right of "Back"
Text("Title").above("Content")      # Find "Title" above "Content"

# Chain relationships
Text("Save").below("Options").right_of("Cancel")
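To make the geometry concrete, here is a rough sketch of how "below" and "right of" could be evaluated on bounding boxes. It is an illustration only, not Screenium's code: the helper names are hypothetical, and it assumes boxes use `x`, `y`, `width`, `height` with the origin at the top-left corner and `y` growing downward.

```python
from typing import Dict, List

Box = Dict[str, float]  # keys: x, y, width, height (top-left origin assumed)


def is_below(box: Box, anchor: Box) -> bool:
    """True if `box` starts at or under the bottom edge of `anchor`."""
    return box["y"] >= anchor["y"] + anchor["height"]


def is_right_of(box: Box, anchor: Box) -> bool:
    """True if `box` starts at or past the right edge of `anchor`."""
    return box["x"] >= anchor["x"] + anchor["width"]


def filter_chain(candidates: List[Box], below: Box, right_of: Box) -> List[Box]:
    """Keep candidates that satisfy both constraints, as in a chained query."""
    return [c for c in candidates if is_below(c, below) and is_right_of(c, right_of)]


# Hypothetical OCR results for "Options", "Cancel", and two "Save" labels.
options = {"x": 100, "y": 50, "width": 80, "height": 20}
cancel = {"x": 100, "y": 200, "width": 60, "height": 20}
saves = [
    {"x": 200, "y": 200, "width": 50, "height": 20},  # under Options, right of Cancel
    {"x": 20, "y": 10, "width": 50, "height": 20},    # toolbar "Save", satisfies neither
]
print(filter_chain(saves, below=options, right_of=cancel))
```

Each chained relationship narrows the candidate set further, which is why chaining helps when the same label appears more than once on screen.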

📐 Aligned Elements

For more precise positioning, use aligned to require elements to line up:

# Elements must be vertically aligned (same x-coordinate)
Text("Email").aligned.below("Username")

# Elements must be horizontally aligned (same y-coordinate)
Text("Back").aligned.left_of("Next")
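One way to think about alignment is as a tolerance check on box centers. The sketch below is an assumption for illustration: Screenium may compare edges instead of centers, and the 10-pixel `tolerance` is a made-up default.

```python
def center(box):
    """Center point of a box with x, y, width, height keys."""
    return (box["x"] + box["width"] / 2, box["y"] + box["height"] / 2)


def vertically_aligned(a, b, tolerance=10.0):
    """Stacked in one column: centers share (nearly) the same x-coordinate."""
    return abs(center(a)[0] - center(b)[0]) <= tolerance


def horizontally_aligned(a, b, tolerance=10.0):
    """Side by side in one row: centers share (nearly) the same y-coordinate."""
    return abs(center(a)[1] - center(b)[1]) <= tolerance


# Hypothetical boxes: a form label, the label under it, and a sidebar label.
username = {"x": 100, "y": 100, "width": 80, "height": 20}
email = {"x": 104, "y": 150, "width": 70, "height": 20}
sidebar_email = {"x": 400, "y": 150, "width": 70, "height": 20}

print(vertically_aligned(email, username))
print(vertically_aligned(sidebar_email, username))
print(horizontally_aligned(email, sidebar_email))
```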

🎨 Background Color Matching

Find text by its background color:

# Match text with specific background colors
Text("Error", background_color="red")
Text("Success", background_color="green")
Text("Info", background_color="#0088ff")

# Adjust color matching tolerance (0-100)
Text("Warning", background_color="yellow", background_tolerance=50)
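How the 0-100 tolerance maps to a color difference is not documented here. As a hedged illustration, the sketch below treats it as a percentage of the largest possible RGB distance. The `NAMED` palette, `parse_color`, and `color_matches` are all hypothetical and stand apart from Screenium.

```python
import math

# Hypothetical palette; the names Screenium accepts may differ.
NAMED = {"red": (255, 0, 0), "green": (0, 128, 0), "yellow": (255, 255, 0)}

# Distance between black and white, the largest possible in RGB space.
MAX_DISTANCE = math.sqrt(3 * 255 ** 2)


def parse_color(value):
    """Turn '#rrggbb' or a known color name into an (r, g, b) tuple."""
    if value.startswith("#"):
        digits = value[1:]
        return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return NAMED[value.lower()]


def color_matches(sampled, target, tolerance=20):
    """True if `sampled` lies within `tolerance` percent of the maximum distance."""
    distance = math.dist(sampled, parse_color(target))
    return distance <= MAX_DISTANCE * tolerance / 100


print(parse_color("#0088ff"))
print(color_matches((255, 200, 0), "yellow", tolerance=5))
print(color_matches((255, 200, 0), "yellow", tolerance=50))
```

Under this model, raising the tolerance widens the set of shades that count as a match, which fits the advice to loosen it when anti-aliasing or gradients cause misses.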

🖱️ Mouse Actions

Interact with matched text:

# Basic mouse actions
Text("Submit").click()
Text("Options").right_click()
Text("Link").double_click()
Text("Button").mouse_move()  # Just move mouse without clicking

# Type text
Text("Username").click().typewrite("myuser")
Text("Password").click().typewrite("mypass")

# Special keys
Text("Terminal").click().typewrite(["command", "k"])  # Clear terminal

⏰ Waiting and Monitoring

Handle timing and background monitoring:

# Wait for a specific duration
Text("Loading").wait(2).click()  # Wait 2 seconds then click
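The example above shows a fixed delay. For monitoring, a common alternative is to poll until the text shows up or a timeout expires. The sketch below is generic Python, not a Screenium API: `wait_until` and `fake_find` are hypothetical, and in a real script the `find` callable would wrap whatever lookup you use.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(find: Callable[[], Optional[T]], timeout: float = 10.0,
               interval: float = 0.5) -> Optional[T]:
    """Call `find` repeatedly until it returns something truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in finder that only succeeds on its third call.
attempts = {"count": 0}


def fake_find():
    attempts["count"] += 1
    return {"x": 10, "y": 20} if attempts["count"] >= 3 else None


print(wait_until(fake_find, timeout=2.0, interval=0.01))
```

Using `time.monotonic()` keeps the timeout correct even if the system clock changes while the loop runs.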

🔍 Inspecting Matches

Get detailed information about matches:

# Get all matches
matches = Text("Button").matches
for match in matches:
    print(f"Found at: ({match['x']}, {match['y']})")
    print(f"Size: {match['width']}x{match['height']}")
    print(f"Confidence: {match['confidence']}")
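Once you have the match dictionaries, ordinary Python is enough to pick the one you want. The sketch below works on hand-written sample data with the same keys as above. It assumes `confidence` is a number where higher is better and that `x`, `y` mark the top-left corner; `best_match` and `click_point` are hypothetical helpers.

```python
def best_match(matches, min_confidence=0.5):
    """Return the most confident match at or above the threshold, or None."""
    confident = [m for m in matches if m["confidence"] >= min_confidence]
    return max(confident, key=lambda m: m["confidence"], default=None)


def click_point(match):
    """Center of the match box, assuming x and y are its top-left corner."""
    return (match["x"] + match["width"] / 2, match["y"] + match["height"] / 2)


# Sample data shaped like the dictionaries printed above.
sample = [
    {"x": 10, "y": 10, "width": 40, "height": 20, "confidence": 0.3},
    {"x": 100, "y": 200, "width": 60, "height": 20, "confidence": 0.9},
]

best = best_match(sample)
print(best)
print(click_point(best))
```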

🎨 Visual Debugging

Draw boxes around matches to debug:

# Draw green box for 3 seconds
Text("Username").draw(duration=3, color="green")

# Chain with other operations
Text("Password").below("Username").draw(duration=2, color="red").click()

🎯 Tips & Best Practices

  1. Start with basic text matching before adding spatial relationships
  2. Use draw() to visually verify matches
  3. Adjust background color tolerance if color matching is too strict/loose
  4. Use aligned when elements should line up precisely
  5. Chain operations for more precise matching

⚠️ Limitations

  • macOS only (uses Apple Vision framework)
  • Requires macOS Monterey (12.x) or later
  • Screen resolution and scaling can affect matching
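A script can check these requirements up front and fail with a clear message instead of an import error. The sketch below uses only the standard library; `meets_requirements` is a hypothetical helper, and the minimum major version of 12 is taken from the list above.

```python
import platform


def meets_requirements(system: str, mac_version: str, minimum_major: int = 12) -> bool:
    """True on macOS whose major version is at least `minimum_major`."""
    if system != "Darwin" or not mac_version:
        return False
    major = int(mac_version.split(".")[0])
    return major >= minimum_major


# platform.mac_ver()[0] is an empty string on non-macOS systems.
print(meets_requirements(platform.system(), platform.mac_ver()[0]))
```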

🔐 Required macOS Permissions

Screenium needs two key macOS permissions to function. The app running your Python script (e.g., Terminal, VS Code, PyCharm) will need:

  1. Screen Recording

    • Open System Settings > Privacy & Security > Screen Recording
    • Toggle on your terminal app/IDE
    • Required for:
      • Finding text on screen using OCR
      • Background monitoring
      • Visual debugging with draw()
  2. Accessibility

    • Open System Settings > Privacy & Security > Accessibility
    • Toggle on your terminal app/IDE
    • Required for:
      • Mouse actions (click(), right_click(), etc.)
      • Keyboard input (typewrite(), hotkey())

💡 Tip: If running from different apps, each one needs separate permissions. For example, if you run scripts from both Terminal and VS Code, you'll need to grant permissions to both.

