Windows desktop automation engine — see, click, type, automate.
Naturo — Desktop Automation Engine (Eyes + Hands for AI Agents)
See, click, type, capture. Desktop automation core only.
What You Get
- 🖥️ Screen Capture — Screenshot any window or monitor
- 🌳 UI Tree Inspection — Walk the accessibility tree (UIA / MSAA / IAccessible2 / Java Access Bridge)
- 🔍 Element Finding — CSS-like selectors + fuzzy search for UI elements
- 🖱️ Click & Type — Hardware-level input simulation
- ⌨️ Key Combos — Send any keystroke or shortcut
- 🎮 Hardware Keyboard — Scan-code input bypasses virtual-key detection (games, anti-cheat)
- 📸 Annotated Screenshots — AI-ready screenshots with numbered bounding boxes
- 📋 Menu Traversal — Extract app menu structures with shortcuts
- 🪟 Window Management — Focus, close, minimize, maximize, move, resize windows
- 📦 App Control — Launch, quit, switch, hide/unhide applications
- 💬 Dialog Handling — Detect and interact with system dialogs (message boxes, file pickers)
- 📌 Taskbar & Tray — List and click taskbar items and system tray icons
- 🖥️ Multi-Monitor — Enumerate monitors, capture specific screens, DPI-aware coordinates
- 🗂️ Virtual Desktops — List, switch, create, close desktops and move windows between them
- 🍎 macOS Support — Coming soon (native implementation in development)
- 🔬 Cascade Recognition — UIA + CDP + AI Vision multi-source fusion for Electron/CEF apps where single-source fails
- 🤖 AI-Ready — JSON output, agent-friendly CLI, MCP server
Platform Support
| Platform | Status | Notes |
|---|---|---|
| Windows 10/11 | ✅ Full support | Primary platform. All features available. |
| Windows 7 SP1+ | ⚠️ Best-effort | Basic features only, no UIAutomation v3. |
| macOS 13+ | 🚧 Coming soon | Native support is under active development. |
| Linux | 🚧 Coming soon | Backend is a placeholder. Not usable yet. |
| Python | 3.9+ | Required for all platforms. |
Why Windows 10+? UIAutomation v2/v3 APIs (caching, virtualized controls) require Windows 8+. Windows 7 has been out of support since January 2020. Most enterprise customers have migrated to Windows 10/11.
Install
pip install naturo
MCP Server Setup
Naturo includes a built-in MCP server with 60+ tools for AI agent integration.
Claude Desktop / Claude Code
Add to your Claude configuration file (claude_desktop_config.json):
{
"mcpServers": {
"naturo": {
"command": "naturo",
"args": ["mcp", "start"]
}
}
}
Config file location:
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
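If you would rather script this step than edit the file by hand, the sketch below merges the naturo entry into an existing config and leaves other servers untouched. The helper names (default_config_path, add_naturo_server, update_config_file) are made up for illustration and are not part of naturo; the paths and JSON shape are the ones shown above.

```python
import json
import os
import sys
from pathlib import Path


def default_config_path() -> Path:
    """Return the Claude Desktop config path listed above for this OS."""
    if sys.platform == "win32":
        return Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"
    # macOS location from the list above; other platforms are not covered here.
    return (Path.home() / "Library" / "Application Support" / "Claude"
            / "claude_desktop_config.json")


def add_naturo_server(config: dict) -> dict:
    """Return a copy of config with the naturo MCP entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["naturo"] = {"command": "naturo", "args": ["mcp", "start"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_naturo_server(existing), indent=2), encoding="utf-8")
```

Calling update_config_file(default_config_path()) applies the change in place. Restart Claude Desktop afterwards so it rereads the file.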
Other AI Agents (SSE / HTTP)
For agents that connect over HTTP instead of stdio:
# SSE transport (Server-Sent Events)
naturo mcp start --transport sse --port 3100
# Streamable HTTP transport
naturo mcp start --transport streamable-http --port 3100
Verify Setup
# List all 60+ MCP tools
naturo mcp tools
# Install MCP dependencies if needed
naturo mcp install
Quick Start
# Check version
naturo --version
# Capture a screenshot
naturo capture --path screen.png
# List open windows
naturo list windows
# Inspect UI tree
naturo see --window "Notepad" --depth 5
# Click an element
naturo click "Button:Save"
# Type text
naturo type "Hello, World!"
# Type with hardware scan codes (bypass anti-cheat detection)
naturo type "Hello" --input-mode hardware
# Press key combo
naturo press ctrl+s
# Find element
naturo find "Edit:filename"
# App management
naturo app launch "notepad"
naturo app focus "notepad"
naturo app quit "chrome" --force
naturo app minimize "notepad"
naturo app restore "notepad"
naturo app inspect "notepad" # Probe frameworks (UIA, CDP, MSAA...)
naturo app relaunch "notepad"
# Dialog handling
naturo dialog detect # Detect active dialogs
naturo dialog accept # Click OK/Yes
naturo dialog dismiss # Click Cancel/No
naturo dialog type "hello.txt" --accept # Type filename then OK
# Taskbar & tray
naturo taskbar list # List taskbar items
naturo taskbar click "Chrome" # Click taskbar button
naturo tray list # List tray icons
naturo tray click "Volume" # Left-click tray icon
naturo tray click "Wi-Fi" --right # Right-click for menu
# Virtual desktops (Windows 10/11)
naturo desktop list # List virtual desktops
naturo desktop switch 1 # Switch to desktop 1
naturo desktop create --name "Work" # Create named desktop
naturo desktop close # Close current desktop
naturo desktop move-window 1 --app "Notepad" # Move window to desktop 1
# Type Windows paths literally (--raw disables escape interpretation)
naturo type "C:\Users\test\report.txt" --raw --app notepad
# Paste text via clipboard (fast for large content)
naturo type "large content" --paste # Set clipboard → Ctrl+V → restore
naturo type --paste --file data.txt # Read file → paste
# Read element values
naturo get e47 # Read text/value by element ref
naturo get --aid txtSearch # Read by AutomationId
naturo get --role Edit --name Search # Read by role + name
naturo get --role Button --app notepad --all -j # All buttons (JSON)
# Write element values
naturo set e47 "hello world" # Set text field value
naturo set --aid txtSearch "query" # Set by AutomationId
naturo set e12 --toggle # Toggle a checkbox
naturo set e8 --select # Select a list/radio item
naturo set e5 --expand # Expand a combo box
# Highlight UI elements
naturo highlight --app notepad # Show actionable elements
naturo highlight --app notepad --all # Show all elements
naturo highlight e11 --app notepad # Highlight specific ref
naturo highlight --app notepad -A out.png # Save annotated screenshot
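The commands above can also be driven from a script. The following is a minimal sketch that shells out to the CLI and parses its JSON output, using only flags shown in this section (--role, --app, --all, -j). The function names are illustrative and not part of naturo. The JSON schema is not described here, so the parsed value is returned untouched.

```python
import json
import subprocess
from typing import List, Optional


def build_get_command(role: str, app: Optional[str] = None,
                      all_matches: bool = False) -> List[str]:
    """Build argv for `naturo get --role ROLE [--app APP] [--all] -j`."""
    argv = ["naturo", "get", "--role", role]
    if app:
        argv += ["--app", app]
    if all_matches:
        argv.append("--all")
    argv.append("-j")  # JSON output, as in the example above
    return argv


def run_json(argv: List[str]):
    """Run a naturo command and parse its stdout as JSON.

    No particular keys are assumed in the result; inspect it before
    relying on any field.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Example (requires naturo on PATH and a running Notepad window):
# buttons = run_json(build_get_command("Button", app="notepad", all_matches=True))
```

Building the argument list separately from running it keeps the command easy to log and test without a desktop session.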
Cascade Recognition
Most desktop automation tools rely on a single accessibility API (UIA) — when it fails (Electron apps, custom-rendered UI), you're stuck. Naturo cascades through multiple recognition sources automatically:
UIA      →  CDP       →  AI Vision
 ↓           ↓            ↓
Win32       Chrome       Claude/GPT
native      DevTools     screenshot
# Progressive multi-source recognition
naturo see --app feishu --cascade --fill-gaps --stats
# Result: UIA finds 700+ elements, AI Vision adds 130+ that UIA missed
# uia 705 elements 6s [ok]
# cdp 0 elements 15s [skipped]
# vision 133 elements 72s [ok]
# Click an AI-discovered element by ref
naturo click e805 --app feishu # "视频会议" found by AI Vision
How it works:
- UIA/MSAA finds native Win32/WPF/UWP controls (fastest, most accurate)
- CDP reaches into Electron/Chrome web content via DevTools Protocol
- AI Vision screenshots the window and asks Claude/GPT to enumerate every visible element — catches anything the other sources miss
- IoU dedup prevents duplicates: if UIA already found an element, AI Vision skips it
- Tree merge attaches AI-discovered elements to the correct UIA parent container
Requires ANTHROPIC_API_KEY or OPENAI_API_KEY for AI Vision. Set NATURO_AI_MODEL to choose the model (default: claude-sonnet-4-20250514).
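To make the IoU dedup step concrete, here is a small standalone sketch of the idea: a vision-detected box is kept only if it does not overlap an existing UIA box too much. This is not naturo's internal code. The box layout, the function names, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (left, top, right, bottom) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fill_gaps(uia_boxes: List[Box], vision_boxes: List[Box],
              threshold: float = 0.5) -> List[Box]:
    """Keep only the vision boxes that do not duplicate a UIA box.

    The threshold is a hypothetical default, not a documented naturo setting.
    """
    return [
        v for v in vision_boxes
        if all(iou(v, u) < threshold for u in uia_boxes)
    ]
```

With one UIA box at (0, 0, 100, 40), a vision box at (2, 1, 101, 41) is dropped as a duplicate, while a vision box at (200, 200, 260, 230) is kept as a newly discovered element.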
CLI Commands
See (observe the desktop)
| Command | Description | Since |
|---|---|---|
| capture | Screenshot screen/window | 0.1.0 |
| see | Inspect UI element tree | 0.1.0 |
| find | Search UI elements (fuzzy match) | 0.1.0 |
| get | Read element properties (text, value, state) | 0.2.1 |
| highlight | Visual overlay showing all actionable elements | 0.3.0 |
| list windows | List open windows | 0.1.0 |
| list apps | List running applications | 0.1.0 |
| list screens | List monitors and resolutions | 0.1.0 |
| diff | Compare two UI snapshots | 0.1.1 |
| menu-inspect | List app menu structure with shortcuts | 0.1.0 |
Act (interact with the desktop)
| Command | Description | Since |
|---|---|---|
| click | Click element/coordinates (--paste, --copy, --cut modifiers) | 0.1.0 |
| type | Type text (supports --paste for clipboard) | 0.1.0 |
| set | Set element value/state (toggle, select, expand) | 0.3.0 |
| press | Press key combination (e.g., ctrl+s) | 0.1.0 |
| scroll | Scroll mouse wheel | 0.1.0 |
| drag | Drag from/to coordinates | 0.1.0 |
| move | Move mouse cursor | 0.1.0 |
| wait | Wait for element/window to appear | 0.1.0 |
App management
| Command | Description | Since |
|---|---|---|
| app launch | Launch application by name or path | 0.1.0 |
| app quit | Quit application (supports --force) | 0.1.0 |
| app focus | Focus an application window (alias: app switch) | 0.1.0 |
| app close | Close an application window (graceful or forced) | 0.1.0 |
| app minimize | Minimize an application window (alias: app hide) | 0.1.0 |
| app maximize | Maximize an application window | 0.1.0 |
| app restore | Restore a minimized/maximized window (alias: app unhide) | 0.1.0 |
| app move | Move and/or resize an application window | 0.1.0 |
| app list | List running applications with visible windows | 0.1.0 |
| app windows | List open windows (filter by app/PID) | 0.1.0 |
| app find | Find application by name or PID | 0.1.0 |
| app inspect | Probe app frameworks (UIA, CDP, MSAA...) | 0.3.0 |
| app relaunch | Quit and relaunch an application | 0.3.0 |
System
| Command | Description | Since |
|---|---|---|
| clipboard get | Read clipboard text content | 0.3.1 |
| clipboard set | Write text to clipboard | 0.3.1 |
| clipboard clear | Clear clipboard contents | 0.3.1 |
| clipboard info | Show clipboard format and size | 0.3.1 |
| dialog detect | Detect active system dialogs | 0.1.0 |
| dialog accept | Accept (OK/Yes) a dialog | 0.1.0 |
| dialog dismiss | Dismiss (Cancel/No) a dialog | 0.1.0 |
| dialog click-button | Click specific dialog button | 0.1.0 |
| dialog type | Type in dialog input field | 0.1.0 |
| taskbar list | List taskbar items | 0.1.0 |
| taskbar click | Click taskbar item | 0.1.0 |
| tray list | List system tray icons | 0.1.0 |
| tray click | Click tray icon (left/right/double) | 0.1.0 |
| desktop list | List virtual desktops | 0.1.0 |
| desktop switch | Switch to a virtual desktop | 0.1.0 |
| desktop create | Create a new virtual desktop | 0.1.0 |
| desktop close | Close a virtual desktop | 0.1.0 |
| desktop move-window | Move window to another desktop | 0.1.0 |
Tools
| Command | Description | Since |
|---|---|---|
| snapshot list | List stored snapshots | 0.1.0 |
| snapshot sessions | List all snapshot sessions | 0.1.0 |
| snapshot clean | Remove old snapshots | 0.1.0 |
| mcp start | Start MCP server | 0.1.0 |
| mcp install | Install MCP server configuration | 0.3.0 |
| mcp tools | List available MCP tools | 0.3.0 |
| config | View/set naturo configuration | 0.3.0 |
| excel open | Open Excel workbook (Windows only) | 0.1.1 |
| excel read | Read cells from worksheet | 0.1.1 |
| excel write | Write values to cells | 0.1.1 |
| excel list-sheets | List worksheets in workbook | 0.1.1 |
| excel run-macro | Execute VBA macro | 0.1.1 |
| excel info | Show workbook metadata | 0.1.1 |
Deprecated: window * commands still work but print a deprecation warning. Use the app * equivalents instead. hotkey is deprecated in favor of press.
Snapshot System
Every see and capture call automatically persists a snapshot — a
directory under ~/.naturo/snapshots/ containing the screenshot and full UI
element map.
# List all snapshots
naturo snapshot list
# Remove snapshots older than 7 days
naturo snapshot clean --days 7
# Remove all snapshots
naturo snapshot clean --all --yes
Snapshots expire after 10 minutes when queried via get_most_recent_snapshot,
mirroring Peekaboo's validity window.
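The validity window can be pictured with the sketch below, which picks the newest snapshot directory and rejects it if it is older than 10 minutes. This is an illustration, not naturo's get_most_recent_snapshot. It assumes each snapshot is a direct subdirectory of ~/.naturo/snapshots/ and uses the directory's modification time as a stand-in for whatever timestamp the real implementation records.

```python
import time
from pathlib import Path
from typing import Optional

SNAPSHOT_ROOT = Path.home() / ".naturo" / "snapshots"  # location stated above
VALIDITY_SECONDS = 10 * 60  # the 10-minute window stated above


def most_recent_fresh_snapshot(root: Path = SNAPSHOT_ROOT,
                               now: Optional[float] = None) -> Optional[Path]:
    """Return the newest snapshot directory if it is still within the window.

    Assumption: directory mtime approximates snapshot creation time.
    """
    now = time.time() if now is None else now
    if not root.is_dir():
        return None
    candidates = [p for p in root.iterdir() if p.is_dir()]
    if not candidates:
        return None
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    age = now - newest.stat().st_mtime
    return newest if age <= VALIDITY_SECONDS else None
```

Passing now explicitly makes the expiry rule easy to test without waiting for real time to pass.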
Architecture
┌─────────────┐
│ AI Agent │ Python SDK / MCP Server
├─────────────┤
│ CLI (click) │ naturo CLI
├─────────────┤
│ Snapshot │ naturo/snapshot.py + naturo/models/snapshot.py
├─────────────┤
│ Python │ ctypes bridge
├─────────────┤
│ C API │ exports.h
├─────────────┤
│ C++ Core │ UIA, MSAA, IA2, JAB, Win32, DirectX
└─────────────┘
See docs/ARCHITECTURE.md for details.
Comparison
| Feature | naturo | PyAutoGUI | pywinauto | AutoIt | WinAppDriver |
|---|---|---|---|---|---|
| MCP Server | ✅ Built-in | ❌ | ❌ | ❌ | ❌ |
| AI Agent Ready | ✅ JSON output, agent CLI | ❌ | ❌ | ❌ | ❌ |
| UI Frameworks | UIA + MSAA + IA2 + JAB + CDP + AI Vision | None (image only) | UIA, Win32 | Win32 messages | UIA only |
| Cascade Recognition | ✅ Multi-source fusion with auto-dedup | ❌ | ❌ | ❌ | ❌ |
| Auto-Detection | ✅ Picks best framework per app | N/A | Manual backend choice | N/A | N/A |
| Element Tree | ✅ Full hierarchy | ❌ | ✅ | ❌ | ✅ |
| Post-Action Verify | ✅ Confirms actions took effect | ❌ | ❌ | ❌ | ❌ |
| Hardware Keyboard | ✅ Scan codes (anti-cheat safe) | ❌ | ❌ | ✅ | ❌ |
| Image Matching | Via AI vision | ✅ Built-in | ❌ | ✅ | ❌ |
| Screen Capture | ✅ DirectX / GDI | ✅ | ❌ | ✅ | ❌ |
| Cross-Platform | Windows (macOS planned) | Win / Mac / Linux | Windows (+ Linux partial) | Windows only | Windows only |
| Language | Python + C++ core | Python | Python | Custom script | C# / WebDriver |
| Maintained | ✅ Active | ✅ Active | ⚠️ Slow | ⚠️ Slow | ❌ Deprecated |
vs Peekaboo (macOS)
Naturo aims to provide cross-platform desktop automation. On macOS, native support is coming soon.
| | Peekaboo (macOS) | Naturo (Windows) | Naturo (macOS - planned) |
|---|---|---|---|
| UI Framework | Accessibility API | UIA + MSAA + IA2 + JAB | Accessibility API |
| Screen Capture | ScreenCaptureKit | DirectX / GDI | ScreenCaptureKit |
| Input | CGEvent | SendInput + Phys32 scan codes | CGEvent |
| Language | Swift | C++ | C++ / Python bridge |
| Status | Available | Available | Coming soon |
Contributing
We welcome bug reports and testing help!
- 🐛 Report bugs: GitHub Issues
- 🧪 Testing guide: See External Tester Guide
- 📖 Contributing guide: See CONTRIBUTING.md
License
MIT — see LICENSE