VNC Agent Bridge
Open-source Python package for AI agents to interact with VNC servers
VNC Agent Bridge provides high-level abstractions that let AI agents control the mouse, keyboard, and scrolling on remote systems via the VNC (Virtual Network Computing) protocol.
✨ Features
- Mouse Control: Click, move, drag, and position tracking
- Keyboard Input: Type text, press keys, hotkeys, key combinations
- Scroll Control: Scroll up/down at specific positions
- Screenshot Capture: Save screen images in multiple formats (PNG, JPEG, BMP)
- Video Recording: Record screen activity with configurable FPS
- Clipboard Management: Get, set, and clear clipboard text
- WebSocket VNC Support: Connect to WebSocket-based VNC servers with URL templates
- Multiple Connection Types: TCP and WebSocket connections with strategy pattern
- Type Safety: 100% mypy strict compliance
- Flexible Timing: Optional delay parameters for realistic agent behavior
- Context Manager: Automatic connection management
- Enhanced Performance: Framebuffer optimization for capture features
- Optional Dependencies: numpy, Pillow, and websocket-client for advanced features
- Comprehensive Testing: 85%+ code coverage with 300+ test cases
- Zero Core Dependencies: Uses only Python standard library (for core features)
- Server-Agnostic: Works with any WebSocket VNC server (Proxmox, noVNC, custom)
🚀 Installation
From PyPI (Recommended)
pip install vnc-agent-bridge
With WebSocket Support
pip install vnc-agent-bridge[websocket]
With All Features
pip install vnc-agent-bridge[full]
From Source
git clone https://github.com/Ganzzi/vnc-agent-bridge.git
cd vnc-agent-bridge
pip install -e .
Development Installation
git clone https://github.com/Ganzzi/vnc-agent-bridge.git
cd vnc-agent-bridge
pip install -e ".[dev]"
📖 Quick Start
Basic Usage
from vnc_agent_bridge import VNCAgentBridge

# Connect to VNC server
with VNCAgentBridge('localhost', port=5900) as vnc:
    # Mouse operations
    vnc.mouse.left_click(100, 100)
    vnc.mouse.move_to(200, 200)
    vnc.mouse.drag_to(300, 300, duration=1.0)
    # Keyboard operations
    vnc.keyboard.type_text("Hello, World!")
    vnc.keyboard.press_key('return')
    vnc.keyboard.hotkey('ctrl', 'a')
    # Scroll operations
    vnc.scroll.scroll_up(amount=3)
    vnc.scroll.scroll_down(amount=5)
    # Screenshot capture
    screenshot = vnc.screenshot.capture()
    vnc.screenshot.save_image("screen.png")
    # Video recording
    vnc.video.start_recording(fps=10)
    # ... perform actions ...
    vnc.video.stop_recording()
    # Clipboard operations
    vnc.clipboard.send_text("Copied text!")
    text = vnc.clipboard.get_text()
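An agent usually emits actions as data rather than as direct method calls. The sketch below shows one possible way to dispatch such actions onto the controllers listed above. The `run_actions` helper and the action dictionary layout are hypothetical and not part of the package; a stand-in object replaces the real bridge so the example runs without a VNC server.

```python
from types import SimpleNamespace
from typing import Any, Dict, List

# Controller names taken from the Quick Start example above.
ALLOWED_CONTROLLERS = {"mouse", "keyboard", "scroll", "screenshot", "video", "clipboard"}


def run_actions(bridge: Any, actions: List[Dict[str, Any]]) -> int:
    """Call bridge.<target>.<method>(*args, **kwargs) for each action.

    Hypothetical helper: returns the number of actions executed.
    """
    executed = 0
    for action in actions:
        target = action["target"]
        if target not in ALLOWED_CONTROLLERS:
            raise ValueError(f"unknown controller: {target!r}")
        method = getattr(getattr(bridge, target), action["method"])
        method(*action.get("args", []), **action.get("kwargs", {}))
        executed += 1
    return executed


# Stand-in bridge that records calls instead of talking to a server.
calls = []
fake_bridge = SimpleNamespace(
    mouse=SimpleNamespace(
        left_click=lambda x, y, delay=0: calls.append(("left_click", x, y, delay))
    ),
    keyboard=SimpleNamespace(
        type_text=lambda text, delay=0: calls.append(("type_text", text, delay))
    ),
)

plan = [
    {"target": "mouse", "method": "left_click", "args": [100, 100]},
    {"target": "keyboard", "method": "type_text", "args": ["Hello"], "kwargs": {"delay": 0.1}},
]
executed = run_actions(fake_bridge, plan)
```

With a real connection, the same `plan` could be passed together with the `vnc` object from the `with` block, assuming the controller methods accept the arguments shown in the API overview.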
WebSocket VNC Connection
from vnc_agent_bridge import create_websocket_vnc

# Connect to WebSocket VNC server (e.g., Proxmox)
bridge = create_websocket_vnc(
    url_template="wss://${host}:${port}/api2/json/nodes/pve/qemu/100/vncwebsocket?port=${vnc_port}&vncticket=${ticket}",
    host="proxmox.example.com",
    port=8006,
    vnc_port=5900,
    ticket="vncticket123"
)
with bridge:
    bridge.mouse.move_to(100, 100)
    bridge.keyboard.type_text("Hello WebSocket VNC!")
    screenshot = bridge.screenshot.capture()
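The `${name}` placeholders in `url_template` use the same syntax as Python's `string.Template`. How the package expands them internally is not shown here, so the following is only a sketch of what the expansion might look like; `expand_url` is a hypothetical helper, useful for checking a template before passing it to `create_websocket_vnc`.

```python
from string import Template


def expand_url(url_template: str, **params: object) -> str:
    """Fill ${name} placeholders; raises KeyError if a value is missing."""
    values = {name: str(value) for name, value in params.items()}
    return Template(url_template).substitute(values)


url = expand_url(
    "wss://${host}:${port}/api2/json/nodes/pve/qemu/100/vncwebsocket"
    "?port=${vnc_port}&vncticket=${ticket}",
    host="proxmox.example.com",
    port=8006,
    vnc_port=5900,
    ticket="vncticket123",
)
```

Values that contain reserved URL characters may need percent-encoding first, for example with `urllib.parse.quote(value, safe="")`.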
Manual Connection Management
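from vnc_agent_bridge import VNCAgentBridge

vnc = VNCAgentBridge('192.168.1.100', username='user', password='pass')
try:
    vnc.connect()
    # Perform operations...
    position = vnc.mouse.get_position()
    print(f"Mouse at: {position}")
finally:
    # Always release the connection, even if an operation fails
    vnc.disconnect()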
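When connections are managed manually, a remote machine that is still booting may refuse the first attempts. The sketch below retries a connect callable with exponential backoff. `connect_with_retry` is a hypothetical helper, and the assumption that connection failures surface as `OSError` subclasses (such as `ConnectionRefusedError`) should be checked against the exceptions the package actually raises.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 3,
    backoff: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() up to `attempts` times, doubling the wait each time."""
    last_error: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            return connect()
        except OSError as exc:  # assumption: failures are OSError subclasses
            last_error = exc
            if attempt < attempts - 1:
                sleep(backoff * (2 ** attempt))
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error


# Demonstration with a stand-in that fails twice, then succeeds.
attempts_seen = []


def flaky_connect() -> str:
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise ConnectionRefusedError("server not ready")
    return "connected"


sleeps = []
result = connect_with_retry(flaky_connect, attempts=3, backoff=0.5, sleep=sleeps.append)
```

With the bridge, the call would be `connect_with_retry(vnc.connect)` placed inside the `try` block, so that `vnc.disconnect()` still runs in `finally`.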
🎯 API Overview
Mouse Controller
vnc.mouse.left_click(x, y, delay=0) # Single left click
vnc.mouse.right_click(x, y, delay=0) # Right click
vnc.mouse.double_click(x, y, delay=0) # Double click
vnc.mouse.move_to(x, y, delay=0) # Move cursor
vnc.mouse.drag_to(x, y, duration=1.0, delay=0) # Drag operation
vnc.mouse.get_position() # Get current position -> (x, y)
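How `drag_to` spreads the movement over `duration` is not documented in this overview. A common approach, shown here only as an illustrative sketch, is to interpolate intermediate pointer positions between the start and end points and pause `duration / steps` between them. `interpolate_path` is a hypothetical helper, not a package function.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def interpolate_path(start: Point, end: Point, steps: int) -> List[Point]:
    """Return `steps` evenly spaced points from just after start up to end."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(1, steps + 1)
    ]


path = interpolate_path((100, 100), (300, 300), steps=4)
# path == [(150, 150), (200, 200), (250, 250), (300, 300)]

duration = 1.0
pause_per_step = duration / len(path)  # 0.25 seconds between moves
```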
Keyboard Controller
vnc.keyboard.type_text(text, delay=0) # Type string
vnc.keyboard.press_key(key, delay=0) # Press single key
vnc.keyboard.hotkey(*keys, delay=0) # Key combination
vnc.keyboard.keydown(key, delay=0) # Hold key down
vnc.keyboard.keyup(key, delay=0) # Release key
Scroll Controller
vnc.scroll.scroll_up(amount=3, delay=0) # Scroll up
vnc.scroll.scroll_down(amount=3, delay=0) # Scroll down
vnc.scroll.scroll_to(x, y, delay=0) # Scroll at position
Screenshot Controller
vnc.screenshot.capture(incremental=False, delay=0) # Capture screen -> np.ndarray
vnc.screenshot.save_image(path, format='PNG', delay=0) # Save to file
vnc.screenshot.capture_region(x, y, width, height, delay=0) # Region capture
Video Controller
vnc.video.start_recording(fps=10, duration=None) # Start recording
vnc.video.stop_recording() # Stop and save video
vnc.video.record_for(duration=5.0, fps=10) # Record for time period
vnc.video.get_recording_status() # Check recording state
Clipboard Controller
vnc.clipboard.send_text(text, delay=0) # Send text to clipboard
vnc.clipboard.get_text() # Get clipboard content -> str
vnc.clipboard.clear() # Clear clipboard
Connection Management
vnc = VNCAgentBridge(host, port=5900, username=None, password=None)
vnc.connect() # Connect to server
vnc.disconnect() # Disconnect from server
vnc.is_connected # Check connection status
⚙️ Configuration
Connection Parameters
- host: VNC server hostname or IP address
- port: VNC server port (default: 5900)
- username: Optional authentication username
- password: Optional authentication password
- timeout: Connection timeout in seconds (default: 10.0)
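To keep credentials out of source code, the parameters above can be collected from the environment before constructing the bridge. The sketch below is one way to do that; the `VNCSettings` class and the `VNC_*` variable names are hypothetical and not defined by the package. Only the defaults (port 5900, timeout 10.0) come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VNCSettings:
    """Connection parameters mirroring the list above (hypothetical class)."""

    host: str
    port: int = 5900
    username: Optional[str] = None
    password: Optional[str] = None
    timeout: float = 10.0

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "VNCSettings":
        # VNC_* names are an assumption made for this example.
        return cls(
            host=env["VNC_HOST"],
            port=int(env.get("VNC_PORT", "5900")),
            username=env.get("VNC_USERNAME"),
            password=env.get("VNC_PASSWORD"),
            timeout=float(env.get("VNC_TIMEOUT", "10.0")),
        )


settings = VNCSettings.from_env({"VNC_HOST": "192.168.1.100", "VNC_PASSWORD": "pass"})
```

The values could then be passed on, for example as `VNCAgentBridge(settings.host, port=settings.port, username=settings.username, password=settings.password)`.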
Delay Parameters
All methods support an optional delay parameter:
- delay=0: No delay (fast execution)
- delay=0.1: Quick operation (100ms)
- delay=0.5: Normal human-like timing
- delay=1.0+: Deliberate, careful interaction
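A fixed delay produces perfectly regular timing. If more natural pacing is wanted, the value passed to `delay` can be randomized around one of the tiers above. `human_delay` below is a hypothetical helper sketched for illustration; the package itself only takes the resulting number.

```python
import random
from typing import Optional


def human_delay(
    base: float = 0.5, jitter: float = 0.2, rng: Optional[random.Random] = None
) -> float:
    """Return base +/- jitter seconds, never negative."""
    source = rng if rng is not None else random
    return max(0.0, base + source.uniform(-jitter, jitter))


# A seeded generator makes the sequence reproducible in tests.
rng = random.Random(0)
delays = [human_delay(0.5, 0.2, rng) for _ in range(5)]
```

Usage might look like `vnc.mouse.left_click(100, 100, delay=human_delay())`.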
🔧 Supported Keys
Special Keys
- 'return', 'enter': Enter/Return key
- 'tab': Tab key
- 'escape', 'esc': Escape key
- 'backspace': Backspace key
- 'delete', 'del': Delete key
- 'space': Spacebar
Function Keys
- 'f1' through 'f12': Function keys
Arrow Keys
- 'up', 'down', 'left', 'right': Arrow keys
Modifiers
- 'shift': Shift key
- 'ctrl': Control key
- 'alt': Alt key
- 'cmd', 'meta': Command/Meta key
Character Keys
- Single characters: 'a', 'A', '1', '!', etc.
- Unicode characters supported
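Agents often produce key names in inconsistent forms ("Enter", "ESC", "F5"). The sketch below validates and normalizes names against the lists above before they are sent. `normalize_key` is a hypothetical helper, and the choice of which alias counts as canonical (for example 'return' over 'enter') is an assumption made for this example.

```python
# Alias -> canonical name; the canonical choice is an assumption.
ALIASES = {"enter": "return", "esc": "escape", "del": "delete", "meta": "cmd"}

NAMED_KEYS = {
    "return", "tab", "escape", "backspace", "delete", "space",
    "up", "down", "left", "right",
    "shift", "ctrl", "alt", "cmd",
} | {f"f{n}" for n in range(1, 13)}


def normalize_key(key: str) -> str:
    """Map a key name to its canonical form, or raise ValueError."""
    if len(key) == 1:
        return key  # single characters are case-sensitive: 'a' != 'A'
    name = key.lower()
    name = ALIASES.get(name, name)
    if name not in NAMED_KEYS:
        raise ValueError(f"unsupported key name: {key!r}")
    return name


combo = [normalize_key(k) for k in ("Ctrl", "Shift", "ESC")]
```

The normalized list could then be unpacked into a call such as `vnc.keyboard.hotkey(*combo)`.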
🛠️ Development
Prerequisites
- Python 3.8+
- uv (recommended) or pip
Setup Development Environment
# Clone repository
git clone https://github.com/Ganzzi/vnc-agent-bridge.git
cd vnc-agent-bridge
# Install development dependencies
uv pip install --system -e ".[dev]"
# Run tests
pytest
# Type checking
mypy vnc_agent_bridge --strict
# Linting
flake8 vnc_agent_bridge tests
# Formatting
black vnc_agent_bridge tests
Testing Strategy
- Mock-based testing: No real VNC server required
- 85%+ coverage target: Comprehensive test suite with 300+ test cases
- Type safety: 100% mypy strict compliance
- Cross-platform: Tests run on Linux, macOS, Windows
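The same mock-based approach works for code that uses the bridge. The sketch below tests a small agent routine with `unittest.mock.MagicMock` standing in for the connected bridge, so no VNC server is needed. `submit_form` is a hypothetical routine written for this example; it does not reflect how the project's own test suite is organized.

```python
from unittest.mock import MagicMock, call


def submit_form(vnc, text):
    """Example routine under test: click a field, type text, press Enter."""
    vnc.mouse.left_click(100, 100)
    vnc.keyboard.type_text(text)
    vnc.keyboard.press_key("return")


def test_submit_form_sends_expected_input():
    vnc = MagicMock()  # stands in for a connected bridge
    submit_form(vnc, "Hello")
    vnc.mouse.left_click.assert_called_once_with(100, 100)
    # Keyboard calls must arrive in this exact order.
    assert vnc.keyboard.mock_calls == [
        call.type_text("Hello"),
        call.press_key("return"),
    ]


test_submit_form_sends_expected_input()
```

Under pytest the final call is unnecessary, since test functions are discovered by name.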
Current Project Status
- ✅ v0.1.0: Core functionality released on PyPI
- ✅ v0.2.0: Stable release with capture features on PyPI and GitHub
- ✅ v0.3.0: WebSocket VNC support and modular architecture on PyPI and GitHub
- ✅ v0.3.0 Fixes: Screenshot format errors fixed, WebSocket authentication corrected
- 🎯 v0.3.0 Features: WebSocket connections, URL templates, server-agnostic design
- Next Milestone: Community feedback and future enhancements
Quality Metrics
- Test Coverage: 85% (391 statements, 59 missed)
- Type Checking: 100% mypy strict compliance (0 errors)
- Linting: 0 flake8 errors
- Formatting: 100% black compliant
- Test Cases: 303 total (303 passing, 100% pass rate)
📚 Documentation
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
Quick Contribution Steps
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes with tests
- Run the test suite: pytest
- Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Built by Ganzzi
- Inspired by the need for reliable AI agent automation
- Thanks to the VNC and RFB protocol specifications
📞 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Full Docs
Made with ❤️ by Ganzzi