A gesture-based screen capture and keyboard shortcut automation system using MediaPipe and FastAPI
Project description
| Branch | Version | Status |
|---|---|---|
master |
0.3.0 |
| Platform | Python | System Dependencies |
|---|---|---|
| Linux (X11/Wayland) | 3.12+ |
libgl1, libglib2.0-0, notify-send, xdotool, ydotool |
| macOS (ARM64) | N/A |
N/A |
| Windows (x86_64) | N/A |
N/A |
- Docs: [Link to documentation]
Introduction
Gesture Vision is a gesture control system, developed in Python to automate desktop actions through real-time gesture recognition.
The system works through a computer vision pipeline where MediaPipe processes the camera stream to detect hand landmarks, coordinated by an asynchronous FastAPI backend. This provides precise event control with minimal impact on system resources.
Architecture
Main Features
- High Precision Tracking: MediaPipe Hand Landmarker implementation for low-latency hand landmark detection.
- Robust Trigger Control: Validation system based on gesture hold time and cooldown periods to eliminate false positives.
- Swipe Gestures: Workspace switching with horizontal hand movements - left/right.
- Screenshot Capture: Detection of the "peace" gesture (index and middle fingers up) to take screenshots.
- Invisible Mode (Background): Fully background operation without preview windows, optimizing GPU/CPU usage.
- Keyboard Shortcuts: Associate gestures with system shortcuts like ctrl+shift+q to execute any action.
- X11 and Wayland Support: Automatically detects whether to use xdotool (X11) or ydotool (Wayland).
- Native Notifications: Integration with desktop notification system to instantly confirm actions.
- Native Deployment: Simplified installation via
pipwith dedicated CLI commands for service management. - Web Panel: Graphical interface to configure gestures, shortcuts, and timing.
Quick Start
1. Installation
# Install from PyPI
pip install gesturevision
2. Run
# First time: automatically install system dependencies
gesturevision-start --install-deps
# Start the system
gesturevision-start
4. Control Panel
Access the web panel: http://localhost:8080
There you can:
- Start/stop the system
- Configure gesture and cooldown timing
- Add gestures and associate keyboard shortcuts
Available Gesture Types
| Gesture | Description |
|---|---|
| ✌️ Peace | Index and middle fingers up |
| ✊ Fist | Closed hand |
| ✋ Open Hand | 5 fingers extended |
| 👍 Three Fingers | Index, middle, and ring fingers up |
| ☝️ Point | Only index finger up |
Supported Keyboard Shortcuts
You can use any of the predefined shortcuts or write your own:
- Screenshot:
Print - Close window:
alt+F4 - Minimize all:
super+d - Lock screen:
super+l - Switch window:
alt+Tab - New tab:
ctrl+t - Close tab:
ctrl+w - Copy/Paste:
ctrl+c/ctrl+v - And many more...
Project Structure
src/gesturevision/main_vision.py: Gesture detection and capture engine.src/gesturevision/api/: FastAPI backend for control and configuration.src/gesturevision/static/: Control panel user interface.pyproject.toml: Package dependencies and entry points definition.gesturevision/assets/hand_landmarker.task: Pre-trained AI model.
Configuration
Settings are stored in: ~/gesturevision/config.json
Structure:
{
"gesture_hold_seconds": 1.0,
"cooldown_seconds": 1.5,
"gestures": [
{
"name": "Peace (screenshot)",
"gesture_type": "peace_sign",
"required_hold_seconds": 1.0,
"enabled": true,
"action": "screenshot",
"shortcut": null
}
]
}
Docker (Optional)
docker-compose up --build
Access the panel at: http://localhost:8080
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file gesturevision-0.3.1.tar.gz.
File metadata
- Download URL: gesturevision-0.3.1.tar.gz
- Upload date:
- Size: 5.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.1.dev32+gd465cb093 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ec7938ec808ff0f734837ece140bae5a14323b445ad28ae082d1a8397d2ea350
|
|
| MD5 |
a298fceef26eebaabc48f988ac14565b
|
|
| BLAKE2b-256 |
1d36862a91842db14921cec94b80b5692f51ef7d25b0184b2a948b0d73ff59c9
|
File details
Details for the file gesturevision-0.3.1-py3-none-any.whl.
File metadata
- Download URL: gesturevision-0.3.1-py3-none-any.whl
- Upload date:
- Size: 5.9 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.1.dev32+gd465cb093 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
34a10640c68f2517d63f8f11e559687afb58ec2db92d8eddca3e1906ffbf4ad5
|
|
| MD5 |
fccb4d2e240973f48e7a922287c84665
|
|
| BLAKE2b-256 |
962d67aa1fde2702067fe86b64207c0e95ab5155f37567fb24b71a12cf71fe05
|