🔍 SnapVision
A cross-platform, local-only vision assistant with OCR and AI analysis.
SnapVision captures screen regions, performs OCR, and uses LLMs to analyze the content. The app itself runs on your machine with no intermediate server; OCR and LLM requests go directly to the API providers you configure.
✨ Features
- 🖥️ Local-only: No SnapVision backend and no hosting; the app runs on your machine and calls the OCR and LLM APIs directly
- 🌍 Cross-platform: Works on Windows, Linux (X11), and macOS
- ⌨️ Global hotkeys: Trigger capture from anywhere
- 🎯 Drag-select: Choose exactly what region to capture
- 📝 Smart OCR: Extract text from screenshots using Google Vision
- 🤖 LLM Processing: Clean and structure OCR output with Groq or OpenAI
- 💬 ChatGPT Integration: Continue conversations in your browser
📦 Installation
pip install snapvision
Requirements:
- Python 3.10 or higher
- Windows, Linux (X11), or macOS
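If you want to confirm the requirements above before installing, a small check like the following can help. This is a minimal sketch: `check_environment` is a hypothetical helper, not something SnapVision ships, and it only mirrors the Python version and operating systems listed here.

```python
import platform
import sys

MIN_PYTHON = (3, 10)
# platform.system() reports macOS as "Darwin".
SUPPORTED_SYSTEMS = {"Windows", "Linux", "Darwin"}


def check_environment(version=None, system=None):
    """Return a list of problems; an empty list means the requirements look met.

    Hypothetical helper for illustration only.
    """
    version = tuple(version) if version is not None else tuple(sys.version_info[:2])
    system = system if system is not None else platform.system()
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"Unsupported platform: {system}")
    return problems
```

Calling `check_environment()` with no arguments inspects the running interpreter; the optional arguments exist only so the check can be exercised with other values.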
🚀 Usage Guide
1. Setup (One-time)
Run the configuration wizard to set up your API keys:
snapvision configure
You'll be prompted to enter:
- OCR Provider: google (Recommended)
- Google Vision API Key: Get it from the Google Cloud Console
- LLM Provider: groq (Fastest) or openai
- LLM API Key: Get it from Groq or OpenAI
- Global Hotkey: The keyboard shortcut to trigger capture (Default: Ctrl+Shift+Z)
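The answers the wizard collects can be pictured as a small record with a few sanity checks. The sketch below is an assumption about shape only: `WizardAnswers` and `parse_hotkey` are hypothetical names, and SnapVision's real config format and hotkey parsing are not documented here. The provider names and default hotkey come from the list above.

```python
from dataclasses import dataclass

OCR_PROVIDERS = {"google"}
LLM_PROVIDERS = {"groq", "openai"}
MODIFIERS = {"ctrl", "shift", "alt"}


def parse_hotkey(combo):
    """Split a string like 'Ctrl+Shift+Z' into (sorted modifiers, key).

    Hypothetical helper; raises ValueError unless exactly one
    non-modifier key is present.
    """
    parts = [p.strip().lower() for p in combo.split("+") if p.strip()]
    modifiers = sorted(p for p in parts if p in MODIFIERS)
    keys = [p for p in parts if p not in MODIFIERS]
    if len(keys) != 1:
        raise ValueError(f"expected exactly one non-modifier key in {combo!r}")
    return modifiers, keys[0]


@dataclass
class WizardAnswers:
    """Hypothetical record of the wizard's prompts (not the real schema)."""
    ocr_provider: str = "google"
    ocr_api_key: str = ""
    llm_provider: str = "groq"
    llm_api_key: str = ""
    hotkey: str = "Ctrl+Shift+Z"

    def problems(self):
        """Return human-readable issues; an empty list means the answers look usable."""
        issues = []
        if self.ocr_provider not in OCR_PROVIDERS:
            issues.append(f"unknown OCR provider: {self.ocr_provider}")
        if self.llm_provider not in LLM_PROVIDERS:
            issues.append(f"unknown LLM provider: {self.llm_provider}")
        if not self.ocr_api_key:
            issues.append("missing OCR API key")
        if not self.llm_api_key:
            issues.append("missing LLM API key")
        try:
            parse_hotkey(self.hotkey)
        except ValueError as exc:
            issues.append(str(exc))
        return issues
```

Returning a list of problems rather than raising on the first one lets a wizard report everything that needs fixing in a single pass.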
2. Run
Start SnapVision (runs in background automatically):
snapvision start
That's it! You can close the terminal - SnapVision keeps running.
3. Capture & Analyze
- Press Ctrl+Shift+Z (or your custom hotkey).
- Drag your mouse to select an area on the screen.
- SnapVision will analyze it and show a popup with:
  - 🤖 AI Summary: A concise explanation or answer.
  - 📝 Extracted Text: The raw text found in the image.
- Interact:
  - Click "Copy" to grab the text.
  - Click "Analyze with ChatGPT" to open the topic in your browser for a deeper dive.
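The capture flow above boils down to two stages: OCR on the selected image, then an LLM pass over the extracted text. The following is a rough sketch of that flow, not SnapVision's actual code. `analyze_region`, `AnalysisResult`, and the fake providers are hypothetical; real providers would call Google Vision and Groq or OpenAI.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AnalysisResult:
    summary: str          # what the popup shows as "AI Summary"
    extracted_text: str   # what the popup shows as "Extracted Text"


def analyze_region(
    image_bytes: bytes,
    ocr: Callable[[bytes], str],
    llm: Callable[[str], str],
) -> AnalysisResult:
    """Run OCR, then pass the text to the LLM (hypothetical pipeline)."""
    text = ocr(image_bytes).strip()
    if not text:
        # Skip the LLM call when OCR found nothing to analyze.
        return AnalysisResult(summary="No text found in the selection.", extracted_text="")
    return AnalysisResult(summary=llm(text), extracted_text=text)


# Stand-ins for the real provider calls, so the sketch runs offline.
def fake_ocr(image_bytes: bytes) -> str:
    return "  Hello world \n"


def fake_llm(text: str) -> str:
    return f"Summary of {len(text.split())} words"


result = analyze_region(b"placeholder image bytes", fake_ocr, fake_llm)
print(result.summary)
print(result.extracted_text)
```

Passing the providers in as plain callables keeps the flow independent of whichever OCR or LLM service is configured.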
4. Stop
To stop the application at any time:
snapvision stop
🖥️ Platform Support
| Platform | Status | Notes |
|---|---|---|
| Windows | ✅ Fully Supported | Best experience |
| Linux (X11) | ✅ Supported | Works with X11 display server |
| Linux (Wayland) | ⚠️ Limited | Global hotkeys may not work |
| macOS | ✅ Supported | May need accessibility permissions |
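To see which row of the table applies to your machine, you can inspect the session environment. The sketch below assumes the common convention that Linux desktops set `XDG_SESSION_TYPE` and that Wayland compositors set `WAYLAND_DISPLAY`. `detect_session` is a hypothetical helper, not part of SnapVision.

```python
import os
import sys


def detect_session(environ=None, platform_name=None):
    """Classify the current desktop session (hypothetical helper).

    Returns one of: "windows", "macos", "linux-wayland", "linux-x11", "unknown".
    """
    environ = os.environ if environ is None else environ
    platform_name = sys.platform if platform_name is None else platform_name
    if platform_name.startswith("win"):
        return "windows"
    if platform_name == "darwin":
        return "macos"
    if not platform_name.startswith("linux"):
        return "unknown"
    session = environ.get("XDG_SESSION_TYPE", "").lower()
    # Check Wayland first: under XWayland both WAYLAND_DISPLAY and DISPLAY are set.
    if session == "wayland" or environ.get("WAYLAND_DISPLAY"):
        return "linux-wayland"
    if session == "x11" or environ.get("DISPLAY"):
        return "linux-x11"
    return "unknown"
```

A result of `"linux-wayland"` corresponds to the Limited row, where global hotkeys may not work.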
⚠️ Known Limitations
- API Keys Required: You need your own API keys for Google Vision and Groq/OpenAI.
- Internet Required: OCR and LLM processing are API calls, so captures cannot be analyzed offline.
- Wayland: On Linux with Wayland, use XWayland for best results.
📄 License
MIT License - see LICENSE file for details.
🤝 Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.
File details
Details for the file snapvision-0.1.0.tar.gz.
File metadata
- Download URL: snapvision-0.1.0.tar.gz
- Upload date:
- Size: 32.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e1986a8f9a9ad214cf911e45e29506153f1b481d7d2385fab157f626e0944eef |
| MD5 | 9a79f2271cf5c872f07313d3945459d5 |
| BLAKE2b-256 | 94618653dffcc8032ea966ac0952f153e94df22ede1189ce97739187a78c1410 |
File details
Details for the file snapvision-0.1.0-py3-none-any.whl.
File metadata
- Download URL: snapvision-0.1.0-py3-none-any.whl
- Upload date:
- Size: 35.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 113749bca8654e3b88ac63cd6476acc57aaddc59113a48191ac3ddae506d7710 |
| MD5 | d582c6479809e89636a15a4fd9dd2245 |
| BLAKE2b-256 | 3b804afe5be2e9df4fffd83be206fd10265f7c2fce3d435a6d4f9df267d5db9d |
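If you download a file manually, you can compare it against the published SHA256 digest with Python's standard `hashlib` module. This is a minimal sketch; `file_digest` and `matches_published_hash` are hypothetical helper names, and the local file path is whatever you saved the download as.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so the whole download is never held in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex, algorithm="sha256"):
    """Return True when the file's digest equals the published hex digest."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `matches_published_hash("snapvision-0.1.0.tar.gz", "e1986a8f9a9ad214cf911e45e29506153f1b481d7d2385fab157f626e0944eef")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.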