Screenshot OCR tool using Gemini AI API
Snap2Text
Screenshot OCR tool using Google Gemini AI - extracts text from screenshots directly to clipboard.
Description
Snap2Text is a lightweight command-line tool that allows you to capture any area of your screen and instantly extract text from it using Google's Gemini AI. The extracted text is automatically copied to your clipboard, ready to be pasted wherever you need it.
Features
- Select screen area and get text in clipboard - Draw a selection rectangle around any text on your screen
- Auto-detects available system tools - Works out of the box with whatever tools you have installed
- Works with multiple screenshot tools - Supports maim, scrot, and gnome-screenshot
- Multiple clipboard tools supported - Compatible with xclip, xsel, and wl-paste
- Supports notification tools - Optional notifications via notify-send or kdialog
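The auto-detection described above can be approximated with a PATH lookup. This is a minimal sketch, not Snap2Text's actual code: the tool names come from the feature list, while the function name `first_available`, the constant names, and the preference order are assumptions made for illustration.

```python
import shutil
from typing import Callable, Optional, Sequence

# Tool names are taken from the feature list; the ordering is an assumption.
SCREENSHOT_TOOLS = ("maim", "scrot", "gnome-screenshot")
CLIPBOARD_TOOLS = ("xclip", "xsel", "wl-paste")
NOTIFY_TOOLS = ("notify-send", "kdialog")


def first_available(
    candidates: Sequence[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("screenshot tool:", first_available(SCREENSHOT_TOOLS))
    print("clipboard tool: ", first_available(CLIPBOARD_TOOLS))
    print("notifier:       ", first_available(NOTIFY_TOOLS))
```

Running the script prints which tool of each kind would be picked on your machine, or `None` when nothing suitable is installed.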
Model Information
This package uses gemini-2.5-flash-lite as the hardcoded model. It was chosen for its balance of speed, cost-efficiency, and OCR accuracy on screenshot text extraction.
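To make the model's role concrete, here is a sketch of what a single OCR request to the public Gemini REST endpoint could look like. Whether Snap2Text talks to the REST API directly or goes through an SDK is not stated here, so treat this as an illustration only; `build_ocr_request` and the prompt text are hypothetical.

```python
import base64
import json
from typing import Tuple

MODEL = "gemini-2.5-flash-lite"  # model name stated above
# Public generateContent endpoint pattern of the Gemini REST API.
API_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/{model}:generateContent"
)
# Hypothetical prompt; the prompt Snap2Text really sends is not documented here.
PROMPT = "Extract all text from this image. Return only the text."


def build_ocr_request(
    png_bytes: bytes, model: str = MODEL, prompt: str = PROMPT
) -> Tuple[str, str]:
    """Return (url, json_body) for one image-to-text generateContent call."""
    image_part = {
        "inline_data": {
            "mime_type": "image/png",
            # The REST API expects the image as base64 text.
            "data": base64.b64encode(png_bytes).decode("ascii"),
        }
    }
    body = {"contents": [{"parts": [image_part, {"text": prompt}]}]}
    return API_URL.format(model=model), json.dumps(body)
```

The body would be sent as a POST with `Content-Type: application/json` and the API key in the `x-goog-api-key` header. Nothing in this sketch touches the network.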
Changing the Model (Advanced Users)
To use a different Gemini model, edit the source code:
# Locate the package installation
python -c "import snap2text; print(snap2text.__file__)"
# Edit the __main__.py file and change the MODEL variable
# Default: MODEL = "gemini-2.5-flash-lite"
# Alternatives: "gemini-2.5-flash", "gemini-2.0-flash-lite", etc.
Available models can be found at: https://ai.google.dev/gemini-api/docs/models/gemini
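If you would rather locate the file without importing the package (importing runs its top-level code), the following sketch resolves the path through `importlib`. The helper name `package_main_path` is an assumption for illustration and is not part of Snap2Text.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_main_path(package: str) -> Optional[Path]:
    """Return the path of <package>/__main__.py, or None if it cannot be found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or not a package directory
    for location in spec.submodule_search_locations:
        candidate = Path(location) / "__main__.py"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Prints the file that holds the MODEL variable, or None if not installed.
    print(package_main_path("snap2text"))
```

Keep in mind that edits made inside the installed package are overwritten the next time the package is upgraded or reinstalled.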
Requirements
- Python 3.8+
- At least one screenshot tool: maim, scrot, or gnome-screenshot
- At least one clipboard tool: xclip, xsel, or wl-paste
- Optional: notify-send or kdialog for notifications
API Key Setup
This package is designed to work with Google AI Studio API keys.
Free Tier Limits
- 250 requests per day
- 10 requests per minute
- These limits are more than adequate for typical screenshot OCR use
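If you script around the tool and want to stay inside these quotas, a client-side limiter is one option. The sketch below is not part of Snap2Text. It uses a sliding window as a conservative approximation (how Google accounts for quota is not described here), and the class name and the injectable `clock` parameter are assumptions for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any `window` seconds."""

    def __init__(
        self,
        limit: int,
        window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits the budget, else False."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


# Limits quoted above: 10 requests per minute and 250 requests per day.
per_minute = SlidingWindowLimiter(limit=10, window=60.0)
per_day = SlidingWindowLimiter(limit=250, window=86_400.0)
```

The state lives in memory only, so a command started fresh on every keypress would need to persist the timestamps somewhere (for example in a small file) for the limiter to have any effect.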
Getting Your API Key
- Visit Google AI Studio API Keys
- Sign in with your Google account
- Click "Create API Key"
- Copy the generated key
Installation
pip install snap2text
Configuration
Set your Google AI Studio API key using one of the following methods:
Method 1: Environment Variable
export MY_GEMINI_API_KEY="your-api-key-here"
Add this line to your ~/.bashrc, ~/.zshrc, or equivalent shell configuration file to make it persistent.
Method 2: Secrets File
Create ~/.secrets/MY_GEMINI_API_KEY containing only your API key:
mkdir -p ~/.secrets
echo "your-api-key-here" > ~/.secrets/MY_GEMINI_API_KEY
chmod 600 ~/.secrets/MY_GEMINI_API_KEY
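The two methods can be thought of as a simple lookup chain. The sketch below shows one way to resolve the key. Checking the environment variable before the secrets file is an assumption, since the precedence is not specified here, and `load_api_key` is a hypothetical helper rather than Snap2Text's own function.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

KEY_NAME = "MY_GEMINI_API_KEY"  # name used by both methods above


def load_api_key(
    env: Mapping[str, str] = os.environ,
    secrets_dir: Optional[Path] = None,
) -> Optional[str]:
    """Return the key from the environment, else the secrets file, else None."""
    value = env.get(KEY_NAME, "").strip()
    if value:
        return value
    if secrets_dir is None:
        secrets_dir = Path.home() / ".secrets"
    secret_file = secrets_dir / KEY_NAME
    if secret_file.is_file():
        # `echo` appends a newline, so strip surrounding whitespace.
        return secret_file.read_text(encoding="utf-8").strip() or None
    return None


if __name__ == "__main__":
    # Report only whether a key was found; never print the key itself.
    print("API key found" if load_api_key() else "API key missing")
```

Running it is a quick way to check that your configuration is picked up.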
Usage
Command Line
Run the command directly:
snap2text
Then select the area of your screen containing the text you want to extract. The text will be automatically copied to your clipboard.
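Under the hood, a run boils down to one capture command and one clipboard command. The sketch below shows plausible invocations for the supported tools. The flags are the standard region-selection and clipboard options of each program, but the exact commands Snap2Text runs are an assumption, and the function names are hypothetical. Note that in the wl-clipboard package `wl-copy` is the program that writes to the clipboard, so the sketch uses it for the Wayland case.

```python
import subprocess
from typing import Dict, List


def capture_command(tool: str, output_path: str) -> List[str]:
    """Build the interactive region-capture command for a screenshot tool."""
    commands: Dict[str, List[str]] = {
        "maim": ["maim", "-s", output_path],
        "scrot": ["scrot", "-s", output_path],
        "gnome-screenshot": ["gnome-screenshot", "-a", "-f", output_path],
    }
    if tool not in commands:
        raise ValueError(f"unsupported screenshot tool: {tool}")
    return commands[tool]


def clipboard_command(tool: str) -> List[str]:
    """Build the command that reads text on stdin and puts it on the clipboard."""
    commands: Dict[str, List[str]] = {
        "xclip": ["xclip", "-selection", "clipboard"],
        "xsel": ["xsel", "--clipboard", "--input"],
        "wl-copy": ["wl-copy"],
    }
    if tool not in commands:
        raise ValueError(f"unsupported clipboard tool: {tool}")
    return commands[tool]


def copy_to_clipboard(text: str, tool: str) -> None:
    """Pipe `text` into the chosen clipboard tool."""
    subprocess.run(clipboard_command(tool), input=text.encode("utf-8"), check=True)
```

For example, `subprocess.run(capture_command("maim", "/tmp/shot.png"), check=True)` would open the selection cursor and save the chosen region, after which the recognized text can be passed to `copy_to_clipboard`.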
Keyboard Shortcut (Recommended)
The recommended way to use Snap2Text is by binding it to a keyboard shortcut:
- Recommended shortcut: Ctrl+PrintScreen (most logical for screenshot tools)
- Alternative: Super+Shift+S or any combination you prefer
Setting up a keyboard shortcut:
Linux Mint / Cinnamon:
- Open System Settings → Keyboard → Shortcuts → Custom Shortcuts
- Click "Add custom shortcut"
- Name: "Snap2Text"
- Command: snap2text
- Assign your preferred key combination
Ubuntu / GNOME:
- Open Settings → Keyboard → Custom Shortcuts
- Click "+" to add new shortcut
- Name: "Snap2Text"
- Command: snap2text
- Set your preferred keybinding
KDE Plasma:
- Open System Settings → Shortcuts → Custom Shortcuts
- Edit → New → Global Shortcut → Command/URL
- Name: "Snap2Text"
- Command: snap2text
- Assign your preferred trigger
System Dependencies
Install the required screenshot and clipboard tools for your distribution:
Ubuntu / Debian
sudo apt install maim xclip
Arch Linux
sudo pacman -S maim xclip
Fedora
sudo dnf install maim xclip
Optional Dependencies
For desktop notifications when text is copied:
# Ubuntu / Debian
sudo apt install libnotify-bin
# Arch Linux
sudo pacman -S libnotify
# Fedora
sudo dnf install libnotify
For KDE users who prefer kdialog:
# Ubuntu / Debian
sudo apt install kdialog
# Arch Linux
sudo pacman -S kdialog
# Fedora
sudo dnf install kdialog
Alternative Screenshot Tools
If you prefer other screenshot tools, Snap2Text will automatically detect and use them:
# Using scrot (Ubuntu/Debian)
sudo apt install scrot
# Using gnome-screenshot (Ubuntu/Debian)
sudo apt install gnome-screenshot
Alternative Clipboard Tools
Similarly, you can use alternative clipboard tools:
# Using xsel instead of xclip (Ubuntu/Debian)
sudo apt install xsel
# For Wayland users (Ubuntu/Debian)
sudo apt install wl-clipboard
Legal Disclaimer
This software is provided "AS IS" without warranty of any kind, express or implied.
By using Snap2Text, you acknowledge and agree to the following:
- API Usage: You are solely responsible for your Google AI Studio API usage, including but not limited to:
  - Staying within free tier limits
  - Monitoring your API consumption
  - Any charges incurred from exceeding free tier limits or using paid plans
  - Compliance with Google's Terms of Service and API usage policies
- No Liability: The authors and contributors of this package assume no responsibility for:
  - API misuse or abuse
  - Costs or charges from Google AI Studio
  - Data privacy or security issues related to API usage
  - Any damages arising from the use of this software
- User Responsibility: It is your responsibility to:
  - Understand Google's AI Studio pricing and limits
  - Monitor your API usage
  - Secure your API key appropriately
  - Use the tool in compliance with all applicable laws and regulations
For the most current information about API limits and pricing, visit: https://ai.google.dev/pricing
General Disclaimer
The authors and contributors of this project are not responsible for any damage, data loss, or any other issues that may arise from the use of this software. Use it at your own risk. This software is provided under the MIT License without any warranty of any kind, express or implied. See the LICENSE file for full details.
⚠️ Beta Version Notice
This is a beta release (v0.2.0b1). This version may contain incomplete implementations and significant undocumented bugs. It is strongly recommended to avoid using this software in a real production or development environment. Features may change, break, or be removed without prior notice in future releases.