Snap2Text

Screenshot OCR tool using Google Gemini AI - extracts text from screenshots directly to clipboard.

Description

Snap2Text is a lightweight command-line tool that allows you to capture any area of your screen and instantly extract text from it using Google's Gemini AI. The extracted text is automatically copied to your clipboard, ready to be pasted wherever you need it.

Features

  • Select screen area and get text in clipboard - Draw a selection rectangle around any text on your screen
  • Auto-detects available system tools - Works out of the box with whatever tools you have installed
  • Works with multiple screenshot tools - Supports maim, scrot, and gnome-screenshot
  • Multiple clipboard tools supported - Compatible with xclip, xsel, and wl-paste
  • Supports notification tools - Optional notifications via notify-send or kdialog
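The auto-detection described above can be pictured as a simple lookup on PATH. This is a minimal sketch, not Snap2Text's actual code: the function name `first_available` and the preference order are assumptions, and only the tool names come from the feature list.

```python
import shutil
from typing import Callable, Optional, Sequence

# Candidate tools, taken from the feature list; the order is an assumption.
SCREENSHOT_TOOLS = ("maim", "scrot", "gnome-screenshot")
CLIPBOARD_TOOLS = ("xclip", "xsel", "wl-paste")
NOTIFY_TOOLS = ("notify-send", "kdialog")


def first_available(
    candidates: Sequence[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("screenshot:", first_available(SCREENSHOT_TOOLS))
    print("clipboard: ", first_available(CLIPBOARD_TOOLS))
    print("notify:    ", first_available(NOTIFY_TOOLS))
```

Running the sketch on its own prints which tool of each kind would be picked on the current machine, which is a quick way to check that the requirements are met.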

Model Information

This package uses gemini-2.5-flash-lite as its hardcoded model. It was chosen for its balance of speed, cost-efficiency, and OCR accuracy on screenshot text extraction.

Changing the Model (Advanced Users)

To use a different Gemini model, edit the installed source code. Reinstalling or upgrading the package will overwrite this change:

# Locate the package installation
python -c "import snap2text; print(snap2text.__file__)"

# Edit the __main__.py file and change the MODEL variable
# Default: MODEL = "gemini-2.5-flash-lite"
# Alternatives: "gemini-2.5-flash", "gemini-2.0-flash-lite", etc.

Available models can be found at: https://ai.google.dev/gemini-api/docs/models/gemini
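To jump straight to the file that holds the MODEL variable, the lookup can also be done from Python. This is a small sketch under the assumption stated above that the variable lives in the package's `__main__.py`; the helper name `main_module_path` is hypothetical.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def main_module_path(package: str) -> Optional[Path]:
    """Return the path of <package>/__main__.py, or None if it cannot be found."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or a plain module rather than a package
    for location in spec.submodule_search_locations:
        candidate = Path(location) / "__main__.py"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Prints None when snap2text is not installed in this environment.
    print(main_module_path("snap2text"))
```

Unlike importing the package, `find_spec` only locates it, so nothing in Snap2Text is executed by this check.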

Requirements

  • Python 3.8+
  • At least one screenshot tool: maim, scrot, or gnome-screenshot
  • At least one clipboard tool: xclip, xsel, or wl-paste (provided by the wl-clipboard package)
  • Optional: notify-send or kdialog for notifications

API Key Setup

This package is designed to work with Google AI Studio API keys.

Free Tier Limits

  • 250 requests per day
  • 10 requests per minute
  • These limits are more than adequate for typical screenshot OCR use
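For scripted or repeated use, the per-minute limit is the one most easily hit: at 10 requests per minute, the 250 daily requests would be used up after 25 minutes of continuous calls. The following is a minimal client-side throttle, assuming the limits listed above. It is not part of Snap2Text, and the class name `MinuteRateLimiter` is hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteRateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(
        self,
        limit: int = 10,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])

    def record(self) -> None:
        """Register that a request was just sent."""
        self._stamps.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, send the request, and then call `record()`. The sketch only tracks the per-minute window; the daily quota would need a separate persistent counter.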

Getting Your API Key

  1. Visit Google AI Studio API Keys
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the generated key

Installation

pip install snap2text

Configuration

Set your Google AI Studio API key using one of the following methods:

Method 1: Environment Variable

export MY_GEMINI_API_KEY="your-api-key-here"

Add this line to your ~/.bashrc, ~/.zshrc, or equivalent shell configuration file to make it persistent.

Method 2: Secrets File

Create ~/.secrets/MY_GEMINI_API_KEY containing only your API key:

mkdir -p ~/.secrets
echo "your-api-key-here" > ~/.secrets/MY_GEMINI_API_KEY
chmod 600 ~/.secrets/MY_GEMINI_API_KEY
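The two methods can be combined into a single lookup. The sketch below assumes the environment variable takes precedence over the secrets file; this README does not state the order Snap2Text actually uses, and the function name `load_api_key` is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

KEY_NAME = "MY_GEMINI_API_KEY"


def load_api_key(
    env: Mapping[str, str] = os.environ,
    secrets_dir: Optional[Path] = None,
) -> Optional[str]:
    """Return the API key from the environment, else from the secrets file."""
    value = env.get(KEY_NAME, "").strip()
    if value:
        return value
    if secrets_dir is None:
        secrets_dir = Path.home() / ".secrets"
    secret_file = secrets_dir / KEY_NAME
    if secret_file.is_file():
        # `echo` appends a trailing newline, so strip surrounding whitespace.
        value = secret_file.read_text(encoding="utf-8").strip()
        return value or None
    return None
```

Returning None instead of raising lets the caller decide how to report a missing key, for example with a desktop notification.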

Usage

Command Line

Run the command directly:

snap2text

Then select the area of your screen containing the text you want to extract. The text will be automatically copied to your clipboard.
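The flow described above (select an area, send the image to Gemini, copy the result) can be sketched with the standard library alone. This is an illustration under stated assumptions, not Snap2Text's implementation: it calls the public Gemini REST endpoint directly, whereas the package may use Google's SDK instead. The prompt text and all function names are hypothetical, and xclip is hardcoded as the clipboard tool for brevity.

```python
import base64
import json
import subprocess
import tempfile
import urllib.request
from pathlib import Path
from typing import Any, Dict, List

MODEL = "gemini-2.5-flash-lite"  # model named in this README
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)
PROMPT = "Extract all text from this image. Return only the text."  # hypothetical


def screenshot_command(tool: str, path: str) -> List[str]:
    """Build the argv for an interactive area selection saved to `path`."""
    commands = {
        "maim": ["maim", "-s", path],
        "scrot": ["scrot", "-s", path],
        "gnome-screenshot": ["gnome-screenshot", "-a", "-f", path],
    }
    return commands[tool]


def build_request(png: bytes, prompt: str = PROMPT) -> Dict[str, Any]:
    """Build a generateContent body with the image inlined as base64."""
    return {
        "contents": [{
            "parts": [
                {"inline_data": {
                    "mime_type": "image/png",
                    "data": base64.b64encode(png).decode("ascii"),
                }},
                {"text": prompt},
            ]
        }]
    }


def extract_text(response: Dict[str, Any]) -> str:
    """Join the text parts of the first candidate in a response."""
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts).strip()


def snap_to_clipboard(api_key: str, tool: str = "maim") -> str:
    """Capture an area, OCR it, and copy the result with xclip."""
    with tempfile.TemporaryDirectory() as tmp:
        shot = Path(tmp) / "shot.png"
        subprocess.run(screenshot_command(tool, str(shot)), check=True)
        body = json.dumps(build_request(shot.read_bytes())).encode("utf-8")
    request = urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        text = extract_text(json.load(reply))
    subprocess.run(
        ["xclip", "-selection", "clipboard"],
        input=text.encode("utf-8"),
        check=True,
    )
    return text
```

The screenshot is written to a temporary directory that is removed as soon as the image has been read, so no capture is left on disk after the request is built.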

Keyboard Shortcut (Recommended)

The recommended way to use Snap2Text is by binding it to a keyboard shortcut:

  • Recommended shortcut: Ctrl+PrintScreen (a natural fit for a screenshot tool)
  • Alternative: Super+Shift+S or any combination you prefer

Setting up a keyboard shortcut:

Linux Mint / Cinnamon:

  1. Open System Settings → Keyboard → Shortcuts → Custom Shortcuts
  2. Click "Add custom shortcut"
  3. Name: "Snap2Text"
  4. Command: snap2text
  5. Assign your preferred key combination

Ubuntu / GNOME:

  1. Open Settings → Keyboard → Custom Shortcuts
  2. Click "+" to add new shortcut
  3. Name: "Snap2Text"
  4. Command: snap2text
  5. Set your preferred keybinding

KDE Plasma:

  1. Open System Settings → Shortcuts → Custom Shortcuts
  2. Edit → New → Global Shortcut → Command/URL
  3. Name: "Snap2Text"
  4. Command: snap2text
  5. Assign your preferred trigger

System Dependencies

Install the required screenshot and clipboard tools for your distribution:

Ubuntu / Debian

sudo apt install maim xclip

Arch Linux

sudo pacman -S maim xclip

Fedora

sudo dnf install maim xclip

Optional Dependencies

For desktop notifications when text is copied:

# Ubuntu / Debian
sudo apt install libnotify-bin

# Arch Linux
sudo pacman -S libnotify

# Fedora
sudo dnf install libnotify

For KDE users who prefer kdialog:

# Ubuntu / Debian
sudo apt install kdialog

# Arch Linux
sudo pacman -S kdialog

# Fedora
sudo dnf install kdialog
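For reference, both notification tools can be driven with a short command line. The sketch below shows one way to build it; the function names, the notification title, and the 5-second kdialog timeout are assumptions and not taken from Snap2Text.

```python
import subprocess
from typing import List


def notification_command(tool: str, title: str, message: str) -> List[str]:
    """Build the argv that shows `message` with the given notification tool."""
    if tool == "notify-send":
        return ["notify-send", title, message]
    if tool == "kdialog":
        # The trailing number is the popup timeout in seconds.
        return ["kdialog", "--title", title, "--passivepopup", message, "5"]
    raise ValueError(f"unsupported notification tool: {tool}")


def notify(tool: str, title: str, message: str) -> None:
    """Show the notification; a non-zero exit status is ignored."""
    subprocess.run(notification_command(tool, title, message), check=False)
```

For example, `notify("notify-send", "Snap2Text", "Text copied to clipboard")` shows a standard desktop notification on systems where libnotify is installed.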

Alternative Screenshot Tools

If you prefer other screenshot tools, Snap2Text will automatically detect and use them:

# Using scrot (Ubuntu/Debian)
sudo apt install scrot

# Using gnome-screenshot (Ubuntu/Debian)
sudo apt install gnome-screenshot

Alternative Clipboard Tools

Similarly, you can use alternative clipboard tools:

# Using xsel instead of xclip (Ubuntu/Debian)
sudo apt install xsel

# For Wayland users (Ubuntu/Debian)
sudo apt install wl-clipboard
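Each clipboard tool reads the text to copy from standard input. The sketch below shows the corresponding commands and one possible way to prefer the tool that matches the session type. It is an illustration, not Snap2Text's code: the names are hypothetical, and the use of `XDG_SESSION_TYPE` is an assumption. Note that wl-clipboard ships two commands, and `wl-copy` is the one that writes to the clipboard.

```python
import os
import subprocess
from typing import Dict, List, Mapping

# Commands that copy standard input to the clipboard.
COPY_COMMANDS: Dict[str, List[str]] = {
    "xclip": ["xclip", "-selection", "clipboard"],
    "xsel": ["xsel", "--clipboard", "--input"],
    "wl-copy": ["wl-copy"],
}


def preferred_order(env: Mapping[str, str] = os.environ) -> List[str]:
    """Order the copy tools so the one matching the session type comes first."""
    if env.get("XDG_SESSION_TYPE", "").lower() == "wayland":
        return ["wl-copy", "xclip", "xsel"]
    return ["xclip", "xsel", "wl-copy"]


def copy_text(tool: str, text: str) -> None:
    """Pipe `text` into the chosen tool's standard input."""
    subprocess.run(COPY_COMMANDS[tool], input=text.encode("utf-8"), check=True)
```

On an X11 session this tries xclip first, while on Wayland it starts with `wl-copy`, falling back to the X11 tools if they are the only ones installed.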

Legal Disclaimer

This software is provided "AS IS" without warranty of any kind, express or implied.

By using Snap2Text, you acknowledge and agree to the following:

  • API Usage: You are solely responsible for your Google AI Studio API usage, including but not limited to:

    • Staying within free tier limits
    • Monitoring your API consumption
    • Any charges incurred from exceeding free tier limits or using paid plans
    • Compliance with Google's Terms of Service and API usage policies
  • No Liability: The authors and contributors of this package assume no responsibility for:

    • API misuse or abuse
    • Costs or charges from Google AI Studio
    • Data privacy or security issues related to API usage
    • Any damages arising from the use of this software
  • User Responsibility: It is your responsibility to:

    • Understand Google's AI Studio pricing and limits
    • Monitor your API usage
    • Secure your API key appropriately
    • Use the tool in compliance with all applicable laws and regulations

For the most current information about API limits and pricing, visit: https://ai.google.dev/pricing

General Disclaimer

The authors and contributors of this project are not responsible for any damage, data loss, or other issues that may arise from the use of this software. Use it at your own risk. This software is provided under the MIT License without warranty of any kind, express or implied. See the LICENSE file for full details.

⚠️ Beta Version Notice

This is a beta release (v0.2.0b1). It may contain incomplete implementations and significant undocumented bugs, so using it in a production or development environment is strongly discouraged. Features may change, break, or be removed without prior notice in future releases.
