Eye2byte

Screen-context sidecar for coding agents

Captures your screen, voice, and annotations, feeds them to any vision model, and produces structured Context Packs your coding agent can act on.

Screen / Voice / Annotations  -->  Vision Model + Whisper  -->  Context Pack  -->  Coding Agent

Features

Multi-monitor capture — active, specific (1/2/3), or all monitors at once
Voice narration — record, clean (noise removal + normalization), transcribe locally
Annotations — arrows, circles, rectangles, freehand, multi-line text on a frozen screenshot
Screen clips — record short videos, extract keyframes, analyze the sequence
Image optimization — auto resize + compress (~5x smaller; JPEG is lossy, so fidelity depends on image_quality)
MCP server — coding agents query your screen directly via Model Context Protocol
Context Packs — structured output: goal, environment, errors, signals, next steps

Platforms

Platform	Screenshot	Voice	Annotation	Hotkeys
Windows	PowerShell .NET	ffmpeg	Pillow	Ctrl+Shift+1-5
macOS	screencapture	ffmpeg	Pillow	-
Linux	scrot/maim/flameshot	ffmpeg	Pillow	-
Android	ADB (Termux)	Termux:API	-	-
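
The Screenshot column above can be read as a lookup from platform to capture backend. A minimal sketch of that lookup follows. The helper name `pick_screenshot_tool` is hypothetical (it is not part of Eye2byte), the order in which the Linux tools are tried is an assumption taken from the table, and Android/Termux is not covered:

```python
import shutil
import sys
from typing import Callable, Optional

# Linux candidates, in the order the table lists them (the order is an assumption).
LINUX_TOOLS = ("scrot", "maim", "flameshot")


def pick_screenshot_tool(
    platform: str = sys.platform,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the name of a screenshot backend for this platform, or None."""
    if platform.startswith("win"):
        return "powershell"      # PowerShell .NET capture on Windows
    if platform == "darwin":
        return "screencapture"   # built-in macOS tool
    if platform.startswith("linux"):
        # Use the first candidate that is actually installed.
        for tool in LINUX_TOOLS:
            if which(tool):
                return tool
    return None
```

Passing `which` as a parameter keeps the function testable without any of the tools installed.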

Setup

1. Install dependencies

pip install Pillow fastmcp       # Core + MCP server
pip install openai-whisper       # Local voice transcription (optional)
# ffmpeg is required for voice/clips — install via your package manager
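
Before the first run it can help to confirm that the packages above are importable and that ffmpeg is on your PATH. A small sketch, assuming the usual import names (`PIL` for Pillow, `whisper` for openai-whisper); `check_dependencies` is a hypothetical helper, not an Eye2byte command:

```python
import importlib.util
import shutil
from typing import Dict


def check_dependencies() -> Dict[str, bool]:
    """Report which dependencies are importable or available on PATH."""
    modules = {
        "Pillow": "PIL",              # pip install Pillow
        "fastmcp": "fastmcp",         # pip install fastmcp
        "openai-whisper": "whisper",  # optional, local transcription
    }
    status = {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }
    # ffmpeg is an external binary, so look it up on PATH instead.
    status["ffmpeg"] = shutil.which("ffmpeg") is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'missing'}")
```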

2. Configure a vision provider

Eye2byte works with any vision model — local or cloud. Set your provider in ~/.eye2byte/config.json or the Settings UI:

Provider	Setup	Cost
Ollama (local)	Install Ollama, `ollama pull qwen3-vl:8b`	Free
Gemini	Set `GEMINI_API_KEY` in `.env`	Free tier (1000 req/day)
OpenRouter	Set `OPENROUTER_API_KEY` in `.env`	Free models available
Hyperbolic	Set `HYPERBOLIC_API_KEY` in `.env`	Pay per use

# .env file (project dir, cwd, or ~/.eye2byte/.env)
GEMINI_API_KEY=your-key-here
# or OPENROUTER_API_KEY=...
# or HYPERBOLIC_API_KEY=...
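
To illustrate how a key might be resolved from these files, here is a rough sketch. It is not Eye2byte's loader: `parse_env_text` and `find_api_key` are hypothetical names, the precedence (process environment first, then each file in order) is an assumption, and the project directory is left out because its location is not stated above:

```python
import os
from pathlib import Path
from typing import Dict, Iterable, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(name: str, candidates: Iterable[Path]) -> Optional[str]:
    """Return the key from the environment, else from the first file defining it."""
    if os.environ.get(name):
        return os.environ[name]
    for path in candidates:
        if path.is_file():
            value = parse_env_text(path.read_text(encoding="utf-8")).get(name)
            if value:
                return value
    return None


# Two of the locations named in the comment above; the order is an assumption.
DEFAULT_CANDIDATES = [
    Path.cwd() / ".env",
    Path(os.path.expanduser("~/.eye2byte/.env")),
]
```

Calling `find_api_key("GEMINI_API_KEY", DEFAULT_CANDIDATES)` returns `None` when no source defines the key, which makes a missing key easy to report.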

3. Run

python eye2byte.py capture              # Screenshot + analysis
python eye2byte.py capture --voice      # + voice narration
python eye2byte.py capture --mode window # Active window only
python eye2byte_ui.py                    # Launch control panel

Control Panel

python eye2byte_ui.py

A small always-on-top floating panel that you can drag anywhere. On Windows, global hotkeys work even when the panel isn't focused.

Global Hotkeys (Windows)

These work system-wide — no need to focus the Eye2byte window:

Hotkey	Action	Notes
`Ctrl+Shift+1`	Capture screenshot	Uses current mode (Full/Window/Region)
`Ctrl+Shift+2`	Annotate	Freezes screen, opens drawing overlay
`Ctrl+Shift+3`	Toggle voice recording	Press once to start, again to stop
`Ctrl+Shift+5`	Grab clipboard image	Analyzes whatever image is on your clipboard

All keyboard shortcuts are customizable from Settings > Keyboard Shortcuts.

Panel Controls

Control	Action
`Space` (hold)	Push-to-talk — hold to record, release to stop
Mode selector	Cycle between Full Screen / Window / Region
Settings	Configure provider, model, image quality, cleanup
Copy @path	Copy session path to clipboard for `@`-mentioning

Annotation Overlay

When you press Ctrl+Shift+2 or click Annotate, the screen freezes and you can draw on it:

Key	Tool	How to use
`X`	Arrow	Click and drag to draw an arrow
`C`	Circle	Click and drag to draw an ellipse
`V`	Rectangle	Click and drag to draw a box
`B`	Freehand	Click and drag to draw freely
`T`	Text	Click to place, type your text

Action	How
Save	`Enter` (commits annotations and sends to vision model)
Cancel	`Escape` (discards all annotations)
Undo	Right-click near an annotation to remove it
Newline in text	`Shift+Enter` (Enter alone commits the text)
Multi-line text	Text box auto-grows up to 6 lines

Voice Recording

Three ways to record voice:

Toggle — Ctrl+Shift+3 starts recording, press again to stop
Push-to-talk — Hold Space while panel is focused
Mouse PTT — Hold click on the Record button

While recording, any captures you take are automatically bundled with the voice note into a single session.

MCP Server

Eye2byte exposes 6 tools via the Model Context Protocol, letting coding agents capture and analyze your screen directly.

Tool	Description
`capture_and_summarize`	Screenshot + vision analysis. Supports monitor selection, delay, window targeting
`capture_with_voice`	Screenshot + voice recording + transcription + analysis
`record_clip_and_summarize`	Screen clip with keyframe extraction and sequence analysis
`summarize_screenshot`	Analyze an existing image file
`transcribe_audio`	Local Whisper transcription of any audio file
`get_recent_context`	Retrieve recent Context Pack summaries

Local Setup (stdio)

Eye2byte runs on the machine whose screen you want to capture. For local agents like Claude Code on the same machine, use stdio transport:

Claude Code — add to your project's .mcp.json:

{
  "mcpServers": {
    "eye2byte": {
      "command": "python",
      "args": ["C:/path/to/eye2byte_mcp.py"]
    }
  }
}

That's it — Claude Code will auto-start the server. Use full absolute paths.
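
Since the path must be absolute, one option is to generate the entry rather than type it by hand. A minimal sketch, assuming you run it from the directory that contains eye2byte_mcp.py; `build_stdio_config` is a hypothetical helper, and using the current interpreter instead of plain "python" is a choice made here, not something the project requires:

```python
import json
import sys
from pathlib import Path


def build_stdio_config(server_script: str, python_exe: str = sys.executable) -> dict:
    """Build an .mcp.json structure that launches the server script over stdio."""
    script = Path(server_script).resolve()  # make the path absolute
    return {
        "mcpServers": {
            "eye2byte": {
                "command": python_exe,
                "args": [script.as_posix()],  # forward slashes, as in the example above
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into .mcp.json.
    print(json.dumps(build_stdio_config("eye2byte_mcp.py"), indent=2))
```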

Remote Setup (SSE)

When your coding agent runs on a different machine (cloud VM, SSH dev box, CI runner) but needs to see your local screen, use SSE transport:

Step 1 — On your local machine (the one with the screen):

# Install Eye2byte + dependencies
pip install Pillow fastmcp
pip install openai-whisper  # optional, for voice

# Start the SSE server
python eye2byte_mcp.py --sse                           # No auth (LAN only)
python eye2byte_mcp.py --sse --token mysecret123       # Bearer token auth
python eye2byte_mcp.py --sse --port 9000 --token abc   # Custom port + auth

The server stays running and accepts connections from any machine on your network. Use --token when the server is reachable beyond your trusted LAN.

Step 2 — On the remote machine (where the coding agent runs):

Nothing to install. Just configure the MCP client to point at your local IP:

{
  "mcpServers": {
    "eye2byte": {
      "url": "http://YOUR_LOCAL_IP:8808/sse",
      "headers": {"Authorization": "Bearer mysecret123"}
    }
  }
}

Omit the headers field if the server was started without --token.
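
The same client entry can be produced programmatically, which makes the optional header harder to get wrong. A short sketch; `build_sse_config` is a hypothetical helper, and the default port 8808 is taken from the example URL above:

```python
import json
from typing import Optional


def build_sse_config(host: str, port: int = 8808, token: Optional[str] = None) -> dict:
    """Build an MCP client entry that points at a remote Eye2byte SSE server."""
    entry = {"url": f"http://{host}:{port}/sse"}
    if token:
        # Only send the header when the server was started with --token.
        entry["headers"] = {"Authorization": f"Bearer {token}"}
    return {"mcpServers": {"eye2byte": entry}}


if __name__ == "__main__":
    print(json.dumps(build_sse_config("YOUR_LOCAL_IP", token="mysecret123"), indent=2))
```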

Find your local IP: ipconfig (Windows), ifconfig (macOS), or ip addr (Linux).

Firewall: You may need to allow inbound TCP on port 8808. On Windows, run as admin:

netsh advfirewall firewall add rule name="Eye2byte MCP" dir=in action=allow protocol=TCP localport=8808
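
To check from the remote machine whether the port is actually reachable, a plain TCP connection attempt is usually enough. A minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and a successful connection only shows that something is listening, not that it is Eye2byte:

```python
import socket


def port_is_open(host: str, port: int = 8808, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("YOUR_LOCAL_IP")` returning False while the server is running suggests a firewall rule or a wrong IP address.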

Multi-monitor Examples

capture_and_summarize(monitor=0)    # active monitor (default)
capture_and_summarize(monitor=1)    # first monitor
capture_and_summarize(monitor=2)    # second monitor
capture_and_summarize(monitor=-1)   # ALL monitors at once
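
The index convention above (0 for the active monitor, positive numbers for a specific monitor, -1 for all) can be summarized in a few lines. This is only an illustration of the convention: `resolve_monitors` is a hypothetical function, and how Eye2byte handles out-of-range values is not documented here, so raising an error is an assumption:

```python
from typing import List


def resolve_monitors(monitor: int, count: int, active: int = 1) -> List[int]:
    """Map a monitor argument to the 1-based monitor numbers to capture."""
    if monitor == -1:
        return list(range(1, count + 1))  # all monitors
    if monitor == 0:
        return [active]                   # whichever monitor is currently active
    if 1 <= monitor <= count:
        return [monitor]                  # one specific monitor
    raise ValueError(f"monitor must be -1, 0, or 1..{count}, got {monitor}")
```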

Context Pack Format

Every analysis produces a structured Context Pack:

## Goal         — what the user appears to be doing
## Environment  — OS, editor, repo, branch, language
## Screen State — visible panels, files, terminal output
## Signals      — verbatim errors, stack traces, warnings
## Likely Situation — what's probably happening
## Suggested Next Info — what a coding agent needs next
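
Because the sections are plain `##` headings, a consumer can split a Context Pack with very little code. A sketch, assuming each heading sits on its own line with the body on the lines that follow; `parse_context_pack` is a hypothetical helper, not an Eye2byte API:

```python
import re
from typing import Dict, List, Optional

# Matches a level-2 heading such as "## Goal" but not "### Sub-heading".
HEADING = re.compile(r"^##\s+(.*\S)\s*$")


def parse_context_pack(text: str) -> Dict[str, str]:
    """Split a Context Pack into {section title: body} using '## ' headings."""
    sections: Dict[str, str] = {}
    title: Optional[str] = None
    body: List[str] = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            if title is not None:
                sections[title] = "\n".join(body).strip()
            title, body = match.group(1), []
        elif title is not None:
            body.append(line)
    if title is not None:
        sections[title] = "\n".join(body).strip()
    return sections
```

An agent could then read `parse_context_pack(pack).get("Signals", "")` to get the verbatim errors without scanning the whole document.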

Configuration

Config: ~/.eye2byte/config.json (created on first run or via python eye2byte.py init)

Setting	Default	Description
`provider`	`"ollama"`	Vision provider: ollama, gemini, openrouter, hyperbolic
`model`	`"auto"`	Model name or "auto" for auto-detection
`voice_clean`	`true`	Noise removal + pause trimming + volume normalization
`auto_cleanup_days`	`7`	Delete old captures/summaries after N days (0=disabled)
`image_max_size`	`1920`	Max image dimension in pixels before LLM processing
`image_quality`	`90`	JPEG quality (1-100)
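
As a sketch of how these settings might be consumed, the snippet below merges a user config over the defaults from the table and computes a resize target from `image_max_size`. Both functions are hypothetical, and applying the limit to the longer side while preserving aspect ratio is an assumption about how the setting behaves:

```python
import json
from pathlib import Path
from typing import Tuple

# Defaults copied from the table above.
DEFAULTS = {
    "provider": "ollama",
    "model": "auto",
    "voice_clean": True,
    "auto_cleanup_days": 7,
    "image_max_size": 1920,
    "image_quality": 90,
}


def load_config(path: Path) -> dict:
    """Merge user settings over the defaults; a missing file yields the defaults."""
    config = dict(DEFAULTS)
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


def fit_within(width: int, height: int, max_size: int) -> Tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_size."""
    longest = max(width, height)
    if longest <= max_size:
        return width, height  # already small enough, leave untouched
    scale = max_size / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With the default limit, a 3840x2160 capture would be reduced to 1920x1080, while a 1280x720 capture would be left as is.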

Files

File	Purpose
`eye2byte.py`	Core engine — capture, voice, clip, summarize, watch
`eye2byte_ui.py`	Control panel with hotkeys and annotation overlay
`eye2byte_mcp.py`	MCP server for coding agent integration

License

MIT

Release history

Version	Released
0.7.0	Mar 4, 2026
0.4.0	Feb 27, 2026
0.3.1	Feb 27, 2026
0.3.0 (this version)	Feb 27, 2026

Download files

Source Distribution

eye2byte-0.3.0.tar.gz (52.7 kB)

Uploaded Feb 27, 2026 Source

Built Distribution

eye2byte-0.3.0-py3-none-any.whl (54.2 kB)

Uploaded Feb 27, 2026 Python 3

File details

Details for the file eye2byte-0.3.0.tar.gz.

File metadata

Download URL: eye2byte-0.3.0.tar.gz
Upload date: Feb 27, 2026
Size: 52.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for eye2byte-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`23ddd88027da89247a46ec99c38dcddb487715bf38a04a0169193f7c945beef4`
MD5	`1bcd2b598386020200a5f3d6031854ea`
BLAKE2b-256	`fb5e7e2a656218e12733320eca32bef37b170d8bae47c112a708cf78958f5089`
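
A downloaded file can be checked against the SHA256 digest listed above with the standard library. A minimal sketch; the helper names are hypothetical and the file path is whatever location you downloaded the archive to:

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For the source distribution, that would be `matches_sha256(Path("eye2byte-0.3.0.tar.gz"), "23ddd88027da89247a46ec99c38dcddb487715bf38a04a0169193f7c945beef4")`.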

File details

Details for the file eye2byte-0.3.0-py3-none-any.whl.

File metadata

Download URL: eye2byte-0.3.0-py3-none-any.whl
Upload date: Feb 27, 2026
Size: 54.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for eye2byte-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a984e0c9ce3f1371e223a4c497e2effc5963112b8c59ec0d07822e7f47b1ea9d`
MD5	`f58ddd48c90fe5f9aa3399cccd5806a0`
BLAKE2b-256	`31b3891975f3f161d9c7e7fb34dfb5ffc18c6edb8c7cf84b2625424d82ba1c52`
