AI browser agent that costs 50x less. DOM + OCR + Memory. Works with any LLM.
Project description
Ghost
AI browser agent that controls your computer.
Type a task. Ghost does it. DOM + OCR + Memory.
Install & Run
pip install ghostagent
export OPENROUTER_API_KEY="your-key-here"
ghost
╔══════════════════════════════════════════╗
║ 👻 Ghost v0.2.0 ║
║ AI browser agent. DOM + OCR + Memory. ║
║ Type a task. Ghost does it. ║
╚══════════════════════════════════════════╝
Permissions
◉ Browser Control Open, navigate, click, type in Chrome
◉ Screen Reading Capture screenshots, read text via OCR
◉ File System Read/write files, manage downloads
◉ Clipboard Read/write system clipboard
◉ App Management Open, close, switch applications
◉ Keyboard/Mouse Click, type, scroll, hotkeys
Grant all permissions? [y/n]: y
✓ All permissions granted.
✓ Ghost is ready.
ghost> Sign into Upwork with Google using rohit@gmail.com
Step 1: NAVIGATE upwork.com/login
Step 2: CLICK "Continue with Google"
Step 3: CLICK "rohit@gmail.com"
✓ Signed into Upwork
ghost> Go to HackerNews and get the top 5 stories
✓ 1. Why so many control rooms were seafoam green
2. Show HN: I built a search engine for RSS feeds
3. ...
ghost> Convert my PDF at ~/Downloads/report.pdf to DOCX
Step 1: NAVIGATE cloudconvert.com
Step 2: CLICK "Select File"
[DIALOG] File picker detected → Downloads → report.pdf → Open
Step 3: CLICK "Convert"
Step 4: CLICK "Download"
✓ Saved report.docx to Downloads
ghost> /quit
👻 Ghost vanishes...
What is Ghost?
Ghost is a terminal agent — like Claude Code but for browser tasks. You type what you want, Ghost opens Chrome and does it on your real system with your real cookies, logins, and sessions.
No sandboxed browser. No Playwright. No Selenium. Your actual Chrome.
Why Ghost?
Every other browser agent sends screenshots to a vision model for every action. Ghost reads the DOM as text — 50x cheaper, faster, more accurate.
How It Works
Them: Screenshot → Vision LLM ($$$) → guess pixel coordinates → click
Ghost: Read DOM as text → LLM picks element ID → exact click
+ OCR catches popups, dialogs, overlays
+ Memory replays known tasks at zero cost
Three perception layers, cheapest first
Commands
| Command | What it does |
|---|---|
ghost |
Start Ghost |
/help |
Show all commands |
/model [name] |
Switch LLM (e.g., /model openai/gpt-4o) |
/memory |
Show what Ghost remembers |
/tasks |
Show completed tasks |
/tabs |
List open browser tabs |
/screenshot |
Save current screen |
/quit |
Exit |
| [anything else] | Ghost executes it as a task |
Use any LLM
Switch models on the fly inside Ghost:
ghost> /model anthropic/claude-sonnet-4-6
✓ Switched to claude-sonnet-4-6
ghost> /model openai/gpt-4.6
✓ Switched to gpt-4.6
ghost> /model google/gemini-3.1-pro
✓ Switched to gemini-3.1-pro
ghost> /model google/gemini-2.5-flash
✓ Switched to gemini-2.5-flash (cheapest)
Works with every model on OpenRouter.
What Ghost can do
Data extraction
ghost> Go to Y Combinator's top companies and get the top 10 names
Form filling
ghost> Go to example.com/signup and fill: name=Rohit, email=rohit@example.com
Authenticated workflows (uses your real Chrome cookies)
ghost> Go to my Gmail and find the latest email from Amazon
File uploads & downloads
ghost> Upload ~/Downloads/report.pdf to ilovepdf.com, convert to DOCX, download it
Google sign-in flows
ghost> Sign into Upwork with Google using rohit@gmail.com
Multi-step research
ghost> Search Google for "best AI agents 2026", get top 5 results, save to ~/research.txt
Memory
Ghost remembers across sessions. Ask it something once, it never forgets.
ghost> /memory
┌─────────────────────────────────────────┐
│ MEMORY.md │
│ │
│ - Upwork login uses Google OAuth │
│ - Safari icon is 5th in dock │
│ - User prefers Claude Sonnet 4.6 │
└─────────────────────────────────────────┘
Task Replay: First time costs $0.003. Second time costs $0.000.
Benchmarks
| Agent | Approach | Cost/task | Accuracy | Memory |
|---|---|---|---|---|
| Claude Computer Use | Screenshots → VLM | $0.10-5.00 | ~85% | No |
| OpenAI Operator | Screenshots → VLM | $0.10-5.00 | ~85% | No |
| Browser Use | Playwright + LLM | $0.01-0.05 | ~89% | No |
| Ghost | DOM + OCR + text LLM | $0.003 | ~99% | Yes |
Permissions
When Ghost starts, it asks for permission to:
| Permission | Why Ghost needs it |
|---|---|
| Browser Control | Navigate, click, type in Chrome via DevTools Protocol |
| Screen Reading | OCR for popups, dialogs, and overlays outside the DOM |
| File System | Read/write files, handle downloads |
| Clipboard | Copy/paste data between apps |
| App Management | Open/close/fullscreen Chrome |
| Keyboard/Mouse | Physical input when DOM clicks aren't enough |
Ghost runs on your machine. Nothing is sent to any server except LLM API calls (your chosen provider).
Fresh Mac Setup (step by step)
1. Install Python (if you don't have it)
# Check if Python is installed
python3 --version
# If not, install via Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install python
2. Install Chrome (if you don't have it)
Download from google.com/chrome and sign into your Google account.
3. Install Ghost
pip install ghostagent
4. Run Ghost
ghost
5. First-time setup (Ghost walks you through it)
╔══════════════════════════════════════════╗
║ 👻 Ghost v0.2.0 ║
╚══════════════════════════════════════════╝
First-time Setup
Ghost needs an API key to talk to AI models.
Get a free key at https://openrouter.ai
Enter your OpenRouter API key: sk-or-v1-xxxxx
✓ API key saved. You won't need to enter this again.
Permissions
◉ Browser Control
◉ Screen Reading
◉ File System
◉ Clipboard
◉ App Management
◉ Keyboard/Mouse
Grant all permissions? [y/n]: y
✓ Ghost is ready.
ghost> _
6. macOS permissions (one-time)
Ghost needs Accessibility permission to control mouse/keyboard:
- Go to System Settings → Privacy & Security → Accessibility
- Add your terminal app (Terminal, iTerm2, or VS Code)
- Toggle it ON
That's it. Ghost auto-syncs your Chrome profile (cookies, logins, everything).
Settings
Ghost saves settings to ~/.ghost/config.json. Edit anytime:
ghost> /config
API Key sk-or-v1...fc283
Model anthropic/claude-sonnet-4
Provider openrouter
Config ~/.ghost/config.json
ghost> /config key sk-or-v1-your-new-key
✓ API key updated
ghost> /config model openai/gpt-4o
✓ Model updated
ghost> /model google/gemini-2.5-flash
✓ Switched to gemini-2.5-flash (saved)
Requirements
- Python 3.10+
- Google Chrome (signed in with your account)
- OpenRouter API key — free to start, access to every AI model
Install from source
git clone https://github.com/rohitmenonhart-xhunter/Ghost.git
cd Ghost
pip install -e .
ghost
License
Apache 2.0
Built by Rohit Menon
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ghostagent-0.3.0.tar.gz.
File metadata
- Download URL: ghostagent-0.3.0.tar.gz
- Upload date:
- Size: 81.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
96ad21a8ed8e554db376ff77f68b017801a26ec783cccd776fdb9a651d2fa74d
|
|
| MD5 |
1d691ba3884accbe9244a75fa8774e74
|
|
| BLAKE2b-256 |
cbb55271526728e40f628a8e751ff6ce736e2a92c640c7a4200f495072cab1df
|
File details
Details for the file ghostagent-0.3.0-py3-none-any.whl.
File metadata
- Download URL: ghostagent-0.3.0-py3-none-any.whl
- Upload date:
- Size: 89.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
48c84d92a83d3459f0d8608cf6765cb0e619a320bf97f34ff502ad87f0553f33
|
|
| MD5 |
cca5d78af517d5d96039acedf77de057
|
|
| BLAKE2b-256 |
cac8afdd7eb9651577190722c1ba47a7900fb7d70dc466fccb76ed9f28eee9e8
|