Skip to main content

AI browser agent that costs 50x less. DOM + OCR + Memory. Works with any LLM.

Project description

Ghost

Ghost

AI browser agent that controls your computer.
Type a task. Ghost does it. DOM + OCR + Memory.

PyPI Stars License


Install & Run

pip install ghostagent
export OPENROUTER_API_KEY="your-key-here"
ghost
╔══════════════════════════════════════════╗
║  👻 Ghost v0.2.0                         ║
║  AI browser agent. DOM + OCR + Memory.   ║
║  Type a task. Ghost does it.             ║
╚══════════════════════════════════════════╝

  Permissions
    ◉ Browser Control    Open, navigate, click, type in Chrome
    ◉ Screen Reading     Capture screenshots, read text via OCR
    ◉ File System        Read/write files, manage downloads
    ◉ Clipboard          Read/write system clipboard
    ◉ App Management     Open, close, switch applications
    ◉ Keyboard/Mouse     Click, type, scroll, hotkeys

  Grant all permissions? [y/n]: y
  ✓ All permissions granted.
  ✓ Ghost is ready.

  ghost> Sign into Upwork with Google using rohit@gmail.com
    Step 1: NAVIGATE upwork.com/login
    Step 2: CLICK "Continue with Google"
    Step 3: CLICK "rohit@gmail.com"
    ✓ Signed into Upwork

  ghost> Go to HackerNews and get the top 5 stories
    ✓ 1. Why so many control rooms were seafoam green
      2. Show HN: I built a search engine for RSS feeds
      3. ...

  ghost> Convert my PDF at ~/Downloads/report.pdf to DOCX
    Step 1: NAVIGATE cloudconvert.com
    Step 2: CLICK "Select File"
    [DIALOG] File picker detected → Downloads → report.pdf → Open
    Step 3: CLICK "Convert"
    Step 4: CLICK "Download"
    ✓ Saved report.docx to Downloads

  ghost> /quit
  👻 Ghost vanishes...

What is Ghost?

Ghost is a terminal agent — like Claude Code but for browser tasks. You type what you want, Ghost opens Chrome and does it on your real system with your real cookies, logins, and sessions.

No sandboxed browser. No Playwright. No Selenium. Your actual Chrome.


Why Ghost?

Every other browser agent sends screenshots to a vision model for every action. Ghost reads the DOM as text — 50x cheaper, faster, more accurate.

Cost Comparison

Accuracy Comparison


How It Works

How Ghost Works

Them:   Screenshot → Vision LLM ($$$) → guess pixel coordinates → click

Ghost:  Read DOM as text → LLM picks element ID → exact click
        + OCR catches popups, dialogs, overlays
        + Memory replays known tasks at zero cost

Three perception layers, cheapest first

Perception Layers

Token Usage


Commands

Command What it does
ghost Start Ghost
/help Show all commands
/model [name] Switch LLM (e.g., /model openai/gpt-4o)
/memory Show what Ghost remembers
/tasks Show completed tasks
/tabs List open browser tabs
/screenshot Save current screen
/quit Exit
[anything else] Ghost executes it as a task

Use any LLM

Switch models on the fly inside Ghost:

ghost> /model anthropic/claude-sonnet-4-6
✓ Switched to claude-sonnet-4-6

ghost> /model openai/gpt-4.6
✓ Switched to gpt-4.6

ghost> /model google/gemini-3.1-pro
✓ Switched to gemini-3.1-pro

ghost> /model google/gemini-2.5-flash
✓ Switched to gemini-2.5-flash (cheapest)

Works with every model on OpenRouter.


What Ghost can do

Data extraction

ghost> Go to Y Combinator's top companies and get the top 10 names

Form filling

ghost> Go to example.com/signup and fill: name=Rohit, email=rohit@example.com

Authenticated workflows (uses your real Chrome cookies)

ghost> Go to my Gmail and find the latest email from Amazon

File uploads & downloads

ghost> Upload ~/Downloads/report.pdf to ilovepdf.com, convert to DOCX, download it

Google sign-in flows

ghost> Sign into Upwork with Google using rohit@gmail.com

Multi-step research

ghost> Search Google for "best AI agents 2026", get top 5 results, save to ~/research.txt

Memory

Ghost remembers across sessions. Ask it something once, it never forgets.

ghost> /memory
┌─────────────────────────────────────────┐
│ MEMORY.md                               │
│                                         │
│ - Upwork login uses Google OAuth        │
│ - Safari icon is 5th in dock            │
│ - User prefers Claude Sonnet 4.6        │
└─────────────────────────────────────────┘

Task Replay: First time costs $0.003. Second time costs $0.000.


Benchmarks

Benchmarks

Agent Approach Cost/task Accuracy Memory
Claude Computer Use Screenshots → VLM $0.10-5.00 ~85% No
OpenAI Operator Screenshots → VLM $0.10-5.00 ~85% No
Browser Use Playwright + LLM $0.01-0.05 ~89% No
Ghost DOM + OCR + text LLM $0.003 ~99% Yes

Permissions

When Ghost starts, it asks for permission to:

Permission Why Ghost needs it
Browser Control Navigate, click, type in Chrome via DevTools Protocol
Screen Reading OCR for popups, dialogs, and overlays outside the DOM
File System Read/write files, handle downloads
Clipboard Copy/paste data between apps
App Management Open/close/fullscreen Chrome
Keyboard/Mouse Physical input when DOM clicks aren't enough

Ghost runs on your machine. Nothing is sent to any server except LLM API calls (your chosen provider).


Fresh Mac Setup (step by step)

1. Install Python (if you don't have it)

# Check if Python is installed
python3 --version

# If not, install via Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install python

2. Install Chrome (if you don't have it)

Download from google.com/chrome and sign into your Google account.

3. Install Ghost

pip install ghostagent

4. Run Ghost

ghost

5. First-time setup (Ghost walks you through it)

╔══════════════════════════════════════════╗
║  👻 Ghost v0.2.0                         ║
╚══════════════════════════════════════════╝

  First-time Setup

  Ghost needs an API key to talk to AI models.
  Get a free key at https://openrouter.ai

  Enter your OpenRouter API key: sk-or-v1-xxxxx
  ✓ API key saved. You won't need to enter this again.

  Permissions
    ◉ Browser Control
    ◉ Screen Reading
    ◉ File System
    ◉ Clipboard
    ◉ App Management
    ◉ Keyboard/Mouse

  Grant all permissions? [y/n]: y
  ✓ Ghost is ready.

  ghost> _

6. macOS permissions (one-time)

Ghost needs Accessibility permission to control mouse/keyboard:

  1. Go to System Settings → Privacy & Security → Accessibility
  2. Add your terminal app (Terminal, iTerm2, or VS Code)
  3. Toggle it ON

That's it. Ghost auto-syncs your Chrome profile (cookies, logins, everything).


Settings

Ghost saves settings to ~/.ghost/config.json. Edit anytime:

ghost> /config
  API Key     sk-or-v1...fc283
  Model       anthropic/claude-sonnet-4
  Provider    openrouter
  Config      ~/.ghost/config.json

ghost> /config key sk-or-v1-your-new-key
✓ API key updated

ghost> /config model openai/gpt-4o
✓ Model updated

ghost> /model google/gemini-2.5-flash
✓ Switched to gemini-2.5-flash (saved)

Requirements

  • Python 3.10+
  • Google Chrome (signed in with your account)
  • OpenRouter API key — free to start, access to every AI model

Install from source

git clone https://github.com/rohitmenonhart-xhunter/Ghost.git
cd Ghost
pip install -e .
ghost

License

Apache 2.0


Built by Rohit Menon

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ghostagent-0.3.0.tar.gz (81.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ghostagent-0.3.0-py3-none-any.whl (89.9 kB view details)

Uploaded Python 3

File details

Details for the file ghostagent-0.3.0.tar.gz.

File metadata

  • Download URL: ghostagent-0.3.0.tar.gz
  • Upload date:
  • Size: 81.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for ghostagent-0.3.0.tar.gz
Algorithm Hash digest
SHA256 96ad21a8ed8e554db376ff77f68b017801a26ec783cccd776fdb9a651d2fa74d
MD5 1d691ba3884accbe9244a75fa8774e74
BLAKE2b-256 cbb55271526728e40f628a8e751ff6ce736e2a92c640c7a4200f495072cab1df

See more details on using hashes here.

File details

Details for the file ghostagent-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: ghostagent-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 89.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for ghostagent-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 48c84d92a83d3459f0d8608cf6765cb0e619a320bf97f34ff502ad87f0553f33
MD5 cca5d78af517d5d96039acedf77de057
BLAKE2b-256 cac8afdd7eb9651577190722c1ba47a7900fb7d70dc466fccb76ed9f28eee9e8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page