
The Phantom Developer — AI agent that tests your repo's onboarding docs


ghost.dev

The Phantom Developer
Every repo claims easy setup. ghost.dev proves them wrong — or right.


ghost.dev is an AI agent that simulates a brand-new developer onboarding onto any GitHub repository or local checkout from absolute zero. It reads the docs, follows setup instructions step-by-step inside a fresh Docker container, and logs every failure, ambiguity, and undocumented assumption as a friction event.

The ghost doesn't know your framework. It doesn't know your conventions. If the README doesn't say "install Node," it won't install Node. It's the most literal, naive developer possible — and that's the point.

ghost run https://github.com/fastapi/fastapi --no-docker
  ✓ fastapi/fastapi ready
  ✓ 23 docs found (python)
  ✓ 4 steps (depth=quick)

──────────── 👻 ghost.dev — doc analysis ────────────

 Repository      fastapi/fastapi
 Docs scanned    23
 Project type    python
 Steps found     4

How It Works

Scanner ──▶ Planner ──▶ Executor ──▶ Observer ──▶ Reporter
reads docs   AI gets    runs in     classifies    terminal +
& configs    steps      Docker      friction      HTML report
                │              │
                ▼              ▼
           Recoverer      Fix Gen
           AI fixes       suggests
           failures       doc edits

Six-stage pipeline. Each stage does one thing well:

  1. Scanner — Finds README, CONTRIBUTING, Makefile, package.json, CI workflows, env templates, version files. 23+ file patterns.
  2. Planner — AI extracts ordered setup steps from docs. Nothing else. Confidence scores, assumptions, prerequisites — all tracked.
  3. Executor — Runs each step in a bare ubuntu:24.04 Docker container. Sandboxed. Resource-limited. Timed.
  4. Observer — Classifies each result: success, failure, ambiguity, partial. AI-powered with heuristic fallback.
  5. Recoverer — On failure, AI diagnoses the error and attempts self-recovery. Like a real dev reading stack traces.
  6. Reporter — Friction score, letter grade, cost estimate, timeline, fix suggestions. Terminal + self-contained HTML.
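As a rough illustration of the Scanner stage above, the sketch below walks a checkout and collects files that match a set of onboarding patterns. The pattern list and the name `scan_docs` are hypothetical; the real scanner's 23+ patterns are not listed in this README.

```python
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical patterns for illustration only; not the project's actual list.
DOC_PATTERNS = [
    "README*",
    "CONTRIBUTING*",
    "Makefile",
    "package.json",
    ".github/workflows/*.yml",
    ".env.example",
    ".python-version",
    ".nvmrc",
]


def scan_docs(repo_root: Path) -> list[str]:
    """Return sorted repo-relative paths of files matching any pattern."""
    found = []
    for path in repo_root.rglob("*"):
        if not path.is_file():
            continue
        # Compare against the POSIX-style relative path so patterns with
        # directory parts (for example CI workflows) can match.
        rel = path.relative_to(repo_root).as_posix()
        if any(fnmatch(rel, pattern) for pattern in DOC_PATTERNS):
            found.append(rel)
    return sorted(found)
```

Calling `scan_docs(Path("./my-project"))` would return the matching files relative to the repo root, which is roughly the input a Planner stage would need.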

Quick Start

Zero-install — run it directly (like npx):

# Using uvx (recommended — fastest, no install)
uvx --from git+https://github.com/SujalXplores/ghost.dev.git ghost run https://github.com/user/repo

# Using pipx
pipx run --spec git+https://github.com/SujalXplores/ghost.dev.git ghost run https://github.com/user/repo

Or install it permanently:

pip install git+https://github.com/SujalXplores/ghost.dev.git
ghost run https://github.com/user/repo

First run prompts for an API key:

👻 No API key found. Let's set one up.

  › 1. Anthropic  (recommended)
    2. OpenRouter  (any model — Claude, GPT, Gemini, Llama)
    3. OpenAI      (GPT models)

Or run ghost setup anytime to add or change keys.

Usage

ghost run <repo_url_or_local_path> [options]

Flag                    Description
--depth [quick|full]    quick = build only, full = build + test + lint
--timeout <minutes>     Max time per step (default: 5)
--model <name>          AI model (default: claude-sonnet)
--output <path>         HTML report path
--no-docker             Doc analysis only; no execution
--json-output           Machine-readable JSON to stdout
--fail-threshold <n>    Exit code 2 if friction score > n (CI mode)
--no-cache              Bypass plan cache
--quiet                 Minimal output
--debug                 Debug logging
--verbose               Show container stdout/stderr

# Analyze any repo's docs without Docker
ghost run ./my-project --no-docker

# Full pipeline in Docker
ghost run https://github.com/user/repo --depth full

# CI gate — fail if friction is too high
ghost run . --json-output --fail-threshold 40

# Use any model via OpenRouter
ghost run <url> --model google/gemini-3.1-pro
ghost run <url> --model deepseek/deepseek-v3.2
ghost run <url> --model meta-llama/llama-4-maverick
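The CI gate example above can be wrapped in a small Python script. This is a sketch under stated assumptions: only the exit code 2 behavior is documented here, so treating 0 as a pass and every other code as a tool error is a guess. The helper names `run_gate` and `describe_exit` are hypothetical.

```python
import subprocess


def run_gate(repo: str, threshold: int) -> int:
    """Run ghost in CI mode and return its exit code."""
    cmd = [
        "ghost", "run", repo,
        "--json-output",
        "--fail-threshold", str(threshold),
    ]
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return completed.returncode


def describe_exit(code: int) -> str:
    """Map an exit code to a CI verdict."""
    if code == 0:
        return "pass"  # assumption: 0 means the run succeeded
    if code == 2:
        return "friction above threshold"  # documented behavior
    return "error"  # assumption: any other code is a tool failure
```

A pipeline step could call `describe_exit(run_gate(".", 40))` and fail the build on anything other than `"pass"`.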

Other Commands

ghost setup          # Configure API keys interactively
ghost clean          # Remove cached plans + orphaned containers
ghost --version      # Print version

Output

ghost.dev produces a friction report with:

  • Friction Score (0–100) and letter grade (A+ to F)
  • Execution timeline — every step with pass/fail, duration, exit code
  • Friction events — severity, root cause, category, suggested fix
  • Self-recovery log — what the ghost tried, what worked
  • Cost estimate — developer hours wasted × $100/hr × team size
  • Fix suggestions — concrete doc edits with before/after diffs
  • HTML report — self-contained, dark theme, print-ready, accessible
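The cost estimate in the list above is a simple product. A minimal sketch of that arithmetic follows; the function name `cost_estimate` and the input validation are assumptions, not the project's code.

```python
HOURLY_RATE_USD = 100  # rate stated in the report description


def cost_estimate(
    hours_wasted: float,
    team_size: int,
    hourly_rate: float = HOURLY_RATE_USD,
) -> float:
    """Developer hours wasted x hourly rate x team size, in USD."""
    if hours_wasted < 0 or team_size < 0:
        raise ValueError("hours_wasted and team_size must be non-negative")
    return hours_wasted * hourly_rate * team_size
```

For example, 1.5 wasted hours across a team of 4 gives `cost_estimate(1.5, 4)`, which is 600.0.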

A sample report from running against FastAPI is included in demo/fastapi-report.html.

The Ghost Philosophy

The agent behaves like a literal, naive first-day intern:

  • It ONLY knows what the docs tell it
  • It does NOT use prior knowledge about frameworks
  • If the README doesn't say "install Node," it won't install Node
  • If something fails, it attempts self-recovery (like a real dev Googling)
  • Every assumption the docs make is logged as implicit knowledge

Architecture

ghost/
├── cli.py              # Click CLI with Rich-styled help
├── config.py           # Config, API key management, cache
├── core/
│   ├── _ai.py          # Multi-provider AI (Anthropic → OpenRouter → OpenAI)
│   ├── _utils.py       # Shared JSON parsing, command safety validation
│   ├── scanner.py      # Repo doc scanner (23+ file patterns)
│   ├── planner.py      # AI step extraction with SQLite caching
│   ├── executor.py     # Docker execution engine with env injection
│   ├── observer.py     # Result classification (AI + heuristic fallback)
│   └── recoverer.py    # AI-powered self-recovery with safety checks
├── docker/
│   ├── container.py    # Container lifecycle, exec, resource limits
│   └── Dockerfile.ghost
├── models/
│   ├── step.py         # SetupStep, PlanResult (Pydantic)
│   ├── friction.py     # FrictionEvent model
│   └── report.py       # GhostReport with scoring, grading, cost estimation
├── reporter/
│   ├── terminal.py     # Rich terminal UI — panels, tables, progress
│   ├── html.py         # Jinja2 HTML report generator
│   └── templates/
│       └── report.html # Self-contained dark-theme HTML template
└── fixgen/
    └── suggestions.py  # AI-generated documentation fix suggestions

Engineering Highlights

Multi-provider AI with automatic fallback

Anthropic → OpenRouter → OpenAI. If one fails, the next picks up. OpenRouter gives access to any model (Claude, GPT, Gemini, Llama, DeepSeek) through a single key.
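The fallback chain described above could look roughly like this. It is a sketch, not the contents of `_ai.py`: providers are modeled as plain callables, and `complete_with_fallback` and `AllProvidersFailed` are hypothetical names.

```python
from typing import Callable

# A provider is a (name, callable) pair; the callable takes a prompt
# and returns the model's text, raising on any failure.
Provider = tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider_name, response_text)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Ordering the list as Anthropic, OpenRouter, OpenAI would reproduce the priority stated above, and collecting the errors keeps the final failure message useful for debugging.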

Security-first execution

  • Docker containers are sandboxed with 2GB RAM / 2 CPU limits
  • AI-generated recovery commands are validated against a dangerous-pattern blocklist before execution
  • No curl | bash, no rm -rf /, no fork bombs — even inside the container
  • shlex.quote for all shell arguments
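To make the blocklist idea above concrete, here is a small sketch of pattern-based command validation plus argument quoting. The three regular expressions are illustrative assumptions that cover only the examples named in the list; the project's actual blocklist is not shown in this README. `is_safe` and `quoted_cd` are hypothetical names.

```python
import re
import shlex

# Illustrative patterns only; a real blocklist would need to be broader.
DANGEROUS_PATTERNS = [
    # Piping a download straight into a shell, e.g. "curl ... | bash"
    re.compile(r"\b(curl|wget)\b.*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    # Recursive force-delete of the filesystem root, e.g. "rm -rf /"
    re.compile(r"\brm\s+-(rf|fr)\s+/(\s|\*|$)"),
    # Classic shell fork bomb, e.g. ":(){ :|:& };:"
    re.compile(r":\(\)\s*\{.*\};\s*:"),
]


def is_safe(command: str) -> bool:
    """Return False if the command matches any dangerous pattern."""
    return not any(p.search(command) for p in DANGEROUS_PATTERNS)


def quoted_cd(path: str) -> str:
    """Build a cd command with the path quoted as a single shell argument."""
    return f"cd {shlex.quote(path)}"
```

A blocklist like this is a safety net rather than a guarantee, which is presumably why it is paired with container sandboxing above.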

Intelligent caching

Plans are cached in SQLite with a 24-hour TTL. Same repo + same model = instant results on re-run. Pass --no-cache to bypass.

Graceful degradation

No Docker? --no-docker gives full doc analysis. No API key? Interactive setup on first run. AI returns garbage JSON? Heuristic fallback classifies results without AI.
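The "garbage JSON" fallback above might work along these lines. The four labels come from the Observer description; the JSON key `classification` and the heuristic rules (nonzero exit code means failure, warnings on stderr mean partial) are assumptions, as are the function names.

```python
import json

VALID_LABELS = {"success", "failure", "ambiguity", "partial"}


def heuristic_classify(exit_code: int, stderr: str) -> str:
    """Classify a step result without AI, using simple assumed rules."""
    if exit_code != 0:
        return "failure"
    if "warning" in stderr.lower():
        return "partial"
    return "success"


def classify(ai_response: str, exit_code: int, stderr: str) -> str:
    """Use the AI's label when it parses cleanly; otherwise fall back."""
    try:
        label = json.loads(ai_response)["classification"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Not JSON, wrong shape, or missing key: ignore the AI output.
        return heuristic_classify(exit_code, stderr)
    if not isinstance(label, str) or label not in VALID_LABELS:
        return heuristic_classify(exit_code, stderr)
    return label
```

Validating the label against a fixed set keeps an unexpected AI answer from leaking an unknown category into the report.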

CI/CD ready

--json-output for machine-readable reports. --fail-threshold 40 exits with code 2 if the friction score exceeds 40. Gate your PRs on documentation quality.

141 tests, zero warnings

Unit tests for every model, every heuristic, every parser. Integration tests for the CLI. Mocked AI calls for the execution pipeline. All passing.

Tech Stack

click · rich · docker · anthropic · openai · gitpython · jinja2 · pydantic · python-dotenv

Dev: pytest · pytest-cov · ruff

Requirements

  • Python 3.11+
  • Docker (or --no-docker for analysis-only)
  • API key: Anthropic, OpenRouter, or OpenAI (prompted on first run)

License

MIT


Built for DX-RAY 2026 👻

Download files


Source Distribution

ghost_dev-0.1.0.tar.gz (47.9 kB)

Uploaded Source

Built Distribution


ghost_dev-0.1.0-py3-none-any.whl (42.4 kB)

Uploaded Python 3

File details

Details for the file ghost_dev-0.1.0.tar.gz.

File metadata

  • Download URL: ghost_dev-0.1.0.tar.gz
  • Upload date:
  • Size: 47.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for ghost_dev-0.1.0.tar.gz
Algorithm      Hash digest
SHA256         6ea0777c9ef39b8a17b0024ae4ee2d394728d8d901b20167593a5562b5b81adc
MD5            98214823a6255de414ea0072e7d40b93
BLAKE2b-256    af527ce5abaab3f9f8c50e2a29b8b50c235700c53b1e2b6901fc2c577586c3d6


File details

Details for the file ghost_dev-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ghost_dev-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 42.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for ghost_dev-0.1.0-py3-none-any.whl
Algorithm      Hash digest
SHA256         f7450d9c2a96ebc398b5e0647be4f0d55509b3cd96e7ab0047d95afd0e98591e
MD5            63c3d8897359eb1c5c2ee50732c4f8d6
BLAKE2b-256    713bf6a11ff3661151daf90a393fd5064ae1b3f8dcb3c89901c527e5b07e3d8b

