The Phantom Developer — AI agent that tests your repo's onboarding docs
ghost.dev
The Phantom Developer
Every repo claims easy setup. ghost.dev proves them wrong — or right.
An AI agent that simulates a brand-new developer onboarding onto any GitHub repository from absolute zero. It reads the docs, follows setup instructions step-by-step inside a fresh Docker container, and logs every failure, ambiguity, and undocumented assumption as a friction event.
The ghost doesn't know your framework. It doesn't know your conventions. If the README doesn't say "install Node," it won't install Node. It's the most literal, naive developer possible — and that's the point.
ghost run https://github.com/fastapi/fastapi --no-docker
✓ fastapi/fastapi ready
✓ 23 docs found (python)
✓ 4 steps (depth=quick)
──────────── 👻 ghost.dev — doc analysis ────────────
Repository fastapi/fastapi
Docs scanned 23
Project type python
Steps found 4
How It Works
Scanner (reads docs & configs) ──▶ Planner (AI gets steps) ──▶ Executor (runs in Docker) ──▶ Observer (classifies friction) ──▶ Reporter (terminal + HTML report)
Side branches: Recoverer (AI fixes failures), Fix Gen (suggests doc edits)
Six-stage pipeline. Each stage does one thing well:
- Scanner: Finds README, CONTRIBUTING, Makefile, package.json, CI workflows, env templates, version files. 23+ file patterns.
- Planner: AI extracts ordered setup steps from docs. Nothing else. Confidence scores, assumptions, and prerequisites are all tracked.
- Executor: Runs each step in a bare ubuntu:24.04 Docker container. Sandboxed. Resource-limited. Timed.
- Observer: Classifies each result: success, failure, ambiguity, partial. AI-powered with heuristic fallback.
- Recoverer: On failure, AI diagnoses the error and attempts self-recovery. Like a real dev reading stack traces.
- Reporter: Friction score, letter grade, cost estimate, timeline, fix suggestions. Terminal + self-contained HTML.
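As a rough illustration of the Scanner stage, the sketch below walks a checkout and collects files whose names match a handful of patterns. The pattern list and the name find_docs are assumptions made for this example; the real scanner covers 23+ patterns that are not enumerated in this README.

```python
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical subset of file patterns, chosen only for illustration.
DOC_PATTERNS = (
    "README*",
    "CONTRIBUTING*",
    "Makefile",
    "package.json",
    ".env.example",
    ".python-version",
)


def find_docs(repo_root: str) -> list[str]:
    """Return sorted repo-relative paths of files whose name matches a pattern."""
    root = Path(repo_root)
    matches = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        # Skip Git internals; they never contain onboarding docs.
        if ".git" in relative.parts:
            continue
        if path.is_file() and any(fnmatch(path.name, pat) for pat in DOC_PATTERNS):
            matches.append(relative.as_posix())
    return sorted(matches)
```

Calling find_docs(".") on a checkout returns paths such as README.md or docs/CONTRIBUTING.md when they exist.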
Quick Start
Zero-install — run it directly (like npx):
# Using uvx (recommended — fastest, no install)
uvx --from git+https://github.com/SujalXplores/ghost.dev.git ghost run https://github.com/user/repo
# Using pipx
pipx run --spec git+https://github.com/SujalXplores/ghost.dev.git ghost run https://github.com/user/repo
Or install it permanently:
pip install git+https://github.com/SujalXplores/ghost.dev.git
ghost run https://github.com/user/repo
First run prompts for an API key:
👻 No API key found. Let's set one up.
› 1. Anthropic (recommended)
2. OpenRouter (any model — Claude, GPT, Gemini, Llama)
3. OpenAI (GPT models)
Or run ghost setup anytime to add or change keys.
Usage
ghost run <repo_url_or_local_path> [options]
| Flag | Description |
|---|---|
| --depth [quick\|full] | quick = build only, full = build + test + lint |
| --timeout <minutes> | Max time per step (default: 5) |
| --model <name> | AI model (default: claude-sonnet) |
| --output <path> | HTML report path |
| --no-docker | Doc analysis only, no execution |
| --json-output | Machine-readable JSON to stdout |
| --fail-threshold <n> | Exit code 2 if friction score > n (CI mode) |
| --no-cache | Bypass plan cache |
| --quiet | Minimal output |
| --debug | Debug logging |
| --verbose | Show container stdout/stderr |
# Analyze any repo's docs without Docker
ghost run ./my-project --no-docker
# Full pipeline in Docker
ghost run https://github.com/user/repo --depth full
# CI gate — fail if friction is too high
ghost run . --json-output --fail-threshold 40
# Use any model via OpenRouter
ghost run <url> --model google/gemini-3.1-pro
ghost run <url> --model deepseek/deepseek-v3.2
ghost run <url> --model meta-llama/llama-4-maverick
Other Commands
ghost setup # Configure API keys interactively
ghost clean # Remove cached plans + orphaned containers
ghost --version # Print version
Output
ghost.dev produces a friction report with:
- Friction Score (0–100) and letter grade (A+ to F)
- Execution timeline — every step with pass/fail, duration, exit code
- Friction events — severity, root cause, category, suggested fix
- Self-recovery log — what the ghost tried, what worked
- Cost estimate — developer hours wasted × $100/hr × team size
- Fix suggestions — concrete doc edits with before/after diffs
- HTML report — self-contained, dark theme, print-ready, accessible
A sample report from running against FastAPI is included in demo/fastapi-report.html.
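The cost estimate in the list above is a simple product, and a minimal sketch of it follows. The function name estimate_cost and its input validation are assumptions for illustration; only the formula (hours wasted × $100/hr × team size) comes from the description above.

```python
HOURLY_RATE_USD = 100  # the $100/hr rate quoted in the Output list


def estimate_cost(hours_wasted: float, team_size: int) -> float:
    """Return developer hours wasted x hourly rate x team size, in USD."""
    if hours_wasted < 0 or team_size < 0:
        raise ValueError("hours_wasted and team_size must be non-negative")
    return hours_wasted * HOURLY_RATE_USD * team_size
```

For example, 1.5 wasted hours per developer on a team of 4 comes to $600.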
The Ghost Philosophy
The agent behaves like a literal, naive first-day intern:
- It ONLY knows what the docs tell it
- It does NOT use prior knowledge about frameworks
- If the README doesn't say "install Node," it won't install Node
- If something fails, it attempts self-recovery (like a real dev Googling)
- Every assumption the docs make is logged as implicit knowledge
Architecture
ghost/
├── cli.py # Click CLI with Rich-styled help
├── config.py # Config, API key management, cache
├── core/
│ ├── _ai.py # Multi-provider AI (Anthropic → OpenRouter → OpenAI)
│ ├── _utils.py # Shared JSON parsing, command safety validation
│ ├── scanner.py # Repo doc scanner (23+ file patterns)
│ ├── planner.py # AI step extraction with SQLite caching
│ ├── executor.py # Docker execution engine with env injection
│ ├── observer.py # Result classification (AI + heuristic fallback)
│ └── recoverer.py # AI-powered self-recovery with safety checks
├── docker/
│ ├── container.py # Container lifecycle, exec, resource limits
│ └── Dockerfile.ghost
├── models/
│ ├── step.py # SetupStep, PlanResult (Pydantic)
│ ├── friction.py # FrictionEvent model
│ └── report.py # GhostReport with scoring, grading, cost estimation
├── reporter/
│ ├── terminal.py # Rich terminal UI — panels, tables, progress
│ ├── html.py # Jinja2 HTML report generator
│ └── templates/
│ └── report.html # Self-contained dark-theme HTML template
└── fixgen/
└── suggestions.py # AI-generated documentation fix suggestions
Engineering Highlights
Multi-provider AI with automatic fallback
Anthropic → OpenRouter → OpenAI. If one fails, the next picks up. OpenRouter gives access to any model (Claude, GPT, Gemini, Llama, DeepSeek) through a single key.
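One way to picture the fallback chain is a loop over provider callables that returns the first successful response. This is a sketch under assumptions: the Provider type and complete_with_fallback are hypothetical names, and the real module may handle errors and retries differently.

```python
from typing import Callable, Sequence

# (provider name, function that takes a prompt and returns the model's text)
Provider = tuple[str, Callable[[str], str]]


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (name, response) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Passing the providers in the order Anthropic, OpenRouter, OpenAI reproduces the chain described above.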
Security-first execution
- Docker containers are sandboxed with 2GB RAM / 2 CPU limits
- AI-generated recovery commands are validated against a dangerous-pattern blocklist before execution
- No curl | bash, no rm -rf /, no fork bombs, even inside the container
- shlex.quote for all shell arguments
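The two ideas in this list, a pattern blocklist and shlex.quote, might look like the sketch below. The regular expressions and the function names are assumptions written for this example and are not the project's actual blocklist; a blocklist like this catches obvious cases only and is no substitute for the container sandbox.

```python
import re
import shlex

# Hypothetical patterns covering the three examples named above.
DANGEROUS_PATTERNS = [
    # Download piped straight into a shell, e.g. "curl ... | bash".
    re.compile(r"\b(?:curl|wget)\b[^|]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"),
    # Recursive delete of the filesystem root, e.g. "rm -rf /".
    re.compile(r"\brm\s+(?:-\w+\s+)+/(?:\s|$)"),
    # Classic shell fork bomb ":(){ :|:& };:".
    re.compile(r":\(\)\s*\{.*\}\s*;\s*:"),
]


def is_safe_command(command: str) -> bool:
    """Return False if the command matches any blocklisted pattern."""
    return not any(pattern.search(command) for pattern in DANGEROUS_PATTERNS)


def build_cd_command(directory: str) -> str:
    """Quote an untrusted argument before splicing it into a shell command."""
    return f"cd {shlex.quote(directory)}"
```

With these patterns, rm -rf ./build passes while rm -rf / is rejected, and build_cd_command("my dir") yields cd 'my dir'.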
Intelligent caching
Plans are cached in SQLite with a 24-hour TTL. Same repo + same model = instant results on re-run. --no-cache to bypass.
Graceful degradation
No Docker? --no-docker gives full doc analysis. No API key? Interactive setup on first run. AI returns garbage JSON? Heuristic fallback classifies results without AI.
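The "garbage JSON" case can be sketched as a parse attempt followed by a rule-based fallback. The classification key, the function name, and the specific heuristic rules here are assumptions for illustration; only the four labels come from the Observer description.

```python
import json

VALID_LABELS = {"success", "failure", "ambiguity", "partial"}


def classify_result(ai_response: str, exit_code: int, stderr: str) -> str:
    """Use the AI's label when it parses cleanly; otherwise fall back to heuristics."""
    try:
        label = json.loads(ai_response)["classification"]  # hypothetical key name
        if label in VALID_LABELS:
            return label
    except (json.JSONDecodeError, KeyError, TypeError):
        pass  # malformed or unexpected JSON: fall through to the heuristic

    # Heuristic fallback (assumed rules): non-zero exit is a failure,
    # warnings on stderr mean the step only partially succeeded.
    if exit_code != 0:
        return "failure"
    if "warning" in stderr.lower():
        return "partial"
    return "success"
```

The fallback never needs a network call, so classification still completes when every provider is unavailable.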
CI/CD ready
--json-output for machine-readable reports. --fail-threshold 40 exits with code 2 if friction is too high. Gate your PRs on documentation quality.
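A CI job could wrap the command in a small script like the sketch below. It relies only on the flags and the exit code 2 documented above; the helper names are hypothetical, and the meaning of other non-zero exit codes is not specified in this README, so the script simply passes them through.

```python
import subprocess
import sys

FRICTION_EXIT_CODE = 2  # documented exit code when the score exceeds --fail-threshold


def build_command(target: str, threshold: int) -> list[str]:
    """Assemble the ghost invocation used in the CI example above."""
    return ["ghost", "run", target, "--json-output", "--fail-threshold", str(threshold)]


def friction_gate_failed(returncode: int) -> bool:
    """True only when ghost reports that friction exceeded the threshold."""
    return returncode == FRICTION_EXIT_CODE


def main() -> int:
    """Run the gate against the current directory and return ghost's exit code."""
    result = subprocess.run(build_command(".", 40), capture_output=True, text=True)
    if friction_gate_failed(result.returncode):
        print("Friction score above threshold; failing the build.", file=sys.stderr)
    return result.returncode
```

Calling sys.exit(main()) from the CI entry point propagates the exit code to the job.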
141 tests, zero warnings
Unit tests for every model, every heuristic, every parser. Integration tests for the CLI. Mocked AI calls for the execution pipeline. All passing.
Tech Stack
click · rich · docker · anthropic · openai · gitpython · jinja2 · pydantic · python-dotenv
Dev: pytest · pytest-cov · ruff
Requirements
- Python 3.11+
- Docker (or --no-docker for analysis-only)
- API key: Anthropic, OpenRouter, or OpenAI (prompted on first run)
License
MIT
Built for DX-RAY 2026 👻