The Phantom Developer — AI agent that tests your repo's onboarding docs

ghost.dev

The Phantom Developer
Every repo claims easy setup. ghost.dev proves them wrong — or right.

An AI agent that simulates a brand-new developer onboarding onto any GitHub repository from absolute zero. It reads the docs, follows setup instructions step-by-step inside a fresh Docker container, and logs every failure, ambiguity, and undocumented assumption as a friction event.

The ghost doesn't know your framework. It doesn't know your conventions. If the README doesn't say "install Node," it won't install Node. It's the most literal, naive developer possible — and that's the point.

ghost run https://github.com/fastapi/fastapi --no-docker

  ✓ fastapi/fastapi ready
  ✓ 23 docs found (python)
  ✓ 4 steps (depth=quick)

──────────── 👻 ghost.dev — doc analysis ────────────

 Repository      fastapi/fastapi
 Docs scanned    23
 Project type    python
 Steps found     4

How It Works

Scanner ──▶ Planner ──▶ Executor ──▶ Observer ──▶ Reporter
reads docs   AI gets    runs in     classifies    terminal +
& configs    steps      Docker      friction      HTML report
                │              │
                ▼              ▼
           Recoverer      Fix Gen
           AI fixes       suggests
           failures       doc edits

Six-stage pipeline, plus a fix generator (Fix Gen) that suggests doc edits. Each stage does one thing well:

Scanner — Finds README, CONTRIBUTING, Makefile, package.json, CI workflows, env templates, version files. 23+ file patterns.
Planner — AI extracts ordered setup steps from docs. Nothing else. Confidence scores, assumptions, prerequisites — all tracked.
Executor — Runs each step in a bare ubuntu:24.04 Docker container. Sandboxed. Resource-limited. Timed.
Observer — Classifies each result: success, failure, ambiguity, partial. AI-powered with heuristic fallback.
Recoverer — On failure, AI diagnoses the error and attempts self-recovery. Like a real dev reading stack traces.
Reporter — Friction score, letter grade, cost estimate, timeline, fix suggestions. Terminal + self-contained HTML.
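
The Scanner stage can be pictured as a glob walk over the repository. The sketch below is an assumption about how such a scanner might look, not ghost.dev's actual code: `DOC_PATTERNS` is a small hypothetical subset of the 23+ patterns mentioned above, and `scan_docs` is an illustrative name.

```python
from pathlib import Path

# Hypothetical subset of patterns; the real scanner's list is not shown in this README.
DOC_PATTERNS = [
    "README*",
    "CONTRIBUTING*",
    "Makefile",
    "package.json",
    ".env.example",
    ".github/workflows/*.yml",
]


def scan_docs(repo_root: str) -> list[str]:
    """Return sorted repo-relative paths of files matching any doc pattern."""
    root = Path(repo_root)
    found: set[str] = set()
    for pattern in DOC_PATTERNS:
        for path in root.glob(pattern):
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

Source files such as `src/main.py` match none of the patterns and are skipped, so only onboarding-relevant files would reach the Planner.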

Quick Start

Zero-install — run it directly (like npx):

# Using uvx (recommended — fastest, no install)
uvx --from git+https://github.com/SujalXplores/ghost.dev.git ghost run https://github.com/user/repo

# Using pipx
pipx run --spec git+https://github.com/SujalXplores/ghost.dev.git ghost run https://github.com/user/repo

Or install it permanently:

pip install git+https://github.com/SujalXplores/ghost.dev.git
ghost run https://github.com/user/repo

First run prompts for an API key:

👻 No API key found. Let's set one up.

  › 1. Anthropic  (recommended)
    2. OpenRouter  (any model — Claude, GPT, Gemini, Llama)
    3. OpenAI      (GPT models)

Or run ghost setup anytime to add or change keys.

Usage

ghost run <repo_url_or_local_path> [options]

Flag	Description
`--depth [quick|full]`	`quick` = build only, `full` = build + test + lint
`--timeout <minutes>`	Max time per step (default: 5)
`--model <name>`	AI model (default: claude-sonnet)
`--output <path>`	HTML report path
`--no-docker`	Doc analysis only — no execution
`--json-output`	Machine-readable JSON to stdout
`--fail-threshold <n>`	Exit code 2 if friction score > n (CI mode)
`--no-cache`	Bypass plan cache
`--quiet`	Minimal output
`--debug`	Debug logging
`--verbose`	Show container stdout/stderr

# Analyze any repo's docs without Docker
ghost run ./my-project --no-docker

# Full pipeline in Docker
ghost run https://github.com/user/repo --depth full

# CI gate — fail if friction is too high
ghost run . --json-output --fail-threshold 40

# Use any model via OpenRouter
ghost run <url> --model google/gemini-3.1-pro
ghost run <url> --model deepseek/deepseek-v3.2
ghost run <url> --model meta-llama/llama-4-maverick

Other Commands

ghost setup          # Configure API keys interactively
ghost clean          # Remove cached plans + orphaned containers
ghost --version      # Print version

Output

ghost.dev produces a friction report with:

Friction Score (0–100) and letter grade (A+ to F)
Execution timeline — every step with pass/fail, duration, exit code
Friction events — severity, root cause, category, suggested fix
Self-recovery log — what the ghost tried, what worked
Cost estimate — developer hours wasted × $100/hr × team size
Fix suggestions — concrete doc edits with before/after diffs
HTML report — self-contained, dark theme, print-ready, accessible
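
The cost estimate in the list above is a plain product of three numbers. A minimal sketch, assuming the hours figure is already known; `estimate_cost` and its parameter names are illustrative and not taken from ghost.dev's source:

```python
HOURLY_RATE_USD = 100.0  # the $100/hr rate stated in the list above


def estimate_cost(
    hours_wasted: float,
    team_size: int,
    hourly_rate: float = HOURLY_RATE_USD,
) -> float:
    """Cost = developer hours wasted x hourly rate x team size."""
    if hours_wasted < 0 or team_size < 0:
        raise ValueError("hours_wasted and team_size must be non-negative")
    return hours_wasted * hourly_rate * team_size


# 1.5 hours of friction per developer, across a team of 8
print(estimate_cost(1.5, 8))  # 1200.0
```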

A sample report from running against FastAPI is included in demo/fastapi-report.html.

The Ghost Philosophy

The agent behaves like a literal, naive first-day intern:

It ONLY knows what the docs tell it
It does NOT use prior knowledge about frameworks
If the README doesn't say "install Node," it won't install Node
If something fails, it attempts self-recovery (like a real dev Googling)
Every assumption the docs make is logged as implicit knowledge

Architecture

ghost/
├── cli.py              # Click CLI with Rich-styled help
├── config.py           # Config, API key management, cache
├── core/
│   ├── _ai.py          # Multi-provider AI (Anthropic → OpenRouter → OpenAI)
│   ├── _utils.py       # Shared JSON parsing, command safety validation
│   ├── scanner.py      # Repo doc scanner (23+ file patterns)
│   ├── planner.py      # AI step extraction with SQLite caching
│   ├── executor.py     # Docker execution engine with env injection
│   ├── observer.py     # Result classification (AI + heuristic fallback)
│   └── recoverer.py    # AI-powered self-recovery with safety checks
├── docker/
│   ├── container.py    # Container lifecycle, exec, resource limits
│   └── Dockerfile.ghost
├── models/
│   ├── step.py         # SetupStep, PlanResult (Pydantic)
│   ├── friction.py     # FrictionEvent model
│   └── report.py       # GhostReport with scoring, grading, cost estimation
├── reporter/
│   ├── terminal.py     # Rich terminal UI — panels, tables, progress
│   ├── html.py         # Jinja2 HTML report generator
│   └── templates/
│       └── report.html # Self-contained dark-theme HTML template
└── fixgen/
    └── suggestions.py  # AI-generated documentation fix suggestions

Engineering Highlights

Multi-provider AI with automatic fallback

Anthropic → OpenRouter → OpenAI. If one fails, the next picks up. OpenRouter gives access to any model (Claude, GPT, Gemini, Llama, DeepSeek) through a single key.
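
A fallback chain like this is often written as a loop over providers in priority order. The following is a sketch under assumptions, not the contents of `_ai.py`: each provider is modelled as a hypothetical `(name, callable)` pair, and `complete_with_fallback` is an illustrative name.

```python
from collections.abc import Callable, Sequence

# A provider here is just a name plus a function from prompt to completion text.
Provider = tuple[str, Callable[[str], str]]


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, completion) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

In real use the callables would wrap the Anthropic, OpenRouter, and OpenAI clients; here they are left abstract so the control flow stays visible.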

Security-first execution

Docker containers are sandboxed with 2GB RAM / 2 CPU limits
AI-generated recovery commands are validated against a dangerous-pattern blocklist before execution
No curl | bash, no rm -rf /, no fork bombs — even inside the container
shlex.quote for all shell arguments
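
The blocklist and quoting ideas above could be sketched as follows. The regular expressions are hypothetical stand-ins for the three examples named in the list; the real patterns in `_utils.py` are not shown in this README, and `is_safe` and `build_install_command` are illustrative names.

```python
import re
import shlex

# Hypothetical blocklist covering the three examples named above.
DANGEROUS_PATTERNS = [
    re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),  # download piped to a shell
    re.compile(r"\brm\s+-\w*r\w*\s+/(\s|$)"),                       # recursive delete of /
    re.compile(r":\(\)\s*\{.*\}\s*;\s*:"),                          # classic fork bomb
]


def is_safe(command: str) -> bool:
    """Return False if the command matches any blocklisted pattern."""
    return not any(pattern.search(command) for pattern in DANGEROUS_PATTERNS)


def build_install_command(package: str) -> str:
    """Quote an untrusted argument before placing it in a shell string."""
    return f"pip install {shlex.quote(package)}"
```

A blocklist of this kind is a second line of defence; the container sandbox remains the main boundary.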

Intelligent caching

Plans are cached in SQLite with a 24-hour TTL. Same repo + same model = instant results on re-run. Pass --no-cache to bypass.

Graceful degradation

No Docker? --no-docker gives full doc analysis. No API key? Interactive setup on first run. AI returns garbage JSON? Heuristic fallback classifies results without AI.
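
The JSON-to-heuristic fallback might look roughly like this. The four labels come from the Observer description; the `classification` key and both function names are assumptions, and the heuristic itself is deliberately simplistic compared with whatever `observer.py` really does.

```python
import json

VALID_LABELS = {"success", "failure", "ambiguity", "partial"}


def heuristic_classify(exit_code: int, stderr: str) -> str:
    """Crude rule-based stand-in used when the AI response is unusable."""
    if exit_code != 0:
        return "failure"
    return "partial" if "warning" in stderr.lower() else "success"


def classify(ai_response: str, exit_code: int, stderr: str) -> str:
    """Prefer the AI label; fall back to heuristics on malformed or unexpected JSON."""
    try:
        # "classification" is a hypothetical key name.
        label = json.loads(ai_response)["classification"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return heuristic_classify(exit_code, stderr)
    if isinstance(label, str) and label in VALID_LABELS:
        return label
    return heuristic_classify(exit_code, stderr)
```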

CI/CD ready

--json-output for machine-readable reports. --fail-threshold 40 exits with code 2 if the friction score exceeds 40. Gate your PRs on documentation quality.

141 tests, zero warnings

Unit tests for every model, every heuristic, every parser. Integration tests for the CLI. Mocked AI calls for the execution pipeline. All passing.

Tech Stack

click · rich · docker · anthropic · openai · gitpython · jinja2 · pydantic · python-dotenv

Dev: pytest · pytest-cov · ruff

Requirements

Python 3.11+
Docker (or --no-docker for analysis-only)
API key: Anthropic, OpenRouter, or OpenAI (prompted on first run)

License

MIT

Built for DX-RAY 2026 👻


This version

0.1.0

Mar 28, 2026

Download files


Source Distribution

ghost_dev-0.1.0.tar.gz (47.9 kB)

Uploaded Mar 28, 2026 Source

Built Distribution


ghost_dev-0.1.0-py3-none-any.whl (42.4 kB)

Uploaded Mar 28, 2026 Python 3

File details

Details for the file ghost_dev-0.1.0.tar.gz.

File metadata

Download URL: ghost_dev-0.1.0.tar.gz
Upload date: Mar 28, 2026
Size: 47.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for ghost_dev-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6ea0777c9ef39b8a17b0024ae4ee2d394728d8d901b20167593a5562b5b81adc`
MD5	`98214823a6255de414ea0072e7d40b93`
BLAKE2b-256	`af527ce5abaab3f9f8c50e2a29b8b50c235700c53b1e2b6901fc2c577586c3d6`


File details

Details for the file ghost_dev-0.1.0-py3-none-any.whl.

File metadata

Download URL: ghost_dev-0.1.0-py3-none-any.whl
Upload date: Mar 28, 2026
Size: 42.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for ghost_dev-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f7450d9c2a96ebc398b5e0647be4f0d55509b3cd96e7ab0047d95afd0e98591e`
MD5	`63c3d8897359eb1c5c2ee50732c4f8d6`
BLAKE2b-256	`713bf6a11ff3661151daf90a393fd5064ae1b3f8dcb3c89901c527e5b07e3d8b`


ghost-dev 0.1.0
