Multi-model deliberation engine for security code review. Empanels a jury of LLMs, returns a hash-chained verdict.

GhostJury

Submit code. GhostJury empanels a jury of 5 independent LLMs; each returns a binary verdict with reasoning, and the aggregated result is hash-chained so that votes and appeals bind to the exact analysis that produced them.

         ┌─────────────┐
         │  Submission │
         └──────┬──────┘
                ▼
         ┌─────────────┐
         │    Triage   │  Haiku 4.5 — "worth empaneling a jury?"
         └──────┬──────┘
                ▼
    ┌───────┬───┴───┬────────┬───────┐
    ▼       ▼       ▼        ▼       ▼
 Sonnet  Opus   GPT-5.2  o4-mini  Grok 4
    │       │       │        │       │
    └───────┴───┬───┴────────┴───────┘
                ▼
         ┌─────────────┐
         │  Synthesis  │  rule-based aggregation
         └──────┬──────┘
                ▼
         ┌─────────────┐
         │   Verdict   │  APPROVED │ REVIEW │ REJECTED
         └─────────────┘

Cost per audit (verified 2026-04-14)

Per-audit assumptions: ~3,500 input tokens and ~200 output tokens per juror, plus one Haiku triage call. Prompt caching is enabled by default on Claude jurors: the ~800-token system prompt becomes a cache hit after the first call, so subsequent Claude calls (jurors and triage) read it at a 90% discount.

| Juror | Model | $/M in | $/M out | Per call (cached) |
|---|---|---|---|---|
| 1 | claude-sonnet-4-6 | $3.00 | $15.00 | ~$0.011 |
| 2 | claude-opus-4-6 | $15.00 | $75.00 | ~$0.054 |
| 3 | gpt-5.2 | $5.00* | $20.00* | ~$0.022 |
| 4 | o4-mini | $3.00* | $12.00* | ~$0.022 |
| 5 | grok-4.20-0309-reasoning | $2.00 | $6.00 | ~$0.008 |
| Triage | claude-haiku-4-5 | $0.80 | $4.00 | ~$0.001 |
| Per audit | | | | ~$0.118 (cached path) |

* Estimate; see pricing.py for the confidence tier and source.
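The per-call arithmetic can be approximated from the stated assumptions. The following is a minimal sketch, not the code in pricing.py: the function name `per_call_cost` and its parameters are hypothetical, and it assumes that cached system-prompt tokens are billed at 10% of the input rate while all other tokens are billed at the listed rates.

```python
def per_call_cost(in_rate, out_rate, input_tokens=3_500, output_tokens=200,
                  cached_tokens=0, cache_discount=0.90):
    """Estimate the dollar cost of one call. Rates are dollars per million tokens."""
    uncached_tokens = input_tokens - cached_tokens
    # Cached tokens are billed at (1 - discount) of the normal input rate (assumption).
    cached_cost = cached_tokens * in_rate * (1 - cache_discount)
    input_cost = uncached_tokens * in_rate + cached_cost
    output_cost = output_tokens * out_rate
    return (input_cost + output_cost) / 1_000_000


# Claude juror with the ~800-token system prompt served from cache.
sonnet = per_call_cost(3.00, 15.00, cached_tokens=800)
# Non-Claude juror: no prompt caching assumed.
grok = per_call_cost(2.00, 6.00)
print(f"sonnet ~${sonnet:.3f}, grok ~${grok:.3f}")
```

Under these assumptions the sketch gives about $0.011 for the Sonnet row and about $0.008 for the Grok row. The table's other figures may rest on inputs that are not modeled here.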

At volume, assuming the Haiku triage screens out ~90% of submissions as trivial:

| Audits/day | Cost/day | Cost/month |
|---|---|---|
| 100 | ~$1.20 | ~$36 |
| 1,000 | ~$12 | ~$360 |
| 10,000 | ~$120 | ~$3,600 |

Without the triage gate the numbers are roughly 10× higher — the gate is the primary cost lever.
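The effect of the gate can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: `daily_cost` and its parameters are hypothetical names, every submission pays the ~$0.001 triage call, and only the ~10% that pass triage pay for the five jurors (the ~$0.118 per-audit figure minus triage).

```python
def daily_cost(audits_per_day, full_audit_cost=0.118, triage_cost=0.001,
               pass_rate=0.10):
    """Daily spend in dollars when only `pass_rate` of submissions reach the jury."""
    jury_cost = full_audit_cost - triage_cost  # cost of the five juror calls alone
    return audits_per_day * (triage_cost + pass_rate * jury_cost)


for n in (100, 1_000, 10_000):
    gated = daily_cost(n)
    ungated = daily_cost(n, pass_rate=1.0)  # every submission gets a full jury
    print(f"{n:>6} audits/day: gated ~${gated:,.2f}, ungated ~${ungated:,.2f}")
```

With these inputs, 100 audits/day comes to about $1.27 gated and $11.80 ungated, in line with the rounded table figures and the roughly 10× gap.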

Voting rule

Each juror returns { malicious: bool, reason: str } for a submission. GhostJury counts the malicious: true ("yes") votes and maps the count to a tier:

| Yes votes | Tier | Action |
|---|---|---|
| 0 | APPROVED | auto-approve |
| 1 | REVIEW | human review queue; the dissenting juror's reason is visible |
| 2+ | REJECTED | auto-reject; appeal allowed |

Fail-closed posture. Any juror flag bumps the submission out of auto-approve.
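The rule is small enough to express directly. Here is a minimal sketch, assuming a plain dataclass in place of the project's Pydantic models; `JurorVerdict` and `synthesize` are hypothetical names, not the API of synthesis.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JurorVerdict:
    malicious: bool
    reason: str


def synthesize(verdicts):
    """Map the number of malicious ("yes") votes to a verdict tier."""
    yes_votes = sum(v.malicious for v in verdicts)
    if yes_votes == 0:
        return "APPROVED"   # no juror flagged the submission
    if yes_votes == 1:
        return "REVIEW"     # a single flag goes to the human review queue
    return "REJECTED"       # two or more flags: auto-reject, appeal allowed


clean = [JurorVerdict(False, "no suspicious behavior")] * 5
one_flag = clean[:4] + [JurorVerdict(True, "reads ~/.ssh and posts it to a remote host")]
print(synthesize(clean), synthesize(one_flag))
```

Because the check is on the count of flags rather than a majority, a single dissenting juror is enough to leave the auto-approve tier.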

Hash chain

Every verdict carries a verdict_hash derived from:

sha256(
  submission_sha      ||
  triage_result_json  ||
  jurors_json         ||  # all 5 juror verdicts, in roster order
  synthesis_json      ||
  roster_json            # model versions + juror order
)

This matters because when users sign votes on the public page, their signature binds to the specific analysis, not the submission id. Editing the analysis invalidates every existing signature — which is the point.
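To illustrate how such a digest might be computed, here is a sketch that follows the concatenation order above. It is not the code in hashing.py: the function names are hypothetical, and the canonical JSON encoding (sorted keys, compact separators, UTF-8) is an assumption made so that the same artifacts always serialize to the same bytes.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize deterministically so equal artifacts yield equal bytes (assumed encoding)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verdict_hash(submission_sha, triage_result, jurors, synthesis, roster) -> str:
    """SHA-256 over the five parts, concatenated in the order listed above."""
    h = hashlib.sha256()
    h.update(submission_sha.encode("utf-8"))
    for part in (triage_result, jurors, synthesis, roster):
        h.update(canonical_json(part))
    return h.hexdigest()


# Illustrative values only; the real artifacts carry more fields.
roster = ["claude-sonnet-4-6", "claude-opus-4-6"]
jurors = [{"malicious": False, "reason": "benign"},
          {"malicious": True, "reason": "spawns a shell"}]
original = verdict_hash("ab12", {"empanel": True}, jurors, {"tier": "REVIEW"}, roster)

# Editing a single juror's reason produces a different hash.
edited_jurors = [jurors[0], {"malicious": True, "reason": "edited"}]
edited = verdict_hash("ab12", {"empanel": True}, edited_jurors, {"tier": "REVIEW"}, roster)
print(original != edited)
```

A signature made over `original` would no longer verify against `edited`, which is the binding property described above.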

Install

pip install ghostjury

Quick start

No network — safe demo path:

ghostjury audit path/to/script.py --mock

Real jury (requires API keys):

export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export XAI_API_KEY=xai-...

ghostjury audit path/to/suspicious_script.py

The CLI refuses to start a real jury with partial keys — it checks every provider in the default roster first and lists exactly which env vars are missing. No half-juries, no wasted tokens.
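A preflight check of this kind could look like the following sketch. The three variable names come from the quick start above; the function names and the error message are hypothetical and do not reproduce the CLI's actual output.

```python
import os

# Environment variables for the default roster's providers (from the quick start).
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "XAI_API_KEY")


def missing_keys(env=None):
    """Return every required variable that is unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def ensure_full_roster(env=None):
    """Abort before any API call if a provider key is missing."""
    missing = missing_keys(env)
    if missing:
        raise SystemExit("Missing API keys: " + ", ".join(missing))


print(missing_keys({"ANTHROPIC_API_KEY": "sk-ant-example"}))
```

Collecting all missing names before exiting, rather than failing on the first one, lets the user fix the environment in a single pass.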

Architecture

  • schema.py — Pydantic models for every artifact
  • hashing.py — content-hash computation
  • prompts.py — triage + juror system prompts
  • pricing.py — per-model cost estimates (audit and update before production)
  • router.py — backend abstraction: MockBackend for tests, HttpxBackend for real calls
  • triage.py — Haiku 4.5 gate
  • jury.py — juror orchestration (currently sequential; becomes parallel once ghostrouter ships execute_parallel)
  • synthesis.py — rule-based verdict aggregation
  • cli.py — command-line entry
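As a rough picture of how the backend abstraction and the sequential jury loop might fit together, consider the sketch below. Only the name MockBackend and the model identifiers come from this README; the `Backend` protocol, the `complete` method signature, `run_jury`, and the placeholder system prompt are all assumptions made for illustration.

```python
import json
from typing import Protocol


class Backend(Protocol):
    """Hypothetical interface: one blocking call per model, returning raw JSON text."""
    def complete(self, model: str, system: str, user: str) -> str: ...


class MockBackend:
    """Returns a canned verdict without touching the network."""
    def complete(self, model: str, system: str, user: str) -> str:
        return json.dumps({"malicious": False, "reason": f"mock verdict from {model}"})


ROSTER = ("claude-sonnet-4-6", "claude-opus-4-6", "gpt-5.2",
          "o4-mini", "grok-4.20-0309-reasoning")


def run_jury(backend: Backend, code: str, system_prompt: str = "You are a security juror."):
    """Query each juror in roster order, one after another."""
    verdicts = []
    for model in ROSTER:
        raw = backend.complete(model, system_prompt, code)
        verdicts.append(json.loads(raw))
    return verdicts


verdicts = run_jury(MockBackend(), "print('hello')")
print(len(verdicts), verdicts[0]["reason"])
```

Keeping the loop behind a single backend interface means a parallel implementation could replace the `for` loop later without changing how callers consume the verdict list.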

License

Apache 2.0.
