Multi-model deliberation engine for security code review. Empanels a jury of LLMs, returns a hash-chained verdict.

GhostJury

Submit code and GhostJury empanels a jury of 5 independent LLMs. Each juror returns a binary verdict with reasoning, and the aggregated result is hash-chained so that votes and appeals bind to the exact analysis that produced them.

         ┌─────────────┐
         │  Submission │
         └──────┬──────┘
                ▼
         ┌─────────────┐
         │    Triage   │  Haiku 4.5 — "worth empaneling a jury?"
         └──────┬──────┘
                ▼
    ┌───────┬───┴───┬────────┬───────┐
    ▼       ▼       ▼        ▼       ▼
 Sonnet  Opus   GPT-5.2  o4-mini  Grok 4
    │       │       │        │       │
    └───────┴───┬───┴────────┴───────┘
                ▼
         ┌─────────────┐
         │  Synthesis  │  rule-based aggregation
         └──────┬──────┘
                ▼
         ┌─────────────┐
         │   Verdict   │  APPROVED │ REVIEW │ REJECTED
         └─────────────┘

Cost per audit (verified 2026-04-14)

Per-audit assumptions: ~3,500 input tokens / ~200 output tokens per juror, plus a Haiku triage call. Prompt caching is enabled by default on Claude jurors — the ~800-token system prompt becomes a cache hit after the first call, so subsequent jurors and the triage all read it at a 90% discount.

| Juror | Model | Input ($/M tokens) | Output ($/M tokens) | Per call (cached) |
|---|---|---|---|---|
| 1 | claude-sonnet-4-6 | $3.00 | $15.00 | ~$0.011 |
| 2 | claude-opus-4-6 | $15.00 | $75.00 | ~$0.054 |
| 3 | gpt-5.2 | $5.00* | $20.00* | ~$0.022 |
| 4 | o4-mini | $3.00* | $12.00* | ~$0.022 |
| 5 | grok-4.20-0309-reasoning | $2.00 | $6.00 | ~$0.008 |
| Triage | claude-haiku-4-5 | $0.80 | $4.00 | ~$0.001 |
| Per audit | | | | ~$0.118 (cached path) |

* Estimate, see pricing.py for confidence tier and source.
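As a rough check on the table, the per-call arithmetic can be sketched as below. This is a minimal sketch, not the code in pricing.py: the function name `per_call_cost` and its parameters are hypothetical, and it assumes the only discount is the 90% cache-read rate on the ~800-token system prompt. Not every row of the table falls out of this formula exactly, so treat pricing.py as the source of truth.

```python
def per_call_cost(
    in_price_per_m: float,
    out_price_per_m: float,
    input_tokens: int = 3_500,
    output_tokens: int = 200,
    cached_tokens: int = 800,
    cache_discount: float = 0.90,
) -> float:
    """Estimate the USD cost of one juror call (hypothetical helper).

    cached_tokens is the share of the input billed at the discounted
    cache-read rate; pass 0 when no cache hit is assumed.
    """
    uncached_tokens = input_tokens - cached_tokens
    cached_rate = in_price_per_m * (1.0 - cache_discount)
    total_per_million = (
        uncached_tokens * in_price_per_m
        + cached_tokens * cached_rate
        + output_tokens * out_price_per_m
    )
    return total_per_million / 1_000_000


# Sonnet row: $3.00 in, $15.00 out, system prompt read from cache.
sonnet = per_call_cost(3.00, 15.00)
# Grok row: $2.00 in, $6.00 out, no cache discount assumed.
grok = per_call_cost(2.00, 6.00, cached_tokens=0)

print(f"{sonnet:.4f}")  # 0.0113
print(f"{grok:.4f}")    # 0.0082
```

Both results round to the figures in the table (~$0.011 and ~$0.008).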

At volume, assuming the Haiku triage gate screens out ~90% of submissions as trivial before a jury is empaneled:

| Audits/day | Cost/day | Cost/month |
|---|---|---|
| 100 | ~$1.20 | ~$36 |
| 1,000 | ~$12 | ~$360 |
| 10,000 | ~$120 | ~$3,600 |

Without the triage gate the numbers are roughly 10× higher — the gate is the primary cost lever.
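The effect of the gate can be sketched with a small model of daily spend. This is an illustrative sketch under stated assumptions: `daily_cost` is a hypothetical helper, every submission pays for one triage call (~$0.001), and only the submissions that pass triage pay for the five juror calls (~$0.117, the per-audit figure minus triage).

```python
def daily_cost(
    audits_per_day: int,
    triage_cost: float = 0.001,
    jury_cost: float = 0.117,
    skip_rate: float = 0.90,
) -> float:
    """Estimate daily USD spend (hypothetical helper).

    Every submission is triaged; only the fraction (1 - skip_rate)
    goes on to a full five-juror panel.
    """
    triage_total = audits_per_day * triage_cost
    jury_total = audits_per_day * (1.0 - skip_rate) * jury_cost
    return triage_total + jury_total


with_gate = daily_cost(100)
# No gate: no triage call, and every submission reaches the jury.
without_gate = daily_cost(100, triage_cost=0.0, skip_rate=0.0)

print(f"{with_gate:.2f}")     # 1.27
print(f"{without_gate:.2f}")  # 11.70
```

Under these assumptions the gated figure lands close to the table's rounded ~$1.20 per 100 audits, and the ungated figure is about nine to ten times larger.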

Voting rule

Each juror returns { malicious: bool, reason: str } on a submission. GhostJury aggregates:

| Yes votes | Tier | Action |
|---|---|---|
| 0 | APPROVED | auto-approve |
| 1 | REVIEW | human review queue; the dissenting juror's reason is visible |
| 2+ | REJECTED | auto-reject; appeal allowed |

Fail-closed posture. Any juror flag bumps the submission out of auto-approve.
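The voting rule above is small enough to sketch in a few lines. The names `Tier`, `JurorVerdict`, and `synthesize` are hypothetical and not taken from synthesis.py; rejecting an empty panel is also an assumption, added here only to keep the sketch fail-closed.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(str, Enum):
    APPROVED = "APPROVED"
    REVIEW = "REVIEW"
    REJECTED = "REJECTED"


@dataclass(frozen=True)
class JurorVerdict:
    malicious: bool
    reason: str


def synthesize(verdicts: list[JurorVerdict]) -> Tier:
    """Map the number of 'malicious' votes to a tier (hypothetical helper)."""
    if not verdicts:
        # Assumption: an empty panel must never auto-approve.
        raise ValueError("no juror verdicts to aggregate")
    yes_votes = sum(v.malicious for v in verdicts)
    if yes_votes == 0:
        return Tier.APPROVED
    if yes_votes == 1:
        return Tier.REVIEW
    return Tier.REJECTED


clean = [JurorVerdict(False, "no findings")] * 5
one_flag = clean[:4] + [JurorVerdict(True, "reads ~/.ssh and posts it to a remote host")]

print(synthesize(clean).value)     # APPROVED
print(synthesize(one_flag).value)  # REVIEW
```

A single flag is enough to leave the auto-approve path, which is the fail-closed behavior described above.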

Hash chain

Every verdict carries a verdict_hash derived from:

sha256(
  submission_sha      ||
  triage_result_json  ||
  jurors_json         ||  # all 5 juror verdicts, in roster order
  synthesis_json      ||
  roster_json            # model versions + juror order
)

This matters because when users sign votes on the public page, their signature binds to the specific analysis, not the submission id. Editing the analysis invalidates every existing signature — which is the point.
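A minimal sketch of such a hash, assuming each JSON part is serialized canonically (sorted keys, compact separators, UTF-8) and the parts are concatenated in the order listed above. The helper names and the canonicalization are assumptions; hashing.py may serialize differently, so hashes from this sketch are not expected to match real verdicts.

```python
import hashlib
import json


def canonical(obj) -> bytes:
    # Assumption: canonical JSON means sorted keys, no extra whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verdict_hash(submission_sha, triage_result, jurors, synthesis, roster) -> str:
    """SHA-256 over the concatenated parts, in the documented order."""
    h = hashlib.sha256()
    h.update(submission_sha.encode("ascii"))
    for part in (triage_result, jurors, synthesis, roster):
        h.update(canonical(part))
    return h.hexdigest()


# Illustrative values only; real artifacts carry more fields.
submission_sha = hashlib.sha256(b"print('hello')").hexdigest()
triage = {"empanel": True}
jurors = [{"malicious": False, "reason": "no findings"} for _ in range(5)]
synthesis = {"tier": "APPROVED", "yes_votes": 0}
roster = ["claude-sonnet-4-6", "claude-opus-4-6", "gpt-5.2",
          "o4-mini", "grok-4.20-0309-reasoning"]

original = verdict_hash(submission_sha, triage, jurors, synthesis, roster)

# Editing a single juror's reasoning yields a different hash, so any
# signature made over the original hash no longer matches.
edited_jurors = [dict(j) for j in jurors]
edited_jurors[2]["reason"] = "edited after the fact"
edited = verdict_hash(submission_sha, triage, edited_jurors, synthesis, roster)

print(original != edited)  # True
```

Because the roster is part of the input, changing a model version or the juror order also produces a new hash, even when the votes themselves are unchanged.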

Install

pip install ghostjury

Quick start

No network — safe demo path:

ghostjury audit path/to/script.py --mock

Real jury (requires API keys):

export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export XAI_API_KEY=xai-...

ghostjury audit path/to/suspicious_script.py

The CLI refuses to start a real jury with partial keys — it checks every provider in the default roster first and lists exactly which env vars are missing. No half-juries, no wasted tokens.
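The all-or-nothing key check can be sketched as a preflight function. This is an illustration, not the code in cli.py: `REQUIRED_KEYS` and `missing_keys` are hypothetical names, and the provider-to-variable mapping is inferred from the three exports shown above.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Assumption: one environment variable per provider in the default roster.
REQUIRED_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "xai": "XAI_API_KEY",
}


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required env vars that are unset or empty."""
    if env is None:
        env = os.environ
    return [var for var in REQUIRED_KEYS.values() if not env.get(var)]


def preflight(env: Optional[Mapping[str, str]] = None) -> None:
    """Refuse to continue unless every provider key is present."""
    missing = missing_keys(env)
    if missing:
        raise SystemExit("missing API keys: " + ", ".join(missing))


# Only one of the three keys is set, so the other two are reported.
print(missing_keys({"ANTHROPIC_API_KEY": "sk-ant-example"}))
# ['OPENAI_API_KEY', 'XAI_API_KEY']
```

Checking every provider before the first request is what keeps a partial configuration from spending tokens on a jury that could never be completed.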

Architecture

  • schema.py — Pydantic models for every artifact
  • hashing.py — content-hash computation
  • prompts.py — triage + juror system prompts
  • pricing.py — per-model cost estimates (audit and update before production)
  • router.py — backend abstraction: MockBackend for tests, HttpxBackend for real calls
  • triage.py — Haiku 4.5 gate
  • jury.py — juror orchestration (currently sequential; becomes parallel once ghostrouter ships execute_parallel)
  • synthesis.py — rule-based verdict aggregation
  • cli.py — command-line entry

License

Apache 2.0.
