
VERDICT WEIGHT™ — Context-Adaptive Multi-Source Confidence Synthesis. Patent Pending #64/032,606. 673 tests, 1,270,000+ scenarios validated. 57.8% adversarial suppression. 12 domain profiles.

VERDICT WEIGHT™

A Context-Adaptive Multi-Source Confidence Synthesis Framework for Autonomous AI Intelligence Systems

"Calibrated multi-source confidence scoring is not an optional feature of autonomous AI systems — it is a foundational architectural requirement."


The Problem

Many autonomous AI systems treat all intelligence sources as equal. A rumor on a threat forum and a Mandiant primary incident report receive the same weight. A 30-day-old signal and a real-time alert are scored identically. A spoofed high-credibility source sails straight through.

That is not a UI problem. That is a systematic architectural vulnerability.


Validated Results

Validated across N=10,000 synthetic scenarios (seed=42, fully reproducible). Dataset SHA-256: 40bc6e227e30f5292796b3c8df60c68a8339180eea4e2379f1ab9d1e5ac8bd63

| Method | Brier ↓ | 95% CI | AUC-ROC | McNemar |
|---|---|---|---|---|
| VERDICT WEIGHT™ | 0.2079 | [0.2036, 0.2122] | 0.7499 | |
| Equal Weight | 0.2170 | [0.2130, 0.2210] | 0.7450 | p<0.001 *** |
| Single Source | 0.2499 | [0.2447, 0.2553] | 0.6537 | p<0.001 *** |
| Two Stream | 0.2298 | [0.2251, 0.2346] | 0.7258 | p<0.001 *** |

Key results:

  • 57.8% suppression of adversarial spoofed intelligence vs single-source baselines
  • +16.8% Brier Score improvement vs single source (p<0.001 ***)
  • +14.7% AUC-ROC improvement vs single source (p<0.001 ***)
  • 5-fold CV stability: Brier 0.2079 ± 0.0046 (low variance across folds, which suggests the results are not overfit)
  • All significance tests: McNemar p<0.001 against all baselines (reported as 0.000000 at six decimal places)
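The relative improvements quoted above appear to be simple ratios against the single-source baseline. That reading is inferred from the numbers in the table, not from a stated formula, so treat the following as a quick arithmetic check. The function names are illustrative only.

```python
# Values copied from the Validated Results table.
BRIER = {"VERDICT WEIGHT": 0.2079, "Single Source": 0.2499}
AUC = {"VERDICT WEIGHT": 0.7499, "Single Source": 0.6537}


def brier_improvement_pct(baseline: float, candidate: float) -> float:
    """Percent reduction in Brier score (lower Brier is better)."""
    return 100.0 * (baseline - candidate) / baseline


def auc_improvement_pct(baseline: float, candidate: float) -> float:
    """Percent increase in AUC-ROC (higher AUC is better)."""
    return 100.0 * (candidate - baseline) / baseline


brier_gain = brier_improvement_pct(BRIER["Single Source"], BRIER["VERDICT WEIGHT"])
auc_gain = auc_improvement_pct(AUC["Single Source"], AUC["VERDICT WEIGHT"])

print(f"Brier improvement vs single source: {brier_gain:.1f}%")  # 16.8%
print(f"AUC-ROC improvement vs single source: {auc_gain:.1f}%")  # 14.7%
```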

Cross-Vertical Performance (N=1,000 per vertical)

| Vertical | Brier Δ% | AUC Δ% | Adv. Suppression |
|---|---|---|---|
| Cybersecurity | +17.7% | +17.0% | 57.4% |
| Healthcare | +9.6% | +20.9% | |
| Financial | +10.3% | +15.1% | |
| Manufacturing | +9.2% | +8.5% | 40.5% |
| Legal | +4.4% | +3.8% | 30.7% |
| Defense | -2.5% | +0.3% | 16.6% |
| Enterprise RAG | -6.7% | -1.1% | 25.7% |

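Defense and Enterprise RAG show a negative Brier Δ, meaning the Brier score got worse rather than better in those verticals. This is attributed to synthetic data characteristics. See the audit report for the full failure mode analysis.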

Head-to-Head vs Established Fusion Methods

| Method | Brier ↓ | AUC-ROC | Adv Score ↓ | Range ↑ |
|---|---|---|---|---|
| VERDICT WEIGHT™ | 0.2017 | 0.7431 | 0.3796 | 0.4797 |
| Simple Averaging | 0.2182 | 0.7186 | 0.7109 | 0.1625 |
| Max Voting | 0.3020 | 0.6683 | 0.9369 | 0.0122 |
| Dempster-Shafer | 0.2429 | 0.7186 | 0.8610 | 0.1057 |
| Naive Bayes Fusion | 0.3359 | 0.7112 | 1.0000 | 0.0000 |

Under conflicting evidence (high SR, low CC), VERDICT WEIGHT scores 0.2250. Dempster-Shafer scores 0.8468. All comparisons p<0.001 (Mann-Whitney U).

Real-World Validation (120 CVEs)

Validated on 120 real CVEs from NIST NVD and CISA Known Exploited Vulnerabilities catalog.

  • AUC=1.0000 on real exploitation prediction
  • Brier 2.2× lower than simple averaging (0.0749 vs 0.1655)
  • 10/10 highest VW scores are confirmed CISA KEV exploits
  • 10/10 lowest VW scores are safe CVEs
  • Log4Shell (CVE-2021-44228) correctly scored CRITICAL at day 12
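For readers unfamiliar with the metric, AUC-ROC is the probability that a randomly chosen positive (here, an exploited CVE) is scored above a randomly chosen negative. An AUC of 1.0 therefore means every exploited CVE outranks every safe one. The sketch below computes it pairwise on made-up toy scores; these are not the 120-CVE data, and `auc_roc` is an illustrative helper rather than part of the package.

```python
from typing import Sequence


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (positive) or 0 (negative).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy data only: two "exploited" items (label 1), two "safe" items (label 0).
perfect = auc_roc([0.91, 0.84, 0.40, 0.22], [1, 1, 0, 0])
one_inversion = auc_roc([0.91, 0.30, 0.40, 0.22], [1, 1, 0, 0])

print(perfect)        # 1.0: every positive outranks every negative
print(one_inversion)  # 0.75: one of the four pairs is ranked the wrong way
```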

What VERDICT WEIGHT™ Does

Eight evidence streams → Three outputs → One decision.

Eight Evidence Streams

| Stream | Symbol | Description |
|---|---|---|
| 1. Source Reliability | SR | Credibility of the originating source (0.01–0.99) |
| 2. Cross-Feed Corroboration | CC | Independent confirmation across feeds |
| 3. Temporal Decay | TD | Recency of the intelligence signal |
| 4. Historical Source Accuracy | HA | Empirical track record of the source |
| 5. Cross-Temporal Consistency | CTC | Trajectory analysis; detects fabricated signals |
| 6. Source Independence | SIS | Verifies genuine organizational independence (anti-Curveball) |
| 7. Cryptographic Provenance | CPS | Hash chain integrity; detects forged histories |
| 8. Registry Integrity | RIS | Gate: halts scoring if registry is compromised |
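The Cryptographic Provenance stream is described as a hash chain integrity check. This README does not show how the package implements it, so the following is only a generic sketch of the technique: each entry stores the hash of its predecessor, so editing any earlier entry invalidates everything after it. The record layout, the `GENESIS` value, and all function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def build_chain(payloads: list) -> list:
    """Link each payload to its predecessor by hash."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = entry_hash(prev, payload)
        chain.append({"payload": payload, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited payload or broken link fails the check."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


history = build_chain([
    {"source": "feed-a", "claim": "indicator observed"},
    {"source": "feed-a", "claim": "indicator confirmed"},
])
print(verify_chain(history))  # True

history[0]["payload"]["claim"] = "rewritten after the fact"
print(verify_chain(history))  # False: the forged history no longer matches its hash
```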

Three Output Components

| Output | Symbol | Range | Description |
|---|---|---|---|
| Signal Strength | SS | 0–1 | Confidence the signal is real |
| Doubt Index | DI | 0–1 | Inter-stream disagreement |
| Consequence Weight | CW | 0–1 | Actionability after doubt adjustment |

Twelve Context Profiles

| Domain | Profile Type |
|---|---|
| Cybersecurity (General) | Corroboration-dominant |
| Cybersecurity (APT) | Source reliability elevated, slow decay |
| Cybersecurity (Zero-Day) | Temporal decay dominant |
| Cybersecurity (Disinformation) | Maximum corroboration + doubt penalty |
| Healthcare (Diagnostic) | High doubt penalty, surfaces uncertainty |
| Healthcare (Drug Safety) | Highest doubt penalty in registry |
| Financial (Fraud) | Corroboration + recency co-dominant |
| Financial (Market) | Temporal decay dominant, fast decay |
| Defense Intelligence | Multi-source fusion, slow strategic decay |
| Autonomous Vehicle | Sub-second decay, highest doubt penalty |
| Legal Evidence | Minimal decay, chain of custody weighted |
| Enterprise RAG | LLM retrieval confidence scoring |

Quick Start

pip install verdict-weight

from verdict_weight import VerdictWeight, ContextType

vw = VerdictWeight()

# Score a cybersecurity threat intelligence signal
result = vw.score(
    source_reliability=0.92,        # How credible is this source?
    n_corroborating_sources=3,       # How many independent sources confirm?
    age_value=2.5,                   # How old is this intelligence (days)?
    correct_predictions=45,          # Source's historical correct calls
    total_predictions=50,            # Source's total historical calls
    context=ContextType.CYBERSECURITY_APT
)

print(result.action_tier)           # CRITICAL
print(result.consequence_weight)    # 0.8380
print(result.doubt_index)           # 0.0691
print(result.interpretation)        # "Act immediately. High-confidence..."
print(result.to_json())             # Full JSON output

Adversarial signal — watch the suppression

# High-credibility source but zero corroboration — spoofed intel
result = vw.score(
    source_reliability=0.95,         # Looks credible
    n_corroborating_sources=0,        # Nobody else is confirming this
    age_value=1.0,
    context=ContextType.CYBERSECURITY_DISINFO
)

print(result.action_tier)           # NOISE
print(result.consequence_weight)    # 0.147 — suppressed by 84%

Score pre-computed streams directly

result = vw.score_streams(
    SR=0.92, CC=0.78, TD=0.94, HA=0.88,
    context=ContextType.FINANCIAL_FRAUD
)
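`score_streams` takes stream values that are already on a 0–1 scale. This README does not say how the raw Quick Start inputs (corroborating source count, age, prediction history) are turned into streams, so the sketch below uses assumed mappings purely to show the shape of the problem: a smoothed hit rate for HA, a half-life decay for TD, and a saturating curve for CC. None of these functions, formulas, or default parameters come from the package.

```python
import math


def clamp(value: float, low: float = 0.01, high: float = 0.99) -> float:
    """Keep a stream inside the 0.01–0.99 range quoted for Source Reliability."""
    return max(low, min(high, value))


def historical_accuracy(correct: int, total: int) -> float:
    """Assumed mapping: Laplace-smoothed hit rate (no history gives 0.5)."""
    if correct < 0 or total < correct:
        raise ValueError("need 0 <= correct <= total")
    return (correct + 1) / (total + 2)


def temporal_decay(age_days: float, half_life_days: float = 30.0) -> float:
    """Assumed mapping: exponential decay that halves every half_life_days."""
    if age_days < 0 or half_life_days <= 0:
        raise ValueError("age must be >= 0 and half-life must be > 0")
    return 0.5 ** (age_days / half_life_days)


def corroboration(n_sources: int, rate: float = 0.5) -> float:
    """Assumed mapping: saturating curve, 0 with no confirmation, approaching 1."""
    if n_sources < 0:
        raise ValueError("n_sources must be >= 0")
    return 1.0 - math.exp(-rate * n_sources)


# Hypothetical pre-computed streams, ready to pass on as SR, CC, TD, HA.
streams = {
    "SR": clamp(0.92),
    "CC": clamp(corroboration(3)),
    "TD": clamp(temporal_decay(2.5)),
    "HA": clamp(historical_accuracy(45, 50)),
}
print({name: round(value, 4) for name, value in streams.items()})
```

A profile with fast decay would use a much shorter half-life than one with slow strategic decay, which is one way the per-domain behavior in the profile table could be expressed.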

Repository Structure

verdict-weight/
├── verdict_weight/
│   ├── __init__.py          # Public API
│   └── core.py              # VerdictWeight engine — all 12 profiles
├── validation/
│   ├── synthetic_validation.py   # N=10,000 validation (seed=42)
│   └── ablation_study.py         # 324-config weight ablation
├── examples/
│   ├── cybersecurity.py
│   └── healthcare.py
├── docs/
│   └── VERDICT_WEIGHT_Paper.pdf  # SSRN 6532658
├── pyproject.toml
├── requirements.txt
└── README.md

Mathematical Foundation

Signal Strength (weighted geometric mean):

SS = ∏(S_i + ε)^w_i   where Σw_i = 1.0

Doubt Index (normalized coefficient of variation):

DI = clip(σ(SR,CC,TD,HA) / μ(SR,CC,TD,HA), 0, 1)

Consequence Weight:

CW = clip(SS × (1 - δ × DI), 0, 1)

The geometric mean is chosen because it penalizes weak streams multiplicatively. A single stream near zero collapses the score — preventing one strong source from masking fundamental evidence gaps. This is the structural guarantee behind the 57.8% adversarial suppression result.
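The three formulas can be sketched in a few lines of Python. This is a minimal illustration, not the package's `core.py`: the equal weights, the value of ε, the doubt penalty δ, and the use of the population standard deviation are all assumptions, since the README does not state them. The two example signals are made up.

```python
import math
import statistics

EPSILON = 1e-6  # assumed value of ε
DELTA = 0.5     # assumed doubt penalty δ (profiles presumably vary this)
WEIGHTS = {"SR": 0.25, "CC": 0.25, "TD": 0.25, "HA": 0.25}  # assumed equal weights


def clip(value: float, low: float = 0.0, high: float = 1.0) -> float:
    return max(low, min(high, value))


def signal_strength(streams: dict, weights: dict = WEIGHTS, eps: float = EPSILON) -> float:
    """SS = product of (S_i + eps) ** w_i, with the weights summing to 1."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    return math.prod((streams[name] + eps) ** w for name, w in weights.items())


def doubt_index(streams: dict) -> float:
    """DI = clip(sigma / mu) over SR, CC, TD, HA (population sigma assumed)."""
    values = [streams[name] for name in ("SR", "CC", "TD", "HA")]
    mean = statistics.fmean(values)
    if mean == 0.0:
        return 1.0  # assumed convention: no evidence at all means maximum doubt
    return clip(statistics.pstdev(values) / mean)


def consequence_weight(streams: dict, delta: float = DELTA) -> float:
    """CW = clip(SS * (1 - delta * DI))."""
    return clip(signal_strength(streams) * (1.0 - delta * doubt_index(streams)))


corroborated = {"SR": 0.92, "CC": 0.78, "TD": 0.94, "HA": 0.88}
uncorroborated = {"SR": 0.95, "CC": 0.0, "TD": 0.97, "HA": 0.88}

for label, streams in (("corroborated", corroborated), ("uncorroborated", uncorroborated)):
    arithmetic = statistics.fmean(list(streams.values()))
    print(f"{label}: arithmetic mean={arithmetic:.3f}, CW={consequence_weight(streams):.3f}")
```

With CC at zero, the arithmetic mean of the four streams stays around 0.7, while the geometric-mean-based CW drops to a few hundredths. That is the collapse behavior described above.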


Citation

@misc{byrd2026verdictweight,
  title={VERDICT WEIGHT: A Context-Adaptive Multi-Source Confidence Synthesis
         Framework for Autonomous AI Intelligence Systems},
  author={Byrd, Andre},
  year={2026},
  howpublished={SSRN Preprint},
  note={SSRN Abstract ID: 6532658},
  url={https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6532658},
  doi={10.5281/zenodo.19447547}
}

Reproducibility

Results are fully reproducible:

  1. Clone this repository
  2. Run python validation/synthetic_validation.py
  3. Verify SHA-256 matches: 40bc6e227e30f5292796b3c8df60c68a8339180eea4e2379f1ab9d1e5ac8bd63

Master seed: 42 (never changes — all results are deterministic)
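Step 3 can be done with the standard library. The sketch below hashes a file in chunks and compares it with the digest published above. The README does not name the file that `synthetic_validation.py` writes, so the path is left to the caller, and the function names are illustrative.

```python
import hashlib
from pathlib import Path

# Dataset digest as published in this README.
EXPECTED_SHA256 = "40bc6e227e30f5292796b3c8df60c68a8339180eea4e2379f1ab9d1e5ac8bd63"


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's digest equals the expected hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

Call `matches_expected(...)` with the path of the dataset file the validation script produces. A `False` result means the generated data differs from the published dataset.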

Test coverage: 673 tests passing across 27 suites, validated against 1,270,000+ scenarios, including:

  • Monte Carlo stress testing
  • Adversarial optimization attacks
  • Property-based blind testing
  • Statistical robustness across 100 independent random seeds
  • Formal verification over 973,000 exhaustive inputs
  • Head-to-head comparison against Dempster-Shafer, Naive Bayes, averaging, and max voting
  • Real-world validation on 120 CVEs from NIST NVD and CISA KEV


Legal

VERDICT WEIGHT™ is a trademark of Six Sense Enterprise Services LLC (Odingard Security). USPTO Serial Number: 99747827. Patent Pending — Application #64/032,606.

© 2026 Six Sense Enterprise Services LLC. All rights reserved.

This software is made available for research and evaluation purposes. Commercial deployment requires a license agreement.

For licensing: andre.byrd@odingard.com
Paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6532658
DOI: https://doi.org/10.5281/zenodo.19447547


