CLI for creating, validating, and signing Unit of Assurance evidence packages for computational modeling and simulation credibility
Unit of Assurance (UofA) — v0.5.2
The Unit of Assurance is the smallest independently verifiable bundle of credibility evidence for computational modeling and simulation (CM&S). It packages the credibility decision — who judged what, against what criteria, using what evidence, with what result — as a signed, provenance-linked, machine-verifiable engineering artifact.
Conference Attendees: Run the 30-Second Demo
pip install uofa
uofa demo
The bundled fixture exercises the full C1 (signature + integrity) + C2 (SHACL) + C3 (Jena rule engine) pipeline against a small pre-computed UofA artifact — no Java install, no LLM runtime, no internet required. Use it to verify "yes, this tool actually does what the speaker claimed" in under a minute.
When you're ready to encode your own evidence, see the Quick Start below.
Quick Start: Create Your Own UofA
No install required: open the repository in GitHub Codespaces for a ready-to-use environment with Python, Java, and the uofa CLI pre-installed.
Or run locally:
# 1. Install the uofa CLI (one command — bundles Python deps, the rule
# engine JAR, and an OpenJDK 17 JRE inside the wheel; no Java or Maven
# install required).
pip install uofa
# 2. Import from Excel (fastest on-ramp for practitioners)
uofa import my-assessment.xlsx --sign --key keys/research.key --check
# — OR — scaffold from a JSON-LD template
uofa init my-project
# Edit my-project/my-project-cou1.jsonld — fill in your project details
uofa sign my-project/my-project-cou1.jsonld --key my-project/keys/my-project.key
uofa check my-project/my-project-cou1.jsonld
Platform wheels are published for macOS (arm64 + x86_64), Linux (x86_64 +
aarch64), and Windows (x86_64). The uofa[extract] extra adds the
LLM-backed prose-to-UofA pipeline; see uofa setup --help for one-time
runtime installation.
New to UofA? See the Onboarding Guide for a step-by-step walkthrough, or study the Morrison demo below.
Why UofA?
UofA exists because the credibility frameworks are not the problem. ASME V&V 40, NASA-STD-7009B, and the FDA's 2023 guidance on CM&S credibility provide clear instructions for how to assess simulation credibility. The problem is the last mile: there is no standardized construct for packaging, transmitting, and verifying the evidence and decisions those assessments produce.
The result is predictable. Credibility decisions live in prose PDFs. Evidence is scattered across tools. Provenance is partial. Audit packaging is manual. And reviewers catch quality gaps by intuition rather than automation.
UofA addresses this through three contributions:
| Contribution | What it does | Mechanism |
|---|---|---|
| C1 — Decision as artifact | Captures the credibility decision as a portable, tool-independent object with provenance lineage and integrity guarantees | JSON-LD + PROV-DM + SHA-256 hash + ed25519 digital signatures |
| C2 — Completeness enforcement | Defines what a UofA must contain at each rigor level and enforces it as a computable constraint | SHACL profiles (Minimal / Complete) with format-validated integrity fields |
| C3 — Quality gates | Detects substantive credibility gaps — missing UQ, orphan claims, acceptance criteria gaps — including compound risks that no individual query can find | Jena forward-chaining rule engine with compound inference |
Live Demo: Morrison Blood Pump (FDA V&V 40 Case Study)
The packs/vv40/examples/morrison/ directory contains complete, working UofA evidence packages built from Morrison et al. (2019) — an FDA OSEL co-authored V&V 40 credibility assessment for a centrifugal blood pump. This is the most widely cited V&V 40 worked example.
What the demo shows:
Morrison prose assessment      →   UofA structured evidence package
"model deemed credible"            JSON-LD with 13 V&V 40 factors,
scattered across 10 pages          provenance chain, integrity hash,
of journal article                 machine-verifiable in 30 seconds
Run it yourself:
pip install uofa # bundles the rule engine JAR + an OpenJDK 17 JRE
# Run the full C1 + C2 + C3 pipeline in one command
uofa check packs/vv40/examples/morrison/cou1/uofa-morrison-cou1.jsonld
That single command runs three checks:
| Step | Command | What it does |
|---|---|---|
| C2 | uofa shacl FILE | SHACL Complete profile validation — all required fields present |
| C1 | uofa verify FILE | SHA-256 hash + ed25519 signature verification — content untampered |
| C3 | uofa rules FILE | Jena rule engine — 23 forward-chaining rules (21 core + 2 compound) detect quality gaps |
The bundled JAR + JRE inside the wheel mean no Maven, no separate Java
install, and no --build flag is needed. Source-tree contributors can
still build the JAR via cd src/weakener-engine && mvn package and run from
their own checkout — the bundled JRE only activates inside an installed
wheel.
What the rule engine finds on Morrison COU1 at v0.5.2 (24 weakeners across 9 patterns):
| Pattern | Severity | Hits | What it detects |
|---|---|---|---|
| W-EP-01 | Critical | 1 | Orphan claim — no evidence chain to supporting data |
| W-EP-02 | High | 3 | Broken provenance — validation results with no generation activity |
| W-AL-01 | High | 3 | Missing uncertainty quantification on validation results |
| W-AR-05 | High | 3 | Comparator absence — results not linked to reference entities |
| W-CON-01 | High | 6 | Accepted decision with factors lacking both requiredLevel and achievedLevel |
| W-CON-04 | Medium | 1 | Complete profile with no sensitivity analysis linked |
| W-ON-02 | High | 1 | COU lacks both applicability constraint and operating envelope |
| ⚡ COMPOUND-01 | Critical | 5 | Risk escalation — Critical + High weakeners coexist on same UofA |
| ⚡ COMPOUND-03 | High | 1 | Assurance level override — declared "Medium" but Critical gaps exist |
The v0.5.2 catalog includes 23 weakener patterns (21 core plus 2 compound) spanning epistemic, aleatoric, ontological, structural, consistency, provenance, and argumentation categories. Run uofa catalog to list the full set. The Morrison COU1 example fires 9 of those 23 (7 Level-1 rules plus 2 compound rules).
The ⚡ compound rules fire on the output of the core rules — this is chained forward-chaining inference that standalone SPARQL queries cannot produce. Same model, same data, same rules: the rule engine reasons about the interactions between gaps, not just the gaps themselves.
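As a quick cross-check of the Morrison COU1 table above, the hit counts can be tallied in a few lines of Python. The tuples are transcribed from that table; the names `MORRISON_COU1_HITS` and `summarize` are illustrative and not part of the uofa CLI.

```python
from collections import Counter

# (pattern, severity, hits) transcribed from the Morrison COU1 table.
MORRISON_COU1_HITS = [
    ("W-EP-01", "Critical", 1),
    ("W-EP-02", "High", 3),
    ("W-AL-01", "High", 3),
    ("W-AR-05", "High", 3),
    ("W-CON-01", "High", 6),
    ("W-CON-04", "Medium", 1),
    ("W-ON-02", "High", 1),
    ("COMPOUND-01", "Critical", 5),
    ("COMPOUND-03", "High", 1),
]


def summarize(hits):
    """Return (total weakeners, distinct patterns, per-severity counts)."""
    by_severity = Counter()
    for _pattern, severity, count in hits:
        by_severity[severity] += count
    total = sum(count for _, _, count in hits)
    distinct = len({pattern for pattern, _, _ in hits})
    return total, distinct, dict(by_severity)


total, patterns, by_severity = summarize(MORRISON_COU1_HITS)
print(total, patterns, by_severity)
# 24 9 {'Critical': 6, 'High': 17, 'Medium': 1}
```

The totals agree with the "24 weakeners across 9 patterns" headline, and the High count of 17 includes the single COMPOUND-03 hit.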
COU Divergence: uofa diff
Morrison contains two Contexts of Use assessing the same CFD model:
- COU1 (CPB, Class II, Model Risk Level 2) → Decision: Accepted
- COU2 (VAD, Class III, Model Risk Level 5) → Decision: Not accepted
Same model. Same experimental data. Different credibility requirements driven by different model risk. The uofa diff command surfaces this divergence automatically:
uofa diff packs/vv40/examples/morrison/cou1/uofa-morrison-cou1.jsonld \
packs/vv40/examples/morrison/cou2/uofa-morrison-cou2.jsonld
════════════════════════════════════════════════════════
COU Divergence Analysis
════════════════════════════════════════════════════════
COU A COU B
Name COU1: Cardiopulmonary bypass use (Class II) COU2: Ventricular assist device use (Class III)
Device class Class II Class III
Model risk level MRL 2 MRL 5
Decision Accepted Not accepted
Assurance level Medium Low
Weakeners 9 6
══ Weakener Patterns (10) ══
┌────────────────────────────────────────────────────────────────┐
│ Pattern │ Severity │ COU A │ COU B │ Status │
├──────────────┼────────────┼─────────┼─────────┼──────────────┤
│ W-AL-01 │ [High] │ ✓ │ ✗ │ ◆ divergent │
│ W-AL-02 │ [Medium] │ ✗ │ ✓ │ ◆ divergent │
│ W-AR-05 │ [High] │ ✓ │ ✗ │ ◆ divergent │
│ W-CON-01 │ [High] │ ✓ │ ✗ │ ◆ divergent │
│ W-CON-04 │ [Medium] │ ✓ │ ✓ │ same │
│ W-EP-01 │ [Critical] │ ✓ │ ✗ │ ◆ divergent │
│ W-EP-02 │ [High] │ ✓ │ ✗ │ ◆ divergent │
│ W-EP-04 │ [High] │ ✗ │ ✓ │ ◆ divergent │
│ W-ON-02 │ [High] │ ✓ │ ✓ │ same │
│ W-PROV-01 │ [Critical] │ ✗ │ ✓ │ ◆ divergent │
└──────────────┴────────────┴─────────┴─────────┴──────────────┘
══ Compound Patterns (2) ══
┌────────────────────────────────────────────────────────────────┐
│ Pattern │ Severity │ COU A │ COU B │ Status │
├──────────────┼────────────┼─────────┼─────────┼──────────────┤
│ COMPOUND-01 │ [Critical] │ ✓ │ ✓ │ same │
│ COMPOUND-03 │ [High] │ ✓ │ ✗ │ ◆ divergent │
└──────────────┴────────────┴─────────┴─────────┴──────────────┘
══ Summary ══
COU A (COU1: Cardiopulmonary bypass use (Class II)):
[Critical] 2
[High] 6
[Medium] 1
COU B (COU2: Ventricular assist device use (Class III)):
[Critical] 2
[High] 2
[Medium] 2
9 divergence(s) detected
══ Divergence Explanations ══
[High] COMPOUND-03 — only in COU A
COU1: Cardiopulmonary bypass use (Class II): Assurance level is not Low, yet Critical weakeners exist — stated assurance level may be overstated.
COU2: Ventricular assist device use (Class III): pattern does not fire.
[High] W-AL-01 — only in COU A
COU1: Cardiopulmonary bypass use (Class II): Validation result has no uncertainty quantification — aleatory uncertainty is uncharacterized.
COU2: Ventricular assist device use (Class III): pattern does not fire.
[High] W-AR-05 — only in COU A
COU1: Cardiopulmonary bypass use (Class II): Validation result has no comparedAgainst link — comparator data source is absent.
COU2: Ventricular assist device use (Class III): pattern does not fire.
[High] W-CON-01 — only in COU A
COU1: Cardiopulmonary bypass use (Class II): Decision is Accepted but a credibility factor has neither requiredLevel nor achievedLevel — the acceptance rests on an unestablished factor.
COU2: Ventricular assist device use (Class III): pattern does not fire.
[Critical] W-EP-01 — only in COU A
COU1: Cardiopulmonary bypass use (Class II): Claim has no prov:wasDerivedFrom link to evidence — provenance chain is broken.
COU2: Ventricular assist device use (Class III): pattern does not fire.
[High] W-EP-02 — only in COU A
COU1: Cardiopulmonary bypass use (Class II): Validation result has no prov:wasGeneratedBy — generation activity is missing.
COU2: Ventricular assist device use (Class III): pattern does not fire.
[Medium] W-AL-02 — only in COU B
COU2: Ventricular assist device use (Class III): Uncertainty quantification is reported but no sensitivity analysis is linked — the drivers of uncertainty are undocumented.
COU1: Cardiopulmonary bypass use (Class II): pattern does not fire.
[High] W-EP-04 — only in COU B
COU2: Ventricular assist device use (Class III): Credibility factor is not assessed but model risk level exceeds 2 — unassessed factors at elevated risk weaken the credibility argument.
COU1: Cardiopulmonary bypass use (Class II): pattern does not fire.
[Critical] W-PROV-01 — only in COU B
COU2: Ventricular assist device use (Class III): Provenance chain terminates at a node that has no upstream derivation/generation/use edge and is not marked uofa:isFoundationalEvidence=true — chain is incomplete.
COU1: Cardiopulmonary bypass use (Class II): pattern does not fire.
The output has four sections: identity block (side-by-side COU metadata), weakener profile table (✓/✗ presence with divergence markers), summary counts (per-severity breakdown), and divergence explanations (from the description field on each WeakenerAnnotation — generated by the rule engine, not hardcoded in the diff command).
Compound patterns (COMPOUND-*) are separated into their own sub-table when present, since they fire on the output of Level 1 rules.
This divergence is invisible in the prose paper. It becomes machine-visible in the UofA. That's C1: the credibility decision — not just the evidence — captured as a first-class artifact.
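The presence/absence comparison in the tables above reduces to set arithmetic over pattern IDs. The sketch below is not the uofa diff implementation; it only reproduces the counting, using pattern sets transcribed from the output, and the function name `divergence` is hypothetical.

```python
# Pattern IDs present in each COU, transcribed from the diff tables.
COU_A = {"W-AL-01", "W-AR-05", "W-CON-01", "W-CON-04", "W-EP-01",
         "W-EP-02", "W-ON-02", "COMPOUND-01", "COMPOUND-03"}
COU_B = {"W-AL-02", "W-CON-04", "W-EP-04", "W-ON-02", "W-PROV-01",
         "COMPOUND-01"}


def divergence(a, b):
    """Classify every pattern as shared, only in A, or only in B."""
    return {
        "same": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


result = divergence(COU_A, COU_B)
divergent = len(result["only_a"]) + len(result["only_b"])
print(f"{divergent} divergence(s) detected")
# 9 divergence(s) detected
```

Six patterns appear only in COU A and three only in COU B, which matches the nine entries listed under Divergence Explanations.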
Live Demo: HPT Blade CHT (NASA-STD-7009B, Aerospace)
The packs/nasa-7009b/examples/aerospace/ directory contains a parallel NASA-STD-7009B case study — an HPT turbine-blade conjugate heat transfer CFD model assessed for two operating points:
- COU1 (take-off transient, MRL 3) → Decision: Accepted with conditions
- COU2 (cruise steady-state, MRL 4) → Decision: Not accepted
Same CFD model, same cascade-rig validation data, re-purposed for a different operating regime — reproducing the Morrison divergence mechanism in aerospace. The bundles ship as zipped evidence folders (10 docs each — narrative DOCX, CFX solver settings, cascade CSVs, board minutes, decision rationale PDFs), so you can exercise the full extract → import → rules pipeline end-to-end on real input.
End-to-end roundtrip on COU1:
# 1. Extract: LLM reads 10 evidence documents, produces a pre-filled 19-factor xlsx
uofa extract tests/fixtures/extract/aero-evidence-cou1 \
--pack nasa-7009b --model ollama/qwen3.5:4b -o /tmp/aero-cou1.xlsx
# 2. Import: convert the xlsx to JSON-LD (add --sign --key KEYFILE to sign it)
uofa import /tmp/aero-cou1.xlsx --pack nasa-7009b -o /tmp/aero-cou1.jsonld
# 3. Rules: run the Jena weakener engine, write the reasoned jsonld
uofa rules /tmp/aero-cou1.jsonld --pack nasa-7009b \
--format jsonld -o /tmp/aero-cou1-reasoned.jsonld --build
The pack ships pre-computed reasoned outputs so you can skip to the interesting part:
# COU1 (Accepted) — W-AR-02 fires on narrative-stated level gaps
uofa rules packs/nasa-7009b/examples/aerospace/uofa-aero-cou1-nasa7009b.jsonld --pack nasa-7009b
# COU2 (Not Accepted) — W-AR-02 stays at zero despite 4+ not-assessed factors
uofa rules packs/nasa-7009b/examples/aerospace/uofa-aero-cou2-nasa7009b.jsonld --pack nasa-7009b
The divergence:
| Pattern | COU1 (Accepted) | COU2 (Not Accepted) |
|---|---|---|
| W-AR-02 (accept-despite-gap) | 4 fires on level gaps | 0 fires (hard gate) |
| W-EP-04 (not-assessed at MRL>2) | 1 | 4 |
| COMPOUND-01 (Critical + High) | 6 | 5 |
| W-NASA-02/03/06 (missing evidence linkage) | 1 each | 1 each |
| Total weakeners | 17 | 20 |
| Distinct patterns | 9 | 8 |
Why this matters: W-AR-02 (the rebutting-defeater rule) fires only when a decision says Accepted AND any factor has achievedLevel < requiredLevel. Flipping the decision to Not accepted disarms every instance of this rule — even though COU2 actually has more credibility gaps than COU1. That's the C3 rule engine correctly modeling the argument: a not-accepted decision has no "contradictory result ignored" to defeat. The same mechanism is visible in Morrison; here it repeats in aerospace.
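The firing condition described above can be sketched as a small Python predicate. This is an illustration of the logic only, not the Jena rule itself: the function name `w_ar_02_hits`, the boolean `accepted` argument, and the dict shape of each factor are assumptions, and the factor names in the example are made up.

```python
def w_ar_02_hits(accepted, factors):
    """Return the factors that would trigger an accept-despite-gap finding.

    Assumed shape: each factor is a dict that may carry integer
    'requiredLevel' and 'achievedLevel'. Factors missing either level
    are skipped here because there is no gap to compare.
    """
    if not accepted:
        return []  # a not-accepted decision has nothing to rebut
    return [
        f for f in factors
        if f.get("requiredLevel") is not None
        and f.get("achievedLevel") is not None
        and f["achievedLevel"] < f["requiredLevel"]
    ]


# Hypothetical factors for illustration only.
factors = [
    {"factorType": "example-factor-a", "requiredLevel": 3, "achievedLevel": 2},
    {"factorType": "example-factor-b", "requiredLevel": 2, "achievedLevel": 2},
    {"factorType": "example-factor-c"},  # not assessed
]

print(len(w_ar_02_hits(True, factors)))   # 1
print(len(w_ar_02_hits(False, factors)))  # 0
```

The same factor list yields one hit under an accepted decision and none once the decision flips, which is the hard-gate behavior the COU2 column shows.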
Reproduce the accuracy numbers:
# Factor F1 + weakener gate scoring, logs to dev/tools/scripts/extract_accuracy_log.jsonl
python dev/tools/scripts/score_extraction.py --pack nasa-7009b --case cou1 \
--model ollama/qwen3.5:4b --prompt-version v3-nasa-aero
python dev/tools/scripts/score_extraction.py --pack nasa-7009b --case cou2 \
--model ollama/qwen3.5:4b --prompt-version v3-nasa-aero
The scorer runs extract → import → rules end-to-end and asserts gates from tests/fixtures/extract/ground_truth/aero-cou{1,2}-nasa7009b.json. The hard gate for COU2 is W-AR-02 count == 0; if it ever fires, either the extracted decision outcome isn't "Not accepted" or the rule engine is mis-matching. Most recent live run: COU1 F1 = 0.97, COU2 F1 = 0.85, both weakener gates pass.
Standards Alignment
UofA is grounded in existing standards rather than inventing new ones:
- ASME V&V 40-2018 — Credibility factors, model risk framework, and the Context of Use (COU) concept that drives per-factor assessment
- FDA 2023 Final Guidance on CM&S Credibility — Regulatory expectations for credibility evidence in medical device submissions
- NASA-STD-7009B — NASA standard for models and simulations, including CM&S credibility assessment
- W3C PROV-DM / PROV-O — Provenance data model for artifact lineage
- W3C SHACL — Shapes Constraint Language for RDF graph validation
- JSON-LD 1.1 — Linked data serialization that stays human-readable
Integrity Verification
Every UofA carries a real cryptographic hash and digital signature — not placeholders.
| Level | What it checks | Mechanism |
|---|---|---|
| Format gate | Hash and signature are well-formed | SHACL sh:pattern regex on both Minimal and Complete profiles |
| Content verification | Hash matches the canonical document content | uofa verify recomputes SHA-256 from JSON canonical form |
| Cryptographic signature | Document was signed by the declared authority | ed25519 signature verification against the repo public key |
# Mint a sealed UofA (sign after edits)
uofa sign packs/vv40/examples/morrison/cou1/uofa-morrison-cou1.jsonld --key keys/research.key
# Verify integrity
uofa verify packs/vv40/examples/morrison/cou1/uofa-morrison-cou1.jsonld
Placeholder strings (e.g., sha256:placeholder...) now fail SHACL validation. This is deliberate — a UofA claiming ProfileComplete must carry a real hash.
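To make the first two levels concrete, here is a minimal Python sketch of a format gate and a content check. It rests on assumptions: the canonical form is taken to be sorted-key, compact UTF-8 JSON with the hash and signature fields removed, and the regex assumes lowercase hex. The real uofa canonicalization and SHACL pattern may differ, and ed25519 signature verification is left out because it needs a third-party library.

```python
import hashlib
import json
import re

HASH_RE = re.compile(r"^sha256:[0-9a-f]{64}$")  # assumed lowercase hex


def canonical_sha256(doc):
    """Hash a document with its integrity fields removed.

    Assumption: canonical form = UTF-8 JSON, sorted keys, compact
    separators. The actual uofa canonical form may differ.
    """
    body = {k: v for k, v in doc.items() if k not in ("hash", "signature")}
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":"),
                         ensure_ascii=False).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


def check_hash(doc):
    """Return (well_formed, matches_content) for the doc's hash field."""
    declared = doc.get("hash", "")
    well_formed = bool(HASH_RE.match(declared))
    return well_formed, well_formed and declared == canonical_sha256(doc)


doc = {"assuranceLevel": "Medium", "hash": "sha256:placeholder"}
print(check_hash(doc))            # (False, False): placeholder fails the format gate

doc["hash"] = canonical_sha256(doc)
print(check_hash(doc))            # (True, True)

doc["assuranceLevel"] = "High"    # edit after sealing
print(check_hash(doc))            # (True, False): content no longer matches
```

Separating the two results mirrors the table: a placeholder is rejected on format alone, while a well-formed but stale hash is caught only by recomputing it.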
The Jena Rule Engine (C3)
Quality gap detection uses Apache Jena forward-chaining rules, not just SPARQL queries. The rule engine operates in two levels:
Level 1 — Core detection rules (21 patterns in v0.5.2) match structural patterns against the evidence graph. Categories include epistemic (W-EP-*), aleatoric (W-AL-*), ontological (W-ON-*), structural (W-SI-*), consistency (W-CON-*), provenance (W-PROV-*), and argumentation (W-AR-*). Run uofa catalog for the full list with descriptions.
Level 2 — Compound inference rules (2 active in v0.5.2) fire on the output of Level 1 rules:
| Rule | What it detects |
|---|---|
| COMPOUND-01 | Critical + High weakeners coexist → escalated compound risk |
| COMPOUND-03 | Declared assurance level contradicts detected Critical gaps |
COMPOUND-02 ships in the rules file but is currently commented out pending v0.6 design review; uofa catalog filters it from listing output.
The compound rules are the key differentiator versus SPARQL. They reason about the interactions between gaps — something that requires chained forward-chaining inference. As of v0.5.2, all weakener rules (including previously Python-implemented W-CON-02, W-CON-05, W-PROV-01) evaluate in a single Jena forward-chaining pass, enabling compound rules to reason over the full weakener set.
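The two-level idea can be sketched in Python as a second pass over the severities produced by a first pass. This is a simplification under stated assumptions: the real rules are Jena forward-chaining rules that can fire several times per UofA, whereas the hypothetical `compound_findings` function below only reports whether each compound condition holds.

```python
def compound_findings(level1_severities, assurance_level):
    """Second-pass checks over the severities emitted by a first pass.

    COMPOUND-01: at least one Critical and one High finding coexist.
    COMPOUND-03: declared assurance level is not Low while a Critical
    finding exists.
    """
    findings = []
    has_critical = "Critical" in level1_severities
    has_high = "High" in level1_severities
    if has_critical and has_high:
        findings.append("COMPOUND-01")
    if has_critical and assurance_level != "Low":
        findings.append("COMPOUND-03")
    return findings


# Severity mix and declared levels taken from the Morrison diff output.
print(compound_findings(["Critical", "High", "Medium"], "Medium"))
# ['COMPOUND-01', 'COMPOUND-03']
print(compound_findings(["Critical", "High", "Medium"], "Low"))
# ['COMPOUND-01']
```

The second call matches Morrison COU2, where the declared level is Low and COMPOUND-03 does not fire.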
Plain-language explanations: --explain (v0.6.0)
uofa rules, check, diff, and shacl accept an --explain flag that
adds a plain-language interpretation block to the structured output. The
deterministic analysis remains the source of truth; the explanation is a
human-readable layer for regulatory affairs and validation engineers.
uofa rules my-package.jsonld --explain
uofa rules my-package.jsonld --explain --explain-max-items 3
uofa rules my-package.jsonld --explain --explain-format json
Default backend is bundled Ollama (qwen3.5:4b, local-only, free). For
higher quality or larger context, configure a remote backend in
uofa.toml or override per invocation:
uofa rules my-package.jsonld --explain \
--explain-backend anthropic \
--explain-model claude-sonnet-5-2026
# requires ANTHROPIC_API_KEY in environment
Results are cached at ~/.uofa/cache/explain.db — a second invocation
on the same input completes in <100 ms. Standalone re-interpretation of
cached output: uofa explain --from-file cache.json.
Full documentation:
- docs/explain.md — usage, output formats, caching, limitations
- docs/llm-config.md — [llm] section, supported backends, precedence
- docs/security.md — API key handling, threat model
Profiles
UofA uses a two-tier profile system. Minimal captures the bare evidence package. Complete adds the full credibility assessment.
Minimal Profile
The minimum viable UofA. Suitable for evidence capture during live pipeline execution or as a lightweight audit artifact.
| Property | Type | Purpose |
|---|---|---|
| bindsRequirement | IRI | The requirement this UofA substantiates |
| hasContextOfUse | IRI | The V&V 40 Context of Use for this assessment |
| hasValidationResult | IRI | At least one validation result |
| hasDecisionRecord | IRI | The credibility decision (accepted/rejected + rationale) |
| generatedAtTime | xsd:dateTime | When this UofA was created |
| hash | string | Content hash (format-validated: sha256:<64 hex chars>) |
| signature | string | Digital signature (format-validated: ed25519:<hex>) |
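As a rough pre-flight check before running SHACL, the Minimal property list can be tested with plain Python. This sketch only checks that each property is present and non-empty; it does not replace uofa shacl, and the IRIs and the `missing_minimal_fields` helper are hypothetical.

```python
# Property names taken from the Minimal profile table.
MINIMAL_REQUIRED = (
    "bindsRequirement", "hasContextOfUse", "hasValidationResult",
    "hasDecisionRecord", "generatedAtTime", "hash", "signature",
)


def missing_minimal_fields(doc):
    """Return Minimal-profile properties that are absent or empty."""
    return [key for key in MINIMAL_REQUIRED if not doc.get(key)]


# Hypothetical draft that has not been signed yet.
draft = {
    "bindsRequirement": "https://example.org/req/1",
    "hasContextOfUse": "https://example.org/cou/1",
    "hasValidationResult": ["https://example.org/result/1"],
    "hasDecisionRecord": "https://example.org/decision/1",
    "generatedAtTime": "2026-01-01T00:00:00Z",
}

print(missing_minimal_fields(draft))  # ['hash', 'signature']
```

An unsigned draft reports exactly the two integrity fields, which is the gap uofa sign is meant to close.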
Complete Profile
Extends Minimal with full V&V 40 credibility assessment, provenance chain, and quality metrics. Required for regulatory submissions and formal credibility arguments.
Everything in Minimal, plus:
| Property | Type | Purpose |
|---|---|---|
| bindsModel | IRI | The computational model assessed |
| bindsDataset | IRI | The dataset(s) used in validation |
| wasDerivedFrom | IRI | Provenance link to parent artifact |
| wasAttributedTo | IRI | Responsible actor or organization |
| hasCredibilityFactor | CredibilityFactor[] | Per-factor assessment (V&V 40 Table 5-1) |
| hasWeakener | WeakenerAnnotation[] | (optional) Detected quality gaps |
| credibilityIndex | xsd:decimal [0–1] | Overall credibility score |
| traceCompleteness | xsd:decimal [0–1] | Provenance chain completeness |
| verificationCoverage | xsd:decimal [0–1] | Verification evidence coverage |
| validationCoverage | xsd:decimal [0–1] | Validation evidence coverage |
| uncertaintyCIWidth | xsd:decimal [≥0] | Uncertainty confidence interval width |
| assuranceLevel | string | Low / Medium / High |
| criteriaSet | IRI | Reference criteria set (e.g., ASME-VV40-2018) |
CredibilityFactor
Each factor maps to one row in V&V 40 Table 5-1 or NASA-STD-7009B:
| Property | Constraint | Purpose |
|---|---|---|
| factorType | Factor name from the active pack's taxonomy | Which credibility factor is being assessed |
| factorStandard | String (e.g., "ASME-VV40-2018", "NASA-STD-7009B") | Which standard defines this factor |
| assessmentPhase | "capability" or "results" (NASA-STD-7009B only) | NASA CAS assessment phase |
| requiredLevel | Integer (1–5 for V&V 40, 0–4 for NASA-7009B) | Target credibility level for this COU |
| achievedLevel | Integer (1–5 for V&V 40, 0–4 for NASA-7009B) | Actual credibility level achieved |
| hasEvidence | IRI or IRI[] (optional) | Links to backing evidence entities |
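The per-standard level ranges in the table can be expressed as a small lookup. The sketch below is illustrative: the `level_errors` helper is hypothetical, and treating a missing level as "not assessed" rather than as an error is an assumption, not a statement about the SHACL shapes.

```python
# Inclusive level ranges from the CredibilityFactor table.
LEVEL_RANGES = {
    "ASME-VV40-2018": (1, 5),
    "NASA-STD-7009B": (0, 4),
}


def level_errors(factor):
    """Return messages for out-of-range levels on one factor dict."""
    low, high = LEVEL_RANGES[factor["factorStandard"]]
    errors = []
    for key in ("requiredLevel", "achievedLevel"):
        value = factor.get(key)
        if value is None:
            continue  # assumption: unassessed factors may omit levels
        if not isinstance(value, int) or not low <= value <= high:
            errors.append(f"{key} must be {low}-{high}, got {value!r}")
    return errors


vv40 = {"factorStandard": "ASME-VV40-2018", "requiredLevel": 0, "achievedLevel": 3}
nasa = {"factorStandard": "NASA-STD-7009B", "requiredLevel": 0, "achievedLevel": 3}

print(level_errors(vv40))  # ['requiredLevel must be 1-5, got 0']
print(level_errors(nasa))  # []
```

The same level 0 is invalid under V&V 40 but valid under NASA-STD-7009B, which is why factorStandard has to travel with each factor.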
WeakenerAnnotation
Quality gap annotations detected by the Jena rule engine (C3). Optional — a UofA with zero weakeners is valid (and desirable).
| Property | Constraint | Purpose |
|---|---|---|
| patternId | Format: W-XX-NN or COMPOUND-NN | Catalog ID from the weakener pattern taxonomy |
| severity | Critical / High / Medium / Low | Impact severity |
| affectedNode | IRI | The specific graph node flagged by this pattern |
| description | string (optional) | Human-readable explanation of why this weakener fires |
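A lightweight shape check for a single annotation might look like the Python sketch below. The regex is an assumption: it allows two to four uppercase letters in the category segment because IDs such as W-CON-01, W-PROV-01, and W-NASA-02 appear elsewhere in this README, so the authoritative pattern is whatever the SHACL shapes define. The helper name and the example IRI are hypothetical.

```python
import re

# Assumed pattern: W-<2 to 4 uppercase letters>-NN or COMPOUND-NN.
PATTERN_ID_RE = re.compile(r"^(W-[A-Z]{2,4}-\d{2}|COMPOUND-\d{2})$")
SEVERITIES = {"Critical", "High", "Medium", "Low"}


def annotation_errors(annotation):
    """Return a list of problems found in one WeakenerAnnotation dict."""
    errors = []
    if not PATTERN_ID_RE.match(annotation.get("patternId", "")):
        errors.append("patternId is not W-XX-NN or COMPOUND-NN")
    if annotation.get("severity") not in SEVERITIES:
        errors.append("severity must be Critical, High, Medium, or Low")
    if not annotation.get("affectedNode"):
        errors.append("affectedNode is required")
    return errors


good = {
    "patternId": "W-EP-01",
    "severity": "Critical",
    "affectedNode": "https://example.org/uofa/claim-1",  # hypothetical IRI
}
bad = {"patternId": "W-1", "severity": "Severe"}

print(annotation_errors(good))       # []
print(len(annotation_errors(bad)))   # 3
```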
Working with Your Own UofA
The uofa CLI provides commands for every step of the workflow:
# Extract credibility data from evidence documents with an LLM (pre-fills a pack xlsx)
uofa extract path/to/evidence/ --pack nasa-7009b --model ollama/qwen3.5:4b -o out.xlsx
# Import from a practitioner-filled Excel workbook (fastest on-ramp)
uofa import assessment.xlsx --sign --key keys/your.key --check
# Full pipeline (C1 + C2 + C3) on your file
uofa check path/to/your-uofa.jsonld
# Individual steps
uofa shacl path/to/your-uofa.jsonld # C2: SHACL validation
uofa verify path/to/your-uofa.jsonld # C1: Hash + signature check
uofa rules path/to/your-uofa.jsonld # C3: Jena weakener detection (text summary)
uofa rules FILE --format jsonld -o reasoned.jsonld # C3: write reasoned JSON-LD with weakener annotations
# Sign with your own key
uofa sign path/to/your-uofa.jsonld --key keys/your.key
# Scaffold a new project from a JSON-LD template
uofa init my-new-project
# Validate all examples in the repo
uofa validate
# Compare weakener profiles across two COUs
uofa diff uofa-cou1.jsonld uofa-cou2.jsonld
# List installed domain packs
uofa packs
# Use a specific domain pack
uofa check path/to/your-uofa.jsonld --pack vv40
# Use multiple packs (e.g., V&V 40 + NASA-STD-7009B)
uofa check path/to/your-uofa.jsonld --pack vv40 --pack nasa-7009b
# Migrate a v0.3 file to v0.4
uofa migrate path/to/old-file.jsonld
# Generate import constants from SHACL (after schema changes)
uofa schema --emit python
See the Onboarding Guide for a full walkthrough.
Adversarial Generation (research instrument)
uofa adversarial generate synthesizes JSON-LD evidence packages that target specific weakener patterns, then validates them against SHACL. The tool is an instrument for empirically characterizing rule coverage — it feeds the methodology section of Chapter 3 and the September 2026 JVVUQ paper. Synthetic packages are flagged and refused by uofa sign and uofa verify so they can never be mistaken for real evidence.
pip install -e '.[extract]' # one-time: adds litellm + pyyaml
export ANTHROPIC_API_KEY=sk-ant-... # generation defaults to claude-opus-4-7
# Generate 5 synthetic packages targeting W-AR-05 (comparator absence / mismatch)
uofa adversarial generate \
--spec dev/specs/confirm_existing/w_ar_05.yaml \
--out build/adversarial/w_ar_05/
# Dry-run: render the prompt without calling the LLM
uofa adversarial generate --spec dev/specs/confirm_existing/w_ar_05.yaml --out /tmp/dry --dry-run
# Run the full Phase 1 acceptance script
bash tests/adversarial/test_acceptance.sh
Every generated package carries an adversarialProvenance block (spec id, prompt template version, generation model, timestamp, target weakener) and a provenanceBlockHash that uofa verify recomputes to detect tampering with the synthetic flag. --strict-circularity refuses to run when the generation model matches the configured extract model; --allow-circular-model is an explicit opt-in for debugging runs.
Spec file format and the full design are documented in UofA_Adversarial_Gen_Spec_v1.1.md. Phase 1 ships the W-AR-05 (D3 undercutting) template; the registry in src/uofa_cli/adversarial/prompts/__init__.py scales to additional weakener patterns by adding keys.
Domain Packs
SHACL shapes, Jena rules, templates, and extraction prompts are organized into domain packs under packs/. The core pack ships with standards-agnostic credibility assessment rules (23 weakener patterns as of v0.5.2, up from 12 in v0.4). The vv40 pack provides the ASME V&V 40-2018 factor taxonomy (13 factors), and the nasa-7009b pack provides the NASA-STD-7009B factor taxonomy (19 factors, including 6 NASA-only lifecycle factors).
$ uofa packs
════════════════════════════════════════════════════════
Installed packs
════════════════════════════════════════════════════════
core v0.5.0 Core credibility assessment rules. Standards-agnostic. (any factors, 23 patterns) [always loaded]
nasa-7009b v0.5.0 NASA-STD-7009B credibility assessment factors (19 factors: 1... (19 factors, 6 patterns)
vv40 v0.5.0 ASME V&V 40-2018 credibility factor taxonomy (13 factors). (13 factors, 0 patterns) [active]
The --pack flag on any command switches the active pack(s). Multiple packs can be specified to combine factor taxonomies and rules. The default is --pack vv40 for backward compatibility. Per-project rules files next to the input file still take precedence over the pack default. See packs/README.md for the full pack contract and instructions for creating domain packs.
Excel Import: The Practitioner On-Ramp
Simulation engineers fill an Excel workbook, run one command, and get a signed, validated JSON-LD evidence package. The import pipeline handles URI generation, factor standard assignment, provenance tracking, and optional signing + validation in a single invocation.
pip install -e '.[excel]' # one-time: adds openpyxl dependency
# Import from Excel → JSON-LD, sign, and validate in one step
uofa import my-assessment.xlsx --sign --key keys/research.key --check --pack vv40
The Excel template has 5 sheets: Assessment Summary, Model & Data, Validation Results, Credibility Factors, and Decision. Each pack provides a pre-populated template with locked factor names and dropdown validation. See packs/vv40/templates/uofa-starter-filled.xlsx for a complete filled example.
| Feature | Detail |
|---|---|
| VV40 support | 13 V&V 40 factors, levels 1-5, factorStandard: "ASME-VV40-2018" |
| NASA-STD-7009B | 19 factors (13 shared + 6 NASA-only), levels 0-4, assessmentPhase auto-assigned |
| Evidence types | ValidationResult, ReviewActivity, ProcessAttestation, DeploymentRecord, InputPedigreeLink |
| Provenance | ImportActivity entry with timestamp, source file, and tool version |
| Error messages | Sheet name + cell reference (e.g., [Credibility Factors!C7] Required Level must be 1-5) |
| SHACL-synced | Factor names, level ranges, and enums are generated from SHACL shapes via uofa schema --emit python |
Prerequisites
Zero-install option: Open in GitHub Codespaces — everything is pre-installed.
Local install:
pip install -e '.[excel]' # installs uofa CLI + all Python deps + openpyxl for Excel import
| Tool | Version | Purpose |
|---|---|---|
| Python 3.10+ | Installed via pip install -e . | SHACL validation + integrity verification |
| openpyxl | Installed via pip install -e '.[excel]' | Excel import (uofa import) |
| Java 17+ | OpenJDK or equivalent | Jena rule engine (C3 only) |
| Maven 3.8+ | mvn package | Build the Jena fat JAR (C3 only) |
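Java and Maven are only needed to build and run the Jena rule engine (C3) from a source checkout; the published wheel (pip install uofa) bundles the rule engine JAR and an OpenJDK 17 JRE. Use uofa check FILE --skip-rules if Java is not available. openpyxl is only required for uofa import; all other commands work without it.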
Architecture: One UofA per Context of Use
UofA models credibility assessment at the COU level, not the individual factor level. Each UofA packages the complete credibility decision for one Context of Use — including all per-factor assessments as embedded CredibilityFactor nodes and any detected quality gaps as WeakenerAnnotation nodes.
Morrison Blood Pump Assessment
├── morrison/cou1/uofa-morrison-cou1.jsonld (ProfileComplete)
│ COU1: CPB Use (Class II) — Model Risk Level 2
│ ├── hasContextOfUse → COU1 node
│ ├── bindsRequirement → hemolysis safety requirement
│ ├── bindsModel → ANSYS CFX v.15.0 + Eulerian HI model
│ ├── bindsDataset → [PIV data, hemolysis in vitro data]
│ ├── hasValidationResult → [mesh convergence, PIV velocity, hemolysis comparison]
│ ├── hasCredibilityFactor → [13 V&V 40 factors: 7 assessed + 6 not-assessed]
│ ├── hasWeakener → [W-EP-01, W-EP-02 (3×), W-AL-01 (3×), W-AR-05 (3×), W-CON-01 (6×), W-CON-04, W-ON-02] + [COMPOUND-01 (5×), COMPOUND-03]
│ ├── hasDecisionRecord → "Accepted for COU1"
│ ├── hash → sha256:<real hash>
│ ├── signature → ed25519:<real signature>
│ └── wasDerivedFrom → Morrison DOI
│
└── morrison/cou2/uofa-morrison-cou2.jsonld (ProfileComplete)
COU2: VAD Use (Class III) — Model Risk Level 5
├── hasCredibilityFactor → [13 V&V 40 factors: 7 assessed + 6 not-assessed]
├── hasWeakener → [W-PROV-01 (7×), W-EP-04 (6×), W-ON-02, W-AL-02, W-CON-04] + [COMPOUND-01 (2×)]
└── At MRL 5 the risk-driven catalog shifts: W-PROV-01 dominates COU2 (7 provenance-chain orphans),
W-EP-04 fires 6× on not-assessed factors, and two of W-PROV-01's Criticals coexist with High
weakeners on `cou2` — triggering 2 COMPOUND-01 cascades that were unreachable pre-v0.5.2.
Shared entities (model, datasets, pump geometry) are referenced by IRI, not duplicated. The divergence between COU1 and COU2 weakener profiles is the central analytical demonstration.
Research Context
UofA is the subject of a Doctor of Engineering praxis at George Washington University. The evaluation uses two FDA case studies:
- Tier 1 (Retrospective): Morrison et al. (2019), the FDA generic centrifugal blood pump V&V 40 credibility assessment, re-expressed as UofA evidence packages with real cryptographic integrity. Full 13-factor assessment (7 assessed, 6 not-assessed) with risk-driven divergence across the v0.5.2 catalog:
  - Morrison COU1 (MRL 2, Accepted): 24 weakeners including 6 Critical (1 W-EP-01 orphan claim plus 5 COMPOUND-01 cascades from coexisting Critical and High weakeners), 17 High (W-CON-01 on 6 factors with missing level assertions under the Accepted decision, plus W-AL-01, W-AR-05, W-EP-02, W-ON-02), and 1 Medium (W-CON-04 structural gap).
  - Morrison COU2 (MRL 5, Not Accepted): 18 weakeners including 9 Critical (7 W-PROV-01 provenance-chain orphans plus 2 COMPOUND-01 cascades), 7 High (6 W-EP-04 on unassessed factors at elevated model risk, 1 W-ON-02), and 2 Medium.

  The cross-COU divergence (9 pattern-level divergences between COU1 and COU2) is the central analytical demonstration: same model, same data, different credibility requirements driven by different model risk produce measurably different credibility evidence profiles.
- Tier 2 (Prospective): FDA VICTRE pipeline, a live computational workflow instrumented to generate UofAs during execution rather than from retrospective documents.
- Tier 3 (Exploratory): Multi-component stress test on VICTRE that simulates change events to test continuous re-issuance and hierarchical credibility composition.
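The Tier 1 figures can be cross-checked with a few lines of Python. This is a minimal sketch and not part of the uofa CLI: the values are transcribed by hand from the summary above, and the names `COU1_SEVERITY`, `COU2_PATTERNS`, and `only_in` are illustrative. COU1 patterns are listed by ID only, because per-pattern counts for W-AL-01, W-AR-05, W-EP-02, and W-ON-02 are not stated here.

```python
from collections import Counter

# Severity tallies copied by hand from the Tier 1 summary (not CLI output).
COU1_SEVERITY = {"Critical": 6, "High": 17, "Medium": 1}
COU2_SEVERITY = {"Critical": 9, "High": 7, "Medium": 2}

# Pattern IDs named for COU1 (IDs only; per-pattern counts are not all given).
COU1_PATTERNS = {"W-EP-01", "COMPOUND-01", "W-CON-01", "W-AL-01",
                 "W-AR-05", "W-EP-02", "W-ON-02", "W-CON-04"}

# Pattern hit counts listed for COU2.
COU2_PATTERNS = Counter({"W-PROV-01": 7, "W-EP-04": 6, "W-ON-02": 1,
                         "W-AL-02": 1, "W-CON-04": 1, "COMPOUND-01": 2})


def only_in(a, b):
    """Return the sorted pattern IDs present in a but absent from b."""
    return sorted(set(a) - set(b))


cou1_total = sum(COU1_SEVERITY.values())
cou2_total = sum(COU2_SEVERITY.values())
cou2_pattern_total = sum(COU2_PATTERNS.values())

cou1_only = only_in(COU1_PATTERNS, COU2_PATTERNS)
cou2_only = only_in(COU2_PATTERNS, COU1_PATTERNS)
shared = sorted(set(COU1_PATTERNS) & set(COU2_PATTERNS))

print("COU1 total:", cou1_total)
print("COU2 total:", cou2_total, "(by pattern:", cou2_pattern_total, ")")
print("COU1 only:", cou1_only)
print("COU2 only:", cou2_only)
print("Shared:", shared)
```

The set difference only shows which pattern IDs appear in one context of use and not the other. It does not by itself reproduce the 9 pattern-level divergences quoted above, since the counting rule behind that figure is not defined in this section.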
Early findings — including the aerospace companion case study (HPT Blade CHT, NASA-STD-7009B) that reproduces the Morrison COU1/COU2 divergence mechanism in a turbomachinery domain — will be presented at NAFEMS Americas 2026 (May 27–29, St. Charles, MO).
Design Principles
| Principle | Meaning |
|---|---|
| Minimal | Small JSON-LD document, human-readable, one file per COU |
| Semantic | Aligns with PROV-O, V&V 40, and domain ontologies |
| Verifiable | Real SHA-256 hashes + ed25519 signatures + SHACL validation |
| Composable | UofAs form nodes in system-level assurance graphs via wasDerivedFrom |
| Tool-agnostic | Works with any simulation tool, MBSE platform, or ML pipeline |
| Hide the plumbing | Practitioners see completeness reports and gap alerts, not triples and SPARQL |
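To illustrate the Verifiable row, the sketch below shows the general idea of a content hash over a JSON document using only the Python standard library. It is a toy under stated assumptions, not the uofa implementation: the helper name `content_hash`, the sorted-key serialization, and the sample fields are all hypothetical, and a real JSON-LD pipeline may instead hash a canonical RDF form. Ed25519 signing and SHACL validation are left out because they need third-party libraries.

```python
import hashlib
import json


def content_hash(doc: dict) -> str:
    """Return the SHA-256 hex digest of a deterministic JSON serialization.

    Assumption: sorting keys and stripping whitespace is "canonical enough"
    for this demo. It is NOT a JSON-LD canonicalization algorithm.
    """
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical documents; the field names are made up for illustration.
original = {"id": "urn:example:uofa-1", "decision": "Accepted",
            "modelRiskLevel": 2}
reordered = {"modelRiskLevel": 2, "decision": "Accepted",
             "id": "urn:example:uofa-1"}
tampered = dict(original, decision="Not Accepted")

# Key order does not affect the digest; a changed value does.
same = content_hash(original) == content_hash(reordered)
changed = content_hash(original) != content_hash(tampered)
print(same, changed)
```

The point of the deterministic serialization step is that two semantically identical documents hash to the same value, so any digest mismatch signals a real content change rather than a formatting difference.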
License
Apache License, Version 2.0 — see LICENSE for the full text and NOTICE for bundled-software attributions.
The full project (UofA ontology, JSON-LD context, SHACL shapes, reference
examples, Jena rule implementations, and the CLI) is licensed under
Apache 2.0. Bundled third-party components retain their own licenses
as enumerated in NOTICE (e.g., OpenJDK GPLv2-CE, Ollama MIT).
Contributing
Contributions are welcome, especially real-world UofA examples from practitioners working with CM&S credibility assessment. If you are preparing a CM&S-supported regulatory submission and want to explore UofA packaging for your evidence, please reach out.
For contributors looking to add features or fix bugs:
- Repo layout: top-level orientation; quick reference for finding code, specs, schemas, outputs, and tooling. Disambiguates `spec/` (the v0.5 schema) vs `dev/specs/` (adversarial spec YAMLs).
- Onboarding Guide: combined quick-start + architecture + contributor guide. Covers CLI design, subcommand patterns, test structure, and step-by-step instructions for adding new commands, weakener rules, and schema changes.
- Phase 2.5 tooling: the metric-gated catalog refinement loop + per-rule corpus regen tools (referenced by recent versions v0.5.7 → v0.5.15.1).
Website: crediblesimulation.com