A fail-closed verification layer you wrap around your LLM: it never lets an unsupported claim through as ACCEPT, re-derives what is re-derivable, and emits a deterministic verifiable reward.

CAPAS INTELIGENTES

Badges: CAPAS CI · Conformance · OpenSSF Scorecard · License · Binder · Open In Colab

Verify it yourself — free, neutral infrastructure (don't trust us, reproduce it)

A claim is only worth the badge if a third party can reproduce it. Every public claim here is checkable on infra we don't control:

  • Conformance on neutral CI, signed. GitHub Actions runs python3 benchmarks/conformance.py on every push and Sigstore-signs the result into a public transparency log (keyless). The badge above is that run — green means the load-bearing invariants hold on GitHub's runners, not just a laptop.
  • Run a claim yourself — the Binder / Colab badges open notebooks/run_a_claim.ipynb: install nothing, run the gate in your browser, get the same verdict and audit hash we publish.
  • Independent supply-chain score — the OpenSSF Scorecard badge is a third party's assessment.
  • Every number is re-derivable — see docs/CLAIMS_REGISTRY.md; each claim carries the command that reproduces it. Governance: GOVERNANCE.md.

Your turn (account-gated, one-time):

  • connect the repo to Zenodo + cut a release → mints a citable DOI (.zenodo.json is ready);
  • add a PyPI Trusted Publisher → pip install goes live via publish.yml;
  • push spaces/ to a Hugging Face Space → a neutral hosted demo;
  • register at bestpractices.dev for the OpenSSF Best Practices badge.

CAPAS does not determine truth. It evaluates whether supplied evidence licenses a specific claim under a declared admissibility contract — deterministically, with no language model in the verdict, and re-derivably (same input → same verdict → same audit hash). A claim + its structured evidence goes in; ACCEPT / REWRITE / REJECT / HOLD comes out.
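
The re-derivability property (same input → same verdict → same audit hash) can be illustrated with a small sketch. This page does not specify how CAPAS builds its audit hash, so the `audit_hash` helper and the canonical-JSON-plus-SHA-256 scheme below are assumptions chosen for illustration only.

```python
import hashlib
import json


def audit_hash(payload: dict, verdict: str) -> str:
    """Hash a payload and its verdict independently of dict key order.

    Hypothetical helper: canonical JSON (sorted keys, fixed separators)
    followed by SHA-256. The real CAPAS audit hash may be built differently.
    """
    canonical = json.dumps(
        {"payload": payload, "verdict": verdict},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two payloads that differ only in key order produce the same hash.
a = {"claim": {"text": "E0 matches reference"}, "evidence": {"abs_error": 1e-9}}
b = {"evidence": {"abs_error": 1e-9}, "claim": {"text": "E0 matches reference"}}
same = audit_hash(a, "ACCEPT") == audit_hash(b, "ACCEPT")
```

Canonicalizing before hashing is what makes the hash a function of the content rather than of its serialization; changing the verdict or any evidence value changes the digest.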

CAPAS ships open-core + open standard (Apache-2.0). The defensible asset is the schema, the admissibility calculus, the certificate, and the benchmark corpus — not the code. It is a reference implementation in the lineage of OPA / SPDX / RO-Crate, not a closed SaaS. The CAPAS name, certification, and official benchmark are reserved (see NOTICE).

It gates its own claims: every public number is CLOSED (proven by a test), BACKED (regenerates from a command), or SCOPED (a declared estimate) — see docs/CLAIMS_REGISTRY.md. Nothing bare.
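
As a rough sketch of the "nothing bare" rule, a registry check might flag any claim that lacks one of the three statuses. The registry shape (`id`, `status` keys) and the `bare_claims` helper are hypothetical; the actual layout of docs/CLAIMS_REGISTRY.md is not shown on this page.

```python
from enum import Enum


class ClaimStatus(Enum):
    CLOSED = "CLOSED"  # proven by a test
    BACKED = "BACKED"  # regenerates from a command
    SCOPED = "SCOPED"  # a declared estimate


def bare_claims(registry: list[dict]) -> list[str]:
    """Return ids of claims whose status is missing or not a known status."""
    valid = {status.value for status in ClaimStatus}
    return [c.get("id", "<no id>") for c in registry if c.get("status") not in valid]


# Hypothetical registry entries, for illustration only.
registry = [
    {"id": "claim-1", "status": "CLOSED"},
    {"id": "claim-2", "status": "BACKED"},
    {"id": "claim-3"},  # bare: no status declared
]
flagged = bare_claims(registry)
```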

It is not a new provenance standard, benchmark suite, workflow engine, or VVUQ methodology. CAPAS is a profile/costurero (Spanish for "sewing kit": a layer that stitches existing pieces together) over existing standards:

RO-Crate/PROV packaging + sealed route/result trace + VVUQ-style physical evidence + witness independence + honest no-evidence/failure/rejection states.

Who This Is For

CAPAS is for people auditing AI-generated scientific-computation outputs:

  • AI-for-science agent benchmark builders,
  • scientific workflow provenance / RO-Crate users,
  • quantum many-body verification and benchmark users,
  • teams deciding whether a computation supports a strong claim, a weaker rewrite, rejection, or hold.

CAPAS is not for users who only need a faster simulator or generic workflow lineage.

Hybrid Pipeline Role

CAPAS is the deterministic middle of a hybrid AI-for-science verification pipeline:

LLM / extractor / scientific verifier upstream
        -> claim + evidence JSON
        -> CAPAS deterministic claim gate
        -> ACCEPT / REWRITE / REJECT / HOLD

CAPAS now includes an explicit upstream MVP for retrieval/parsing/alignment, but the final decision remains deterministic. Local corpus retrieval, web retrieval with --allow-web, and PDF parsing can prepare candidate evidence; they do not silently invent missing evidence or turn CAPAS into a broad scientific oracle.

The standalone direction has started with a local upstream MVP:

local text / solver log / code excerpt
        -> explicit evidence extraction
        -> deterministic claim-text alignment
        -> CAPAS claim gate

It is intentionally explicit-only and deterministic. See docs/STANDALONE_PRODUCT_ROADMAP.md.

External Review Packet

For reviewers and potential users:

  • Technical note: docs/CAPAS_TECHNICAL_NOTE.md
  • Request for feedback: docs/REQUEST_FOR_FEEDBACK.md
  • QMB100 / PhysVEC one-pager: docs/CAPAS_ONE_PAGER_QMB100.md
  • Global SotA / market audit: docs/GLOBAL_SOTA_MARKET_AUDIT.md
  • Hybrid pipeline positioning: docs/HYBRID_PIPELINE_POSITIONING.md
  • Input hardening notes: docs/INPUT_HARDENING.md
  • Product scope and debt status: docs/PRODUCT_SCOPE_AND_DEBTS.md
  • Public demo: https://fomv9354lve.github.io/capas-inteligentes/
  • Current release: https://github.com/fomv9354lve/capas-inteligentes/releases/tag/v0.1.1
  • Public feedback issue: https://github.com/fomv9354lve/capas-inteligentes/issues/1
  • License: Apache-2.0 (open-core); the CAPAS mark + official certification are reserved — see LICENSE and NOTICE.

The current validation question is not "is CAPAS broadly useful?" It is narrower: do the explicit evidence fields help audit or exchange one scientific-agent computation result, or are they already covered by existing artifacts?

Try In 60 Seconds

python -m pip install -e .
capas decide --input examples/external_claim_rewrite.json
capas batch --input examples/external_claim_batch.json --json
capas provenance-check --input examples/external_claim_fine_tune_ready.json --json
capas pipeline --input examples/standalone_pipeline_semantic_hold.json
capas inspect trace_039
capas validate

Expected behavior:

  • decide returns REWRITE when local checks pass but a universal anchor fails.
  • batch applies the same deterministic gate to multiple claim/evidence objects.
  • pipeline can block a numeric ACCEPT when claim.text overstates the scope of the evidence.
  • inspect trace_039 shows an engine-backed (motor-backed) claim-transition trace.
  • validate runs the product gates and profile/RO-Crate checks.
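
The REWRITE behavior described above can be sketched as a toy rule. This is not the CAPAS admissibility calculus: it assumes `local_property_tests` and `universal_anchor` are plain booleans, and mapping a failed local check to REJECT is an assumption made here to keep the example complete.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "ACCEPT"
    REWRITE = "REWRITE"
    REJECT = "REJECT"
    HOLD = "HOLD"


def toy_gate(evidence: dict) -> Verdict:
    """Toy decision rule over two boolean evidence fields (True = passed)."""
    local = evidence.get("local_property_tests")
    anchor = evidence.get("universal_anchor")
    # Fail closed: missing evidence never becomes ACCEPT.
    if local is None or anchor is None:
        return Verdict.HOLD
    if local and anchor:
        return Verdict.ACCEPT
    if local and not anchor:
        # Local checks pass but the universal anchor fails: weaken the claim.
        return Verdict.REWRITE
    # Assumption for this sketch: failed local checks contradict the claim.
    return Verdict.REJECT


rewrite = toy_gate({"local_property_tests": True, "universal_anchor": False})
```

The rule is a pure function of its input, so repeated calls on the same evidence always return the same verdict.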

Release assets:

https://github.com/fomv9354lve/capas-inteligentes/releases/tag/v0.1.1

Product Demo

Install the local package entrypoint:

python -m pip install -e .

Run the product surface:

capas demo
python3 benchmarks/verify_capas_product_demo.py

The demo writes:

  • outputs/capas_product_demo_report.json
  • outputs/capas_product_demo_report.md

It demonstrates the current product contract: CAPAS reads sealed scientific computation traces and emits evidence-typed claim decisions (ACCEPT, REWRITE, REJECT, HOLD) without marking the corpus as fine-tune-ready.

Run the core product validators:

capas validate

Run the local integration API:

CAPAS_API_TOKEN=change-me CAPAS_AUDIT_DIR=outputs/api-audit \
  capas serve --host 127.0.0.1 --port 8765
curl -H "Authorization: Bearer change-me" http://127.0.0.1:8765/health

The API is local and deterministic. It exposes GET /health, GET /decisions, POST /decide, POST /batch, and POST /provenance-check. CAPAS_API_TOKEN enables bearer-token auth; CAPAS_AUDIT_DIR writes workspace-scoped JSONL decision logs keyed by X-CAPAS-Workspace. It is not a hosted SaaS service.
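
A minimal client sketch using only the Python standard library is shown here. The endpoint paths, bearer token, and X-CAPAS-Workspace header come from the description above; the `build_request` helper is hypothetical, and the assumption that POST bodies are the same claim/evidence JSON accepted by `capas decide` is not confirmed by this page.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8765"  # host and port from the `capas serve` example


def build_request(path: str, token: str, workspace: Optional[str] = None,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated request to the local API."""
    headers = {"Authorization": f"Bearer {token}"}
    if workspace is not None:
        headers["X-CAPAS-Workspace"] = workspace
    data = None
    if body is not None:
        # Assumption: POST endpoints accept a JSON document as the body.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


health = build_request("/health", token="change-me", workspace="demo")
# To send it against a running server:
#   with urllib.request.urlopen(health, timeout=5) as resp:
#       print(resp.status, resp.read().decode("utf-8"))
```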

The canonical payload schema is published in the static docs tree:

https://fomv9354lve.github.io/capas-inteligentes/schema/v3/capas_claim_payload.schema.json

Enterprise integration notes live in:

  • docs/ENTERPRISE_INTEGRATION_PACK.md
  • docs/PROVENANCE_REGISTRY_OPERATIONS.md
  • docs/SECURITY_COMPLIANCE_APPENDIX.md
  • docs/PILOT_ROI_BUSINESS_CASE.md
  • docs/PAPER_INGESTION_CONNECTORS.md
  • docs/UPSTREAM_GITHUB_ACTIONS_POLICY.md
  • docs/DAILY_SOTA_UPDATE.md

Run CAPAS in GitHub Actions:

jobs:
  claim-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - uses: ./.github/actions/capas-claim-gate
        with:
          mode: batch
          input: examples/external_claim_batch.json
          output: outputs/capas-action-report.json

The manual workflow .github/workflows/claim-gate.yml runs a batch smoke check through workflow_dispatch. Other workflows can reuse the composite action directly with mode: decide, mode: batch, or mode: pipeline.

Inspect a trace:

capas inspect trace_039

Decide an external claim/evidence JSON:

capas schema
capas check-input --input examples/external_claim_accept.json
capas decide --input examples/external_claim_accept.json
capas decide --input examples/external_claim_rewrite.json
capas decide --input examples/external_claim_hold.json

Run the standalone upstream MVP:

capas retrieve --input examples/standalone_pipeline_multisource.json
capas retrieve --input examples/standalone_pipeline_web_source.json --allow-web
capas extract --input examples/standalone_pipeline_accept.json
capas extract --input examples/standalone_pipeline_pdf_source.json
capas align --input examples/standalone_pipeline_semantic_hold.json
capas reason --input examples/standalone_pipeline_semantic_hold.json
capas pipeline --input examples/standalone_pipeline_semantic_hold.json
python3 benchmarks/verify_standalone_pipeline.py

This is not broad scientific reasoning. It is local explicit parsing plus a deterministic semantic-scope guard before the CAPAS gate.

extract also returns evidence_spans with source_id, line number, snippet, and parser for each extracted field, so a reviewer can audit where the evidence came from.
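
To make the span idea concrete, here is a sketch of explicit-only extraction that records where each value was found. The span keys mirror the ones named above (source_id, line number, snippet, parser); the `EvidenceSpan` class, the `key_value_line` parser name, and the "name = value" line format are assumptions, not the CAPAS extractor.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSpan:
    field: str
    value: str
    source_id: str
    line: int  # 1-based line number in the source text
    snippet: str
    parser: str


# Hypothetical parser: only explicit "name = value" or "name: value" lines count.
_KEY_VALUE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*[:=]\s*(\S.*?)\s*$")


def extract_spans(source_id: str, text: str, wanted: set[str]) -> list[EvidenceSpan]:
    """Collect one span per explicit line that names a wanted field."""
    spans = []
    for number, raw in enumerate(text.splitlines(), start=1):
        match = _KEY_VALUE.match(raw)
        if match and match.group(1) in wanted:
            spans.append(EvidenceSpan(match.group(1), match.group(2), source_id,
                                      number, raw.strip(), "key_value_line"))
    return spans


log = "solver finished\nabs_error = 1.2e-9\nobservable: energy\n"
spans = extract_spans("solver_log_1", log, {"abs_error", "observable"})
```

Nothing is inferred from free text: a field that is not stated explicitly simply produces no span.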

Local corpus retrieval accepts JSON, JSONL, text, Markdown, or a directory of those files and returns auditable snippets matched to claim terms and required evidence fields. Web retrieval is opt-in with --allow-web. Local PDF parsing is supported when the optional standalone dependency is installed:

python -m pip install -e ".[standalone]"

If web permission or PDF parser support is missing, CAPAS records that as an extraction note instead of silently inventing evidence.
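
The "record a note instead of inventing evidence" behavior might look roughly like the sketch below. The source shape (`id`, `kind`, `text`) and the `retrieve` function are hypothetical, and the sketch never fetches anything: a "web" source here simply carries pre-supplied text.

```python
def retrieve(sources: list[dict], terms: set[str], allow_web: bool = False) -> dict:
    """Return term-matched snippets plus notes for sources that were skipped."""
    snippets, notes = [], []
    lowered = {term.lower() for term in terms}
    for src in sources:
        if src["kind"] == "web" and not allow_web:
            # Missing permission is reported, never silently worked around.
            notes.append(f"{src['id']}: web retrieval not permitted; source skipped")
            continue
        for number, line in enumerate(src.get("text", "").splitlines(), start=1):
            if set(line.lower().split()) & lowered:
                snippets.append({"source_id": src["id"], "line": number,
                                 "snippet": line.strip()})
    return {"snippets": snippets, "notes": notes}


sources = [
    {"id": "notes.md", "kind": "text",
     "text": "Ground state energy converged.\nUnrelated line."},
    {"id": "paper-url", "kind": "web", "text": "energy reported"},
]
result = retrieve(sources, {"energy"})
```

With the default `allow_web=False`, the web source contributes a note and no snippets, so a downstream gate can see that evidence was unavailable rather than absent.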

The published MVP input contract is docs/schema/capas_claim_payload.schema.json. check-input validates the payload shape; decide then checks whether the supplied evidence licenses the claim, needs a rewrite, contradicts it, or must hold.

Generate the static local UI:

capas ui
python3 benchmarks/verify_claim_gate_ui.py

The UI exposes ACCEPT, REWRITE, REJECT, HOLD, and INVALID samples. Structurally invalid payloads are shown as HOLD with schema_errors, matching capas decide. It also supports batch evaluation for JSON arrays or objects with items / claims, displays schema v3 for audit trails, and includes a keyboard help modal with the main shortcuts and pipeline scope.
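
The accepted batch shapes (a JSON array, or an object with items / claims) can be normalized with a few lines. This is a sketch, not the UI's code; checking `items` before `claims` is an arbitrary choice made here.

```python
import json


def normalize_batch(document) -> list:
    """Accept a JSON array, or an object with an `items` or `claims` list."""
    if isinstance(document, list):
        return document
    if isinstance(document, dict):
        for key in ("items", "claims"):
            if isinstance(document.get(key), list):
                return document[key]
    raise ValueError("expected a JSON array or an object with 'items' or 'claims'")


batch = normalize_batch(json.loads('{"claims": [{"id": "c1"}, {"id": "c2"}]}'))
```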

Schema v3 supports 11 deterministic claim types, including causal mechanism, systematic review, evidence conflict, and multimodal evidence gates.

Prepare a non-mutating GitHub release plan:

python3 scripts/publish_github_release.py

Publishing a new release still requires a GitHub remote, valid gh auth, a pushed tag, a passing external CI run, and a GitHub release URL.

The public v0.1.1 release is available at:

https://github.com/fomv9354lve/capas-inteligentes/releases/tag/v0.1.1

Prepare/check external reviewer feedback:

python3 benchmarks/verify_external_user_validation.py

The feedback template is examples/external_reviewer_feedback_template.json. Completed external feedback belongs under outputs/external_validation/.

What It Produces

The corpus builder emits:

  • sealed JSON traces: benchmarks/gold_traces/*.json
  • W3C PROV-shaped exports: benchmarks/gold_traces/*.prov.json
  • RO-Crate metadata: benchmarks/ro_crates/*/ro-crate-metadata.json
  • audit table: audits/gold_trace_audit_template.csv

Evidence Fields

CAPAS adds domain evidence fields over standard provenance:

  • physical_evidence_level
  • verification_independence
  • witness_stack
  • reference_truth
  • evidenceStatus
  • abs_error
  • expected
  • value
  • observable
  • local_property_tests
  • universal_anchor
  • invariant_caught
  • claim_scope
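
Some of these fields are re-derivable from others. The sketch below assumes that `abs_error` is meant to equal |value − expected|, which this page does not state explicitly; the `abs_error_consistent` helper and its tolerances are hypothetical.

```python
import math


def abs_error_consistent(evidence: dict, rel_tol: float = 1e-9,
                         abs_tol: float = 1e-12) -> bool:
    """Re-derive abs_error from value and expected and compare to the declared one."""
    recomputed = abs(evidence["value"] - evidence["expected"])
    return math.isclose(recomputed, evidence["abs_error"],
                        rel_tol=rel_tol, abs_tol=abs_tol)


# Illustrative numbers only, not taken from any CAPAS trace.
ok = abs_error_consistent({"value": 1.5, "expected": 1.25, "abs_error": 0.25})
bad = abs_error_consistent({"value": 1.5, "expected": 1.25, "abs_error": 0.2})
```

A check like this lets a reviewer reject a declared error that does not follow from the reported numbers, without trusting the producer.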

The important distinction is that not all traces are training gold. Some traces exist to prove the format can honestly represent uncertainty, failure, or rejection.

Validate The Published Evidence Corpus

Use a Python environment that has the declared public corpus stack available. Check it first:

python3 scripts/check_reproducibility_env.py

The public repository ships the evidence traces and validators. Private/local scientific engines used during exploratory corpus generation are intentionally not packaged in the public release. The public command runs:

  1. RO-Crate validation
  2. CAPAS physical-evidence profile validation
  3. witness independence validation
  4. evidence claim validation
  5. universal anchor matrix validation
  6. audit summary

python3 scripts/build_evidence_corpus.py

coverage_ready=True is expected. fine_tune_ready=False is also expected until blind inference review is completed.

Validate RO-Crate Coverage

python3 benchmarks/validate_ro_crates.py

The validator checks that evidence is present where it should be and absent where it would be dishonest:

  • analytic success: present
  • cross-sim success: present
  • no-evidence success: none_declared
  • backend failure: not_applicable_failed
  • router rejection: not_applicable_rejected
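
The mapping above can be written as a lookup table and checked mechanically. The status strings are taken from the list; the trace-kind identifiers, the `id` / `kind` keys, and the `status_mismatches` helper are hypothetical spellings for this sketch, not the validator's actual schema.

```python
# Expected evidenceStatus per kind of trace (kind names are illustrative).
EXPECTED_STATUS = {
    "analytic_success": "present",
    "cross_sim_success": "present",
    "no_evidence_success": "none_declared",
    "backend_failure": "not_applicable_failed",
    "router_rejection": "not_applicable_rejected",
}


def status_mismatches(traces: list[dict]) -> list[str]:
    """Return ids of traces whose evidenceStatus differs from the expected one."""
    bad = []
    for trace in traces:
        expected = EXPECTED_STATUS.get(trace["kind"])
        if expected is None or trace.get("evidenceStatus") != expected:
            bad.append(trace["id"])
    return bad


traces = [
    {"id": "t1", "kind": "analytic_success", "evidenceStatus": "present"},
    {"id": "t2", "kind": "backend_failure", "evidenceStatus": "present"},
    {"id": "t3", "kind": "router_rejection", "evidenceStatus": "not_applicable_rejected"},
]
mismatched = status_mismatches(traces)
```

Here `t2` is flagged because a failed backend run that declares evidence as present would be exactly the dishonest case the validator is meant to catch.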

Validate CAPAS Physical Evidence Profile

python3 benchmarks/validate_capas_profile.py

This checks the local CAPAS profile over Workflow Run RO-Crate-style crates: WRROC profile URIs, ComputationalWorkflow, top-level CreateAction, parameter realization, capas:evidenceStatus, and capas:PhysicalEvidence semantics. This is a local profile validator, not official profile registration.
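
One of the simpler checks, that the crate's graph contains a ComputationalWorkflow and a CreateAction, can be sketched as follows. It relies only on the standard RO-Crate layout (an `@graph` list of entities with `@type`); the helper names are hypothetical and this is far narrower than what validate_capas_profile.py is described as checking.

```python
REQUIRED_TYPES = {"ComputationalWorkflow", "CreateAction"}


def types_in_graph(metadata: dict) -> set[str]:
    """Collect every @type value used by entities in the crate's @graph."""
    found = set()
    for entity in metadata.get("@graph", []):
        declared = entity.get("@type", [])
        # @type may be a single string or a list of strings.
        found.update([declared] if isinstance(declared, str) else declared)
    return found


def missing_types(metadata: dict) -> set[str]:
    """Return the required types that no entity in the graph declares."""
    return REQUIRED_TYPES - types_in_graph(metadata)


# Minimal illustrative crate; real crates carry many more entities.
crate = {
    "@graph": [
        {"@id": "./", "@type": "Dataset"},
        {"@id": "workflow.cwl",
         "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"]},
    ]
}
missing = missing_types(crate)  # the CreateAction entity is absent here
```

To run it on a real crate, load one of the ro-crate-metadata.json files with `json.load` and pass the resulting dict to `missing_types`.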

External RO-Crate Validation

python3 -m pip install -r requirements-validation.txt
python3 benchmarks/validate_ro_crates_external.py

This uses the ResearchObject rocrateValidator package and writes benchmarks/ro_crates/official_validation_report.json. Current crates validate as RO-Crates without warnings. CAPAS emits a recognized .cwl workflow descriptor for packaging and records Python as the implementation language for the costurero.

Positioning

See:

  • docs/PROJECT_DASHBOARD.md
  • docs/REPRODUCIBILITY.md
  • docs/SOTA_POSITIONING.md
  • docs/SOTA_DAILY_WATCH.md
  • docs/DAILY_SOTA_UPDATE.md
  • docs/WORKFLOW_RUN_RO_CRATE_ALIGNMENT.md
  • docs/profile/CAPAS_PHYSICAL_EVIDENCE_PROFILE.md
  • docs/FORMAL_BOUND_AXIS.md
  • docs/WITNESS_INDEPENDENCE_AXIS.md
  • docs/OPTIMIZATION_BRIDGE.md
  • docs/EXPERIMENTAL_EVIDENCE_AXIS.md
  • docs/UNIVERSAL_INVARIANT_ANCHORING.md
  • docs/METAMORPHIC_TESTING_POSITIONING.md
  • docs/SCIAGENTGYM_AUDIT.md
  • docs/QMB100_AUDIT.md
  • docs/VVUQ_QUANTUM_AUDIT.md
  • docs/COVERAGE.md
  • docs/EXTERNAL_MVP_LAUNCH_PLAN.md

Short defensible claim:

CAPAS is an evidence-typed claim gate over scientific-computation traces: it reads or packages provenance-aligned evidence and decides whether that evidence licenses ACCEPT, REWRITE, REJECT, or HOLD.

Download files

Source Distribution

capas_claim_gate-0.3.0.tar.gz (198.9 kB)

Uploaded Source

Built Distribution

capas_claim_gate-0.3.0-py3-none-any.whl (206.8 kB)

Uploaded Python 3

File details

Details for the file capas_claim_gate-0.3.0.tar.gz.

File metadata

  • Download URL: capas_claim_gate-0.3.0.tar.gz
  • Size: 198.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for capas_claim_gate-0.3.0.tar.gz
Algorithm    Hash digest
SHA256       94b1b5a3805e91125c5977b1a2c51f58e92a8c3854e827d5a59def14c7430b51
MD5          49e845ed86c871b528814a62a2bbf783
BLAKE2b-256  eeead3c400726248bdfefb4c20e063e03128b14501e5dca0d77011b036c22bfa
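
A downloaded file can be checked against a published SHA256 digest with the standard library. The helper names below are hypothetical; the path and digest passed in are whatever file you downloaded and the value listed for it.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), published_hex.strip().lower())


# Example call (assumes the sdist was downloaded to the working directory):
#   matches_published("capas_claim_gate-0.3.0.tar.gz", "<SHA256 digest listed for that file>")
```

Reading in chunks keeps memory use constant regardless of file size.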

Provenance

The following attestation bundles were made for capas_claim_gate-0.3.0.tar.gz:

Publisher: publish.yml on fomv9354lve/capas-inteligentes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file capas_claim_gate-0.3.0-py3-none-any.whl.

File hashes

Hashes for capas_claim_gate-0.3.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       cfc1e161eadee01fe7df47ffb90e2a4e9ec7244f8aceaa5d27f198bd9e5ddc17
MD5          26547a2da1f0951ef67de752d0ed4400
BLAKE2b-256  68ae489f48d17387e4aefe1e6308e55a50f0ee95f39feead477c4ddafaabfc32

Provenance

The following attestation bundles were made for capas_claim_gate-0.3.0-py3-none-any.whl:

Publisher: publish.yml on fomv9354lve/capas-inteligentes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
