Falsification-first reliability testing for AI systems: perturb inputs, preserve replayable evidence, diff reliability across model changes.
Project description
FalsifyAI
FalsifyAI produces replayable, inspectable evidence that AI systems behave reliably under realistic pressure.
Most evaluation tools produce metrics. FalsifyAI produces evidence — durable, structured artifacts that survive the run and let you support reliability claims about a model migration with preserved, inspectable proof.
Status: 0.2.0 — Phase 1 first wave shipped (inspect, history, paraphrase, canonical case study, automated PyPI publishing). Spec language and verdict semantics remain locked for the 0.x line.
pip install falsifyai
For the semantic_equivalence invariant (pulls PyTorch, ~1GB):
pip install "falsifyai[semantic]"
What kind of tool is this?
FalsifyAI is evidence infrastructure for reliability claims about stochastic systems — most immediately, LLMs.
Think of it the way you'd think of:
| Domain | Evidence infrastructure |
|---|---|
| Software supply chain | SBOM (CycloneDX, SPDX) — what's in this build, with provenance |
| Static analysis | SARIF — the structured record of what was scanned and found |
| Build provenance | Sigstore / in-toto — cryptographic attestations about what was built and by whom |
| Security events | Audit logs — preserved, inspectable, defensible after the fact |
| Stochastic-system reliability | FalsifyAI replay artifact — preserved, inspectable, defensible evidence that a model behaved reliably under realistic pressure |
The underlying pattern isn't new. Applying it to stochastic-system reliability is. FalsifyAI is the stochastic-systems analogue of an evidence layer you already know.
The novelty isn't that we preserve evidence — it's what we preserve: every perturbed input, every model output, every invariant judgment, the verdict, the materialized spec, and the identity that ties them together. The CLI compresses; the artifact preserves the receipts.
The core terms
Three definitions that anchor everything else in this document:
Stochastic software can produce meaningfully different outputs for equivalent requests due to probabilistic inference, retrieval variability, tool interactions, or adaptive behavior. LLMs are the most common case today; future AI systems will extend the category.
A reliability claim is a bounded statement about how a stochastic system behaves under specified perturbation pressure, judged by specified invariants. "This case is STABLE under typo_noise and casing" is a reliability claim. "This model is reliable" is not — it's unfalsifiable and unbounded.
Reliability evidence is the preserved, replayable proof supporting a reliability claim. Without evidence, claims are anecdotes. With evidence, claims become inspectable.
In one sentence: FalsifyAI is a tool for producing reliability evidence that supports bounded reliability claims about stochastic software. The replay artifact is the durable object; everything else exists to produce, interpret, or consume one.
The 5-minute proof
The investigation takes three commands. One terminal. Real models. Replayable session IDs at the end.
1. Define what good looks like
If you pip install'd FalsifyAI, the examples aren't on disk yet. Grab one:
curl -O https://raw.githubusercontent.com/ericckzhou/falsifyai/main/examples/model_migration.yaml
Or git clone https://github.com/ericckzhou/falsifyai for all four. Then open examples/model_migration.yaml:
falsify:
version: "1.0"
name: "Model migration regression test"
model:
provider: groq
model: llama-3.3-70b-versatile
run:
seed: 42
cases:
- id: factual_recall
input: { text: "What is the capital of France?" }
expected: { contains: ["Paris"] }
perturbations:
- { type: typo_noise, count: 3 }
- { type: casing }
invariants:
- { type: contains, values: ["Paris"] }
- id: structured_output
input: { text: 'Reply ONLY with a JSON object of the form {"capital": "<city>"}. What is the capital of Japan?' }
expected: { contains: ['"capital"', "Tokyo"] }
perturbations:
- { type: typo_noise, count: 2 }
- { type: casing }
invariants:
- { type: contains, values: ['"capital"', "Tokyo"] }
- id: extraction
input: { text: "Extract only the email addresses from this text: Contact alice@example.com or bob@example.com for details. The deadline is Friday." }
expected: { contains: ["alice@example.com", "bob@example.com"] }
perturbations:
- { type: typo_noise, count: 2 }
- { type: casing }
invariants:
- { type: contains, values: ["alice@example.com", "bob@example.com"] }
- id: policy_summary
input: { text: "Summarize this refund policy in one sentence: Customers can request a refund within 30 days if the item is unused and the receipt is provided." }
expected: { contains: ["30 days", "unused", "receipt"] }
perturbations:
- { type: typo_noise, count: 2 }
- { type: casing }
invariants:
- { type: contains, values: ["30 days", "unused", "receipt"] }
Four cases. One sanity anchor (factual recall) plus three production-shaped contracts: structured output, extraction, grounded policy summarization. The mix is deliberate — a migration regression then looks like a behavioral pattern across contract types, not a single anecdote.
2. Run against your baseline model
$ falsifyai run examples/model_migration.yaml
case: factual_recall verdict: STABLE confidence: 1.00 (CI: 1.00-1.00)
case: structured_output verdict: STABLE confidence: 1.00 (CI: 1.00-1.00)
case: extraction verdict: FRAGILE confidence: 0.00 (CI: 0.00-0.00) worst: typo_noise
case: policy_summary verdict: STABLE confidence: 1.00 (CI: 1.00-1.00)
=================================================================
Session 7e51299481d5420d9181e71ba0449348 -> .falsifyai/replays.db
4 cases, verdict FRAGILE, 1 FRAGILE, 0 CONSISTENTLY_WRONG, falsifiability 0.36
Exit code: 1 (FRAGILE). Three contracts hold under pressure; one (extraction) is already fragile on this baseline — typo noise on alice@example.com corrupts the token and the model drops the address. That's a known weakness, now preserved as evidence. Note the session id — that's your baseline evidence artifact. Commit it to your repo if you want it durable.
3. Switch to the new model. Run again.
Swap model: llama-3.3-70b-versatile for model: openai/gpt-oss-120b (OpenAI's open-weights model, also on Groq). Run again:
$ falsifyai run examples/model_migration.yaml
case: factual_recall verdict: STABLE confidence: 1.00 (CI: 1.00-1.00)
case: structured_output verdict: STABLE confidence: 1.00 (CI: 1.00-1.00)
case: extraction verdict: FRAGILE confidence: 0.00 (CI: 0.00-0.00) worst: typo_noise
case: policy_summary verdict: FRAGILE confidence: 0.00 (CI: 0.00-0.00) worst: typo_noise
=================================================================
Session 4332c0d246bc4b3e875392ecdf3b1780 -> .falsifyai/replays.db
4 cases, verdict FRAGILE, 2 FRAGILE, 0 CONSISTENTLY_WRONG, falsifiability 0.36
Exit code: 1. The new (larger, more recent) model has the same pre-existing extraction weakness — plus a new failure: policy_summary is now fragile under the same typo perturbation that left the baseline untouched. Same spec. Different model. A real, quietly-introduced regression.
4. Diff the two evidence artifacts
$ falsifyai diff 7e51299481d5420d9181e71ba0449348 4332c0d246bc4b3e875392ecdf3b1780
Diff: baseline 7e51299481d5420d9181e71ba0449348 -> candidate 4332c0d246bc4b3e875392ecdf3b1780
Store: .falsifyai/replays.db
=================================================================
case: policy_summary baseline: STABLE (1.00) candidate: FRAGILE (0.00) REGRESSED
=================================================================
1 regressed, 0 improved, 3 unchanged, 0 other, 0 added, 0 removed
Exit code: 5 (REGRESSION). Only the row that changed is shown. The pre-existing extraction fragility is compressed into the unchanged-count footer — that's not the news; the policy summary regression is.
One command. One verdict-class downgrade. One exit code your CI can gate on. One preserved evidence trail you can re-open six months from now and inspect.
5. Replay any past session
$ falsifyai replay --latest
Loaded session 4332c0d246bc4b3e875392ecdf3b1780 · created_at 2026-05-22T... from .falsifyai/replays.db
case: factual_recall verdict: STABLE confidence: 1.00 (CI: 1.00-1.00)
case: structured_output verdict: STABLE confidence: 1.00 (CI: 1.00-1.00)
case: extraction verdict: FRAGILE confidence: 0.00 (CI: 0.00-0.00) worst: typo_noise
case: policy_summary verdict: FRAGILE confidence: 0.00 (CI: 0.00-0.00) worst: typo_noise
=================================================================
4 cases, verdict FRAGILE, 2 FRAGILE, 0 CONSISTENTLY_WRONG, falsifiability 0.36
Replay is read-only. The verdict shown is the one assigned at run time — never re-resolved. The same evidence that triggered the regression alert is preserved indefinitely, even if the model is later deprecated, the API endpoint changes, or your spec evolves.
Without replay artifacts, this entire workflow is anecdotes. "The new model failed our eval on Tuesday" is unverifiable by Friday — the API may have changed, your harness may have been refactored, your colleague may want proof.
With replay artifacts, the workflow produces inspectable evidence. Re-open the artifact six months from now and the claim still stands on its own. That's the whole product. run → replay → diff is one falsification workflow that ends in a preserved, inspectable evidence artifact. Not three commands — one evidence workflow producing one durable record.
For a deeper walkthrough of these same sessions — including
history,inspect, the cross-model extraction finding, and the U+202F invisible-character regression — see Invisible character substitution.
What's in the evidence?
The replay artifact (one row in .falsifyai/replays.db, one row per session) preserves:
- Identity —
session_id(UUID),spec_hash(sha256 of source YAML),materialized_hash(sha256 of realized perturbations),created_at_iso, FalsifyAI version - The materialized spec — every realized perturbation string with its seed and lineage, so the inputs are exactly reproducible
- Every model output — original and perturbed, raw, no post-processing
- Every invariant judgment — which invariant ran on which output, pass/fail, evidence string
- The verdict — assigned at run time using a deterministic priority chain, never re-resolved on read
- Per-perturbation-family stability — stratified bootstrap CI per family, so the "worst case" is attributable
This is the evidence FalsifyAI exists to produce. The CLI compresses it into one row per case + a session summary; the artifact preserves the receipts.
Five concepts, one screen each
Perturbations generate small input variations a real user might produce. Three families ship: typo_noise (character-level mutations), casing_variant (UPPER / lower / Title), and paraphrase (LLM-generated semantic-preserving rewrites, validity-gated via embedding similarity). The first two test character-level robustness; paraphrase tests semantic robustness — an orthogonal pressure axis.
Invariants judge whether a perturbed output is still "the same answer" as the original. contains checks for required substrings; semantic_equivalence compares embedding cosine similarity to a threshold.
Verdicts compress evidence into one of five labels per case:
| Verdict | Meaning | Exit |
|---|---|---|
STABLE |
All perturbations passed the invariants | 0 |
FRAGILE |
Some perturbations failed; model drifts under pressure | 1 |
CONSISTENTLY_WRONG |
Every output (including baseline) violates the ground truth | 2 |
INSUFFICIENT |
Not enough evidence to decide (too few perturbations) | 4 |
INVALID_EVAL |
The evaluation itself is invalid or contradictory | 2 |
Verdicts use stratified bootstrap CI — each perturbation family is resampled independently, and the worst-case CI lower bound wins. A model that survives typos but breaks under casing reports the casing stability number, not an aggregated average that hides the failure. The verdict is a claim, and the artifact is what the claim rests on.
Replay artifacts are the system's promise that claims are inspectable evidence, not anecdotes. They preserve the full evidence trail per session as described above. The verdict shown on replay is the one assigned at run time — replay never re-resolves.
Diff compares two artifacts case-by-case. The regression criterion is a binary verdict-class downgrade — STABLE → FRAGILE, STABLE → CONSISTENTLY_WRONG, or FRAGILE → CONSISTENTLY_WRONG. A competent user can predict the exit code from the two verdicts; there are no hidden thresholds. That predictability is the whole point — see "Resolver predictability" below.
For the full evidence-system semantics — what guarantees the artifact makes, what the verdict means as a claim — see docs/EVIDENCE.md. For the full philosophy, see docs/ARCHITECTURE.md.
Resolver predictability
The verdict resolver is the epistemic authority of the framework — the thing that says "this case is FRAGILE". Every downstream claim (replay, diff, CI gate, migration decision) rests on it.
The architectural discipline: a competent user must be able to predict the resolver's output from the inputs. If a careful engineer reading the spec, the perturbations, the executions, and the invariant results can reasonably anticipate the verdict, the resolver is legible. If they can't, it's a black box — regardless of how technically correct its internals are.
This isn't just an aesthetic choice. It's what makes the evidence auditable. An opaque resolver produces unfalsifiable claims; a predictable one produces defensible claims. The discipline is in service of the evidence — it's why an auditor (or a future you) can trust what's in the artifact.
See docs/ARCHITECTURE.md for the full discussion and the architectural rules that protect predictability as the project grows.
What FalsifyAI is not
The category clarity above implies things FalsifyAI deliberately is not, and is not aspiring to become:
- Not a prompt optimization suite. No prompt tuning, no automated A/B over wordings. The spec is authored deliberately; the framework tests what's authored.
- Not a telemetry platform. No streaming, no production dashboards, no time-series. The artifact is per-run preserved evidence, not a continuous-monitoring data point.
- Not a generalized observability product. The CLI compresses; the artifact preserves. That's prioritized visibility, not less visibility — the headline tells you whether to look, the artifact tells you what to look at. There is no firehose drill-down.
- Not a workflow orchestrator. No DAG runner, no pipeline engine. The three commands (
run/replay/diff) are the entire surface. - Not an AI governance suite. Governance platforms consume reliability evidence; FalsifyAI produces it. Different layer.
These exclusions matter because they keep the surface compressible. Adding any of the above corrupts the discipline — evidence density requires evidence boundaries.
Architecture
Three layers, separated by design. The replay artifact is the central object; the other two layers exist to produce and interpret it.
flowchart LR
subgraph Generation["Evidence generation"]
Spec[Spec / YAML]
Mat[Materialize]
Exec[Execute]
Spec --> Mat --> Exec
end
subgraph Interpretation["Evidence interpretation"]
Inv[Invariants]
Res[Verdict resolver]
Ren[CLI render]
Inv --> Res --> Ren
end
subgraph Preservation["Evidence preservation — the product"]
Art[ReplayArtifact]
Store[ReplayStore]
Art --> Store
end
Exec --> Inv
Ren --> Art
Store -.->|replay/diff| Ren
ASCII fallback (for PyPI / mobile readers):
EVIDENCE GENERATION EVIDENCE INTERPRETATION EVIDENCE PRESERVATION
───────────────────── ─────────────────────── (the durable product)
spec.yaml invariants ─────────────────────
│ verdict resolver ReplayArtifact
▼ CLI render ReplayStore
materialize │ ▲
│ │ │
▼ ▼ │
execute ────────────────────────▶ judge ────────────▶ resolve ───────┘
│
┌── falsifyai run │
│── falsifyai replay │
│── falsifyai inspect│ (consumers read
│── falsifyai diff │ the artifact)
└── falsifyai history│
A future feature touches exactly one layer. Adaptive evidence collection is interpretation, not generation. A new perturbation family is generation, not interpretation. A new verdict shape is interpretation, not preservation. The separation is what keeps the resolver explainable as the project grows — see docs/ARCHITECTURE.md and the philosophy section of CONTRIBUTING.md.
CLI reference
Three subcommands, one workflow:
falsifyai run <spec.yaml> [--store-path PATH]
falsifyai replay <session_id> [--store-path PATH]
falsifyai replay --latest [--store-path PATH]
falsifyai inspect <session_id> [--case CASE_ID] [--full] [--store-path PATH]
falsifyai diff <baseline_id> <candidate_id> [--store-path PATH]
falsifyai history <case_id> [--limit N] [--store-path PATH]
| Exit code | Meaning |
|---|---|
| 0 | SUCCESS — session verdict STABLE |
| 1 | DEGRADED — session verdict FRAGILE |
| 2 | FAILURE — session verdict CONSISTENTLY_WRONG or INVALID_EVAL |
| 3 | ERROR — infrastructure failure (bad spec, missing credential, model call failure) |
| 4 | INSUFFICIENT — not enough evidence to decide |
| 5 | REGRESSION — falsifyai diff detected a verdict-class downgrade |
Default --store-path is .falsifyai/replays.db. Use :memory: for ephemeral runs (test-only; replay and diff need a persistent store).
CI integration
Ship the evidence with your PR, not just the pass/fail signal:
- name: Reliability regression gate
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
run: |
KNOWN_GOOD="${{ vars.FALSIFYAI_BASELINE_SESSION_ID }}"
falsifyai run eval.yaml
CANDIDATE=$(sqlite3 .falsifyai/replays.db \
"SELECT session_id FROM sessions ORDER BY created_at_iso DESC LIMIT 1;")
falsifyai diff "$KNOWN_GOOD" "$CANDIDATE"
# Exit 5 = regression; the job fails.
The KNOWN_GOOD variable is a session id you captured locally against the production model and committed as a repo / org variable. CI runs the eval against the candidate model and diffs — exit 5 (REGRESSION) fails the job. Zero thresholds to tune; the regression criterion is the verdict-class downgrade. The full evidence artifact is preserved in .falsifyai/replays.db and can be archived as a CI artifact for later inspection.
Examples
Four dogfooded specs, all verified in CI (tests/integration/test_examples.py):
| Example | Verdict | What it demonstrates |
|---|---|---|
examples/stable.yaml |
STABLE (exit 0) |
A sane model under perturbation; both perturbation families + both invariants. |
examples/fragile.yaml |
FRAGILE (exit 1) |
Model drift: baseline correct, perturbations wrong. |
examples/consistently_wrong.yaml |
CONSISTENTLY_WRONG (exit 2) |
Confident hallucination: same wrong answer under every perturbation. |
examples/model_migration.yaml |
regression (exit 5) | The launch wedge — run twice, diff, exit 5 if any case regressed. The 5-minute proof above uses this spec. |
Run any of them:
falsifyai run examples/stable.yaml
A real provider is required at runtime (OPENAI_API_KEY, GROQ_API_KEY, etc. — whichever your spec's provider: field points at). The dogfood tests in CI bypass real model calls by injecting MockAdapter through a test seam — see tests/integration/test_examples.py for the pattern.
Case studies
Worked tours of FalsifyAI's evidence infrastructure over real preserved artifacts. Each case study is itself a FalsifyAI artifact: a ReplayStore bundle plus prose that walks through what history, diff, inspect, and replay reveal when read against it.
| # | Title | What it demonstrates |
|---|---|---|
| 01 | Invisible character substitution | Cross-model contains-contract brittleness as a persistent class; a model-migration regression (U+202F substitution between "30" and "days") as the vivid instance. |
See docs/case-studies/ for the index, the bundled replay artifact (SHA256 in provenance README), and the framing convention case studies follow.
Writing your own spec
The shortest valid spec (tests/fixtures/specs/minimal.yaml):
falsify:
version: "1.0"
name: "minimal"
model:
provider: openai
model: gpt-4o-mini
run:
seed: 42
cases:
- id: hello
input:
text: "Say hi."
perturbations:
- type: typo_noise
invariants:
- type: contains
values: ["hi"]
The full spec schema (perturbation parameters, invariant types, verdict thresholds) is in plan.md §6. The spec language is locked for the 0.1.x line.
Local development
Requires Python 3.13+ and uv.
git clone https://github.com/ericckzhou/falsifyai
cd falsifyai
uv sync --extra dev
uv run pytest
Contributions follow the conventions in CONTRIBUTING.md. Architectural constraints (especially: resist resolver inflation) are non-negotiable; see that doc for the trust test any resolver-touching PR must pass.
Status and roadmap
0.2.0 (current release) — Phase 1 first wave. Adds:
- ✅
falsifyai inspect <session_id>— per-case deep-dive over preserved evidence. Surfaces every perturbed input, output, and invariant judgment.--case <case_id>expands one case;--fulldisables truncation. Pure consumer surface — the artifact already contained the data. - ✅
paraphraseperturbation family — LLM-generated semantic-preserving rewrites with embedding-similarity validity gating. Tests semantic robustness as an orthogonal pressure axis to the character-level families. Configurable per-spec (count,similarity_threshold,max_attempts, optionalmodeloverride). - ✅
falsifyai history <case_id>— temporal view of one case across saved sessions. Newest-first, one row per session, showing verdict + CI + worst family per row. Readscase.verdictfrom preserved artifacts; no aggregation, no trend inference, no reinterpretation. - ✅ Canonical case study — Invisible character substitution: cross-model
contains-contract brittleness as the thesis (history), Pair 3 model-migration regression as the vivid concrete proof (diff+inspect), over a bundled replay artifact you can re-open and reproduce verbatim. - ✅ Automated PyPI publishing via Trusted Publisher (OIDC) —
.github/workflows/publish.ymlfires on anyv*tag push: verifies version match, re-runs tests, builds, validates, publishes. No long-lived tokens in repo.
0.1.0 — Phase 0 MVP. Spec language, perturbation runtime, materializer, invariants, execution adapter, replay store, real verdict resolver (stratified bootstrap CI, CONSISTENTLY_WRONG, falsifiability scoring), and the three-command CLI (run + replay + diff).
Coming next — selected by evidence, not theoretical completeness:
diffsharpening —--strict,--show-trending, exit code 6 for low-falsifiability gates. Tightens the binary regression criterion for users who want finer CI control without compromising resolver predictability.- Artifact infrastructure track —
falsifyai verify <session_id>(integrity + provenance),falsifyai export --bundle(productize the case-study extraction pattern), and a persisted CLI-invocation field inReplayArtifact. Locked sequence; reassess after a second case study or real user pressure.
Each addition is evaluated against: does this preserve evidence density, resolver predictability, and the discipline that makes the artifact trustworthy? See docs/ARCHITECTURE.md, docs/EVIDENCE.md, and CONTRIBUTING.md for the discipline.
License
Apache 2.0 — see LICENSE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file falsifyai-0.2.0.tar.gz.
File metadata
- Download URL: falsifyai-0.2.0.tar.gz
- Upload date:
- Size: 326.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
06a869054a0a5aa2ca197cd05c1fa51aeca47d56f5c72b77f381b868071d6fdd
|
|
| MD5 |
cb89d54258cffda5d6b11bf4e92f23a4
|
|
| BLAKE2b-256 |
552383eb5e80ca57d84e0e4e6d49c508d4b22e74494287851d101a1041ad6061
|
Provenance
The following attestation bundles were made for falsifyai-0.2.0.tar.gz:
Publisher:
publish.yml on ericckzhou/falsifyai
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
falsifyai-0.2.0.tar.gz -
Subject digest:
06a869054a0a5aa2ca197cd05c1fa51aeca47d56f5c72b77f381b868071d6fdd - Sigstore transparency entry: 1610687661
- Sigstore integration time:
-
Permalink:
ericckzhou/falsifyai@e0084ea92eab148597784ae1783c3dee327d0a06 -
Branch / Tag:
refs/tags/v0.2.0 - Owner: https://github.com/ericckzhou
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@e0084ea92eab148597784ae1783c3dee327d0a06 -
Trigger Event:
push
-
Statement type:
File details
Details for the file falsifyai-0.2.0-py3-none-any.whl.
File metadata
- Download URL: falsifyai-0.2.0-py3-none-any.whl
- Upload date:
- Size: 79.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fa01d32a875435ed6c92d1660fc9a1ceb505a75197f773e420e0ff891b64ec56
|
|
| MD5 |
5198896efd28c7293e684631af4d5735
|
|
| BLAKE2b-256 |
9f8299c72706b7507336bd6dfd5d3303aeef1fe67fd6837af09fa96861fbbe8d
|
Provenance
The following attestation bundles were made for falsifyai-0.2.0-py3-none-any.whl:
Publisher:
publish.yml on ericckzhou/falsifyai
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
falsifyai-0.2.0-py3-none-any.whl -
Subject digest:
fa01d32a875435ed6c92d1660fc9a1ceb505a75197f773e420e0ff891b64ec56 - Sigstore transparency entry: 1610687746
- Sigstore integration time:
-
Permalink:
ericckzhou/falsifyai@e0084ea92eab148597784ae1783c3dee327d0a06 -
Branch / Tag:
refs/tags/v0.2.0 - Owner: https://github.com/ericckzhou
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@e0084ea92eab148597784ae1783c3dee327d0a06 -
Trigger Event:
push
-
Statement type: