SpanForge — AI lifecycle and governance platform (RFC-0001 SPANFORGE)

spanforge

The AI Compliance Platform for Agentic Systems.
Ship AI applications that are auditable, regulator-ready, and privacy-safe — from day one.

Built on RFC-0001 — the SpanForge AI Compliance Standard for agentic AI systems.

Python 3.9+ · 91% test coverage · 6541 tests · Version 1.0.1 · Zero dependencies · Free for personal, research, and open-source use

The problem

You're building AI applications in a world where regulators are catching up fast. The EU AI Act is in force. GDPR applies to every LLM that touches personal data. SOC 2 auditors want evidence that your AI systems are governed. And your team is stitching together ad-hoc logs, hoping they'll hold up in an audit.

spanforge solves this. It is a compliance-first platform — not a monitoring add-on — that gives every AI action in your stack a cryptographically signed, privacy-safe, regulator-ready record.

If you're a solo developer or early-stage startup

You might think compliance is a later problem — something to worry about when you have a legal team. Here's why it isn't:

You'll hit it sooner than you think. The first B2B customer, the first SaaS sign-up from an EU user, the first healthcare or fintech pilot — they'll ask "how do you govern your AI?" If you have no answer, you lose the deal.
Retrofitting is expensive. Adding audit trails, PII scrubbing, and signed evidence chains to an existing system takes weeks. Adding them with spanforge from day one takes minutes.
It's zero-cost to start. The entire SDK is free for noncommercial use, zero dependencies, and works in-memory with no infrastructure. You don't pay anything until you need hosted storage.
It de-risks you personally. GDPR fines apply to individuals running services, not just corporations. PII redaction and tamper-proof logs are your protection too.

In short: spanforge is the logging import you should have added on day one — except it also signs your audit trail and maps it to the regulations that will eventually matter to you.

pip install spanforge  # free for noncommercial use, zero deps

import spanforge
spanforge.configure()  # that's it — you're now compliant-by-default

What spanforge does

Compliance & Regulatory Mapping

Map telemetry to EU AI Act, GDPR, SOC 2, HIPAA, ISO 42001, NIST AI RMF clauses automatically
Generate HMAC-signed evidence packages with gap analysis
Track consent boundaries, HITL oversight, model registry governance, and explainability coverage
Produce audit-ready attestations with model owner, risk tier, and status metadata
Compliance Evidence Chain (sf-cec) — signed ZIP bundles with regulatory clause maps, DPA generation, and RFC 3161 timestamps for auditor hand-off; spanforge audit cec generate CLI generates CEC bundles without Python code
Human-in-the-Loop Workflow Engine (spanforge.workflow) — approval workflows for gate reviews, policy sign-offs, and escalations; full state machine (PENDING → APPROVED / REJECTED → CLOSED) with SLA auto-escalation and role-based action matrix
Observability SDK (sf-observe) — span export (OTLP/Datadog/Grafana/Splunk/Elastic), W3C TraceContext, OTel GenAI attrs, sampling strategies, annotation store, and health probes
CI/CD Gate Pipeline (sf-gate) — evaluate release quality gates (schema, secrets, performance, PRRI, trust), YAML pipeline engine, artifact store, and blocking trust gate to prevent unsafe releases
T.R.U.S.T. Scorecard (sf-trust) — five-pillar trust dimensions (Transparency · Reliability · UserTrust · Security · Traceability), configurable weights, SVG badge, history time-series, and 5 HallucCheck pipeline integrations (score, bias, monitor, risk, benchmark)
Enterprise Hardening (sf-enterprise) — multi-tenancy with project-level isolation, data residency enforcement (EU/US/AP/IN), AES-256-GCM encryption at rest, envelope encryption via cloud KMS, mTLS, FIPS 140-2 mode, air-gap offline mode, and container health endpoints
Security Review (sf-security) — OWASP API Security Top 10 audit, STRIDE threat modelling, dependency vulnerability scanning, static analysis, and secrets-in-logs detection

Privacy & Audit Infrastructure

Secrets scanning — 20-pattern registry detects API keys, tokens, private keys; SARIF output; pre-commit hook
PII redaction — detect and strip sensitive data before it leaves your app. Includes a Presidio NLP backend (`spanforge[presidio]`) covering 15 entity types (SSN, email, phone, AADHAAR, PAN, UK NI, credit card, IBAN, and more) with ≥ 95% true-positive rate and < 0.5% false-positive rate verified at GA
HMAC audit chains — tamper-evident, blockchain-style event signing
Audit SDK (sf-audit) — sf_audit.append(), schema key registry, T.R.U.S.T. scorecard, GDPR Article 30 RoPA, BYOS cloud routing
GDPR subject erasure — right-to-erasure with tombstone events that preserve chain integrity
Air-gapped deployment — runs fully offline with zero egress

Governance & Controls

Consent boundary monitoring — consent.granted, consent.revoked, consent.violation events
Human-in-the-loop hooks — hitl.queued, hitl.reviewed, hitl.escalated, hitl.timeout events
Model registry — register, deprecate, retire models; attestations auto-warn on ungoverned models
Explainability tracking — measure what % of AI decisions have explanations attached

Developer Experience

Zero required dependencies — pure Python 3.9+ stdlib
One-line setup — spanforge.configure() and you're compliant
Integration config — .halluccheck.toml config block, service registry, local fallbacks for all 11 services
T.R.U.S.T. Scorecard (sf-trust) — five-pillar trust assessment (Transparency, Reliability, UserTrust, Security, Traceability), SVG badge generation, HallucCheck pipeline integrations
Mock library — spanforge.testing_mocks — 11 drop-in mock service clients + mock_all_services() context manager for zero-network unit tests
Sandbox mode — [spanforge] sandbox = true routes all service calls to local in-memory sandbox
spanforge doctor — environment diagnostics: config valid, services reachable, patterns loaded, gate YAML valid
Auto-instrumentation — patch OpenAI, Anthropic, LangChain, CrewAI, and more; @trace_rag decorator and automatic LlamaIndex/LangChain retriever instrumentation for zero-change RAG tracing
Async SDK — every major SDK method now has a non-blocking *_async() variant (scan_async, evaluate_async, build_bundle_async, get_scorecard_async, sso_delegate_session_async) for seamless use in async frameworks
User feedback REST endpoint — POST /v1/feedback accepts star/thumbs/Likert ratings and free-text comments (SHA-256 hashed); links to T.R.U.S.T. dimensions
38 CLI commands — compliance checks, PII scans, secrets scanning, audit-chain verification, event generation, audit log extraction, CEC bundle generation, gap detection, gate policy audit, CI/CD gate pipelines, trust scorecards, config validation, enterprise health, security scanning, doctor diagnostics, all CI-ready

How it compares

spanforge is the only open-standard, zero-dependency AI compliance platform. Other tools are monitoring platforms that bolt on compliance as an afterthought. spanforge is compliance infrastructure that happens to capture the telemetry needed to prove it.

Capability	spanforge	LangSmith	Langfuse	OpenLLMetry	Arize Phoenix
Regulatory framework mapping (EU AI Act, GDPR, SOC 2…)	✅	❌	❌	❌	❌
HMAC-signed evidence packages & attestations	✅	❌	❌	❌	❌
Consent boundary monitoring	✅	❌	❌	❌	❌
Human-in-the-loop compliance events	✅	❌	❌	❌	❌
Model registry with risk-tier governance	✅	❌	❌	❌	❌
Explainability coverage metrics	✅	❌	❌	❌	❌
Built-in PII redaction	✅	❌	❌	❌	❌
Tamper-proof audit chain	✅	❌	❌	❌	❌
GDPR subject erasure (right-to-erasure)	✅	❌	❌	❌	❌
Works fully offline / air-gapped	✅	❌	Self-host	Partial	Self-host
Open schema standard (RFC-driven)	✅	❌	❌	Partial	❌
Zero required dependencies	✅	❌	❌	❌	❌
OTLP export (any OTel backend)	✅	❌	✅	✅	✅
Source-available, no call-home	✅	Partial	✅	✅	✅
CI/CD release quality gates (schema, secrets, PRRI, trust gate)	✅	❌	❌	❌	❌

Bottom line: Others help you watch your AI. spanforge helps you govern it.

Install

pip install spanforge

Requires Python 3.9+. Zero mandatory dependencies.

CI also checks the documented CLI entrypoints, version consistency, and known stale documentation patterns, so drift in the public docs fails fast during PR validation.

Optional extras

pip install "spanforge[openai]"       # OpenAI auto-instrumentation
pip install "spanforge[langchain]"    # LangChain callback handler
pip install "spanforge[crewai]"       # CrewAI callback handler
pip install "spanforge[http]"         # Webhook + OTLP export
pip install "spanforge[datadog]"      # Datadog APM + metrics
pip install "spanforge[kafka]"        # Kafka EventStream source
pip install "spanforge[pydantic]"     # Pydantic v2 model layer
pip install "spanforge[otel]"         # OpenTelemetry SDK integration
pip install "spanforge[jsonschema]"   # Strict JSON Schema validation
pip install "spanforge[llamaindex]"   # LlamaIndex event handler
pip install "spanforge[gemini]"       # Google Gemini auto-instrumentation
pip install "spanforge[bedrock]"      # AWS Bedrock Converse API
pip install "spanforge[presidio]"     # Presidio-powered PII detection
pip install "spanforge[all]"          # everything above

Runtime Governance GA Surface

The GA implementation spine is the runtime-governance control plane:

sf_explain for signed runtime explanations — now with ExplainModelType classification (LLM, RAG, MULTI_AGENT, CLASSIFIER, EMBEDDING), configurable retry, and fail-safe emit
sf_scope for agent capability enforcement — now with circuit-breaker fail-secure mode and ACTION_CATEGORIES dictionary
sf_rbac for role enforcement on sensitive actions — now with STANDARD_ROLE_MATRIX (10 canonical actor types), YAML manifest loading, and JWT claim extraction
sf_rag for grounding evidence and thresholds
sf_lineage for provenance capture
sf_policy for policy activation, replay, simulation, and review
sf_operator for trace inspection and signed operator exports
sf_enterprise for deployment posture and enterprise evidence packaging

Start here if you want the end-to-end story instead of the full product surface:

Quick start — compliance in 5 minutes

1. Configure and instrument

import spanforge

spanforge.configure(
    service_name="my-agent",
    signing_key="your-org-secret",      # HMAC audit chain — tamper-proof
    redaction_policy="gdpr",            # PII stripped before export
    exporter="jsonl",
    endpoint="audit.jsonl",
)

Every event your app emits is now signed, PII-redacted, and stored — with zero per-call boilerplate.

2. Trace AI decisions

with spanforge.start_trace("loan-approval-agent") as trace:
    with trace.llm_call("gpt-4o", temperature=0.2) as span:
        decision = call_llm(prompt)
        span.set_token_usage(input=512, output=200, total=712)
        span.set_status("ok")

3. Generate compliance evidence

from spanforge.core.compliance_mapping import ComplianceMappingEngine

engine = ComplianceMappingEngine()
package = engine.generate_evidence_package(
    model_id="gpt-4o",
    framework="eu_ai_act",
    from_date="2026-01-01",
    to_date="2026-03-31",
    audit_events=events,
)

print(package.attestation.coverage_pct)            # e.g. 87.5%
print(package.attestation.explanation_coverage_pct) # e.g. 75.0%
print(package.attestation.model_risk_tier)          # e.g. "high"
print(package.gap_report)                           # what's missing

Or from the CLI:

spanforge compliance generate \
  --model-id gpt-4o \
  --framework eu_ai_act \
  --from 2026-01-01 --to 2026-03-31 \
  --events-file audit.jsonl

4. Hand to your auditor

The evidence package contains:

Clause mappings — which telemetry events satisfy which regulatory clauses
Gap analysis — which clauses lack evidence and need attention
HMAC-signed attestation — cryptographic proof the evidence hasn't been tampered with
Model governance metadata — owner, risk tier, status, warnings for deprecated/retired models
Explanation coverage — percentage of AI decisions with explainability records

5. Package for auditors with sf-cec (v2.0.4+)

Bundle your audit records into a regulator-ready, HMAC-signed ZIP:

from spanforge.sdk import sf_cec

# Build a compliance evidence bundle for Q1 2026
result = sf_cec.build_bundle(
    project_id="my-agent",
    date_range=("2026-01-01", "2026-03-31"),
    frameworks=["eu_ai_act", "iso_42001", "soc2"],
)

print(result.bundle_id)       # sfcec_my-agent_20260401T000000Z_abc123
print(result.zip_path)        # /tmp/sfcec/halluccheck_cec_my-agent_2026-01-01_2026-03-31.zip
print(result.hmac_manifest)   # hmac-sha256:a3f9…
print(result.record_counts)   # {"halluccheck.score.v1": 214, "halluccheck.bias.v1": 87, …}

# Verify bundle integrity before sharing
verify = sf_cec.verify_bundle(result.zip_path)
assert verify.overall_valid

# Generate a GDPR Art. 28 Data Processing Agreement
dpa = sf_cec.generate_dpa(
    project_id="my-agent",
    controller_details={"name": "Acme Corp", "contact": "dpo@acme.com"},
    processor_details={"name": "ML Platform Team"},
)
print(dpa.document_id)  # sfcec-dpa-my-agent-20260401

The ZIP bundle contains:

manifest.json — record inventory with HMAC-SHA256 signature
clause_map.json — per-framework clause satisfaction (SATISFIED / PARTIAL / GAP)
chain_proof.json — audit chain verification result
attestation.json — HMAC-signed attestation metadata
rfc3161_timestamp.tsr — trusted timestamp stub (RFC 3161)
score_records/, bias_reports/, prri_records/, drift_events/, pii_detections/, gate_evaluations/ — NDJSON evidence per schema key
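To make the idea of a signed record inventory concrete, here is a minimal standard-library sketch. It is not the sf-cec implementation: the function names, the manifest layout, and the choice to sign a sorted JSON inventory of SHA-256 digests are all assumptions made for illustration. Only the `hmac-sha256:` prefix is taken from the example output above.

```python
import hashlib
import hmac
import json


def build_manifest(files, key):
    """Hash every bundle member, then sign the sorted inventory with HMAC-SHA256."""
    inventory = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    payload = json.dumps(inventory, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"files": inventory, "signature": f"hmac-sha256:{signature}"}


def verify_manifest(files, manifest, key):
    """Recompute the manifest from the files on hand and compare in constant time."""
    expected = build_manifest(files, key)["signature"]
    return hmac.compare_digest(expected, manifest["signature"])


key = b"demo-signing-key"  # placeholder secret, for illustration only
files = {
    "clause_map.json": b'{"eu_ai_act": {"Art. 13": "SATISFIED"}}',
    "chain_proof.json": b'{"valid": true}',
}
manifest = build_manifest(files, key)

# Changing a single member invalidates the signature.
tampered = dict(files, **{"chain_proof.json": b'{"valid": false}'})
print(verify_manifest(files, manifest, key))     # True
print(verify_manifest(tampered, manifest, key))  # False
```

Signing the inventory of digests rather than the raw ZIP means an auditor can check individual members without needing byte-identical archive metadata.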

6. Observe spans with sf-observe (v2.0.5+)

Export spans to any OTLP-compatible backend, emit structured annotations, and trace LLM calls with OTel GenAI semantic conventions:

from spanforge.sdk import sf_observe

# Emit a span for an LLM call — W3C traceparent + OTel GenAI attrs added automatically
span_id = sf_observe.emit_span(
    "chat.completion",
    {
        "gen_ai.system": "openai",
        "gen_ai.request.model": "gpt-4o",
        "gen_ai.usage.input_tokens": 512,
        "gen_ai.usage.output_tokens": 64,
    },
)
print(span_id)  # "a3f1b2c4d5e6f708"

# Mark a model deployment
annotation_id = sf_observe.add_annotation(
    "model_deployed",
    {"model": "gpt-4o", "environment": "production"},
    project_id="my-agent",
)

# Health probe
print(sf_observe.healthy)         # True
print(sf_observe.last_export_at)  # ISO-8601 or None

# Export to any OTLP endpoint per-call
from spanforge.sdk import ReceiverConfig
result = sf_observe.export_spans(
    my_spans,
    receiver_config=ReceiverConfig(
        endpoint="https://otel.collector.example.com/v1/traces",
        headers={"Authorization": "Bearer tok"},
    ),
)
print(result.exported_count, result.backend)
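For readers unfamiliar with the W3C `traceparent` header mentioned in the comment above, the following sketch shows its version-00 layout: a 32-hex-digit trace ID, a 16-hex-digit span ID (the same width as the `span_id` printed above), and a flags byte. The helper names are hypothetical and this is not how sf_observe builds its headers internally.

```python
import re
import secrets

# Version 00 layout: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    """Build a fresh traceparent value with random trace and span IDs."""
    trace_id = secrets.token_hex(16)  # 16 bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 bytes  -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Split a traceparent value into its fields, rejecting malformed input."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        raise ValueError(f"not a version-00 traceparent: {header!r}")
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        raise ValueError("all-zero IDs are invalid")
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


parsed = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(parsed["span_id"])  # 00f067aa0ba902b7
print(parsed["sampled"])  # True
```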

Select backend and sampler via environment:

export SPANFORGE_OBSERVE_BACKEND=otlp          # otlp | datadog | grafana | splunk | elastic | local
export SPANFORGE_OBSERVE_SAMPLER=trace_id_ratio
export SPANFORGE_OBSERVE_SAMPLE_RATE=0.25
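A `trace_id_ratio` sampler keeps a fixed fraction of traces by making a deterministic decision from the trace ID, so every span in a trace gets the same answer. The sketch below follows the approach used by the OpenTelemetry Python SDK (compare the low 64 bits of the trace ID against a bound derived from the rate). Whether SpanForge's sampler uses the same arithmetic is an assumption, and `should_sample` is a hypothetical name.

```python
def should_sample(trace_id_hex, rate):
    """Return True if the trace should be kept at the given sampling rate."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    bound = round(rate * (1 << 64))                     # rate 0.25 -> 2**62
    low64 = int(trace_id_hex, 16) & ((1 << 64) - 1)     # low 64 bits of the ID
    return low64 < bound


# The decision depends only on the trace ID, so it is stable across services.
print(should_sample("0" * 31 + "1", 0.25))  # True  (low bits are tiny)
print(should_sample("f" * 32, 0.25))        # False (low bits are maximal)
print(should_sample("f" * 32, 1.0))         # True  (rate 1.0 keeps everything)
```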

7. Route alerts with sf-alert (v2.0.6+)

Publish topic-based alerts to Slack, PagerDuty, OpsGenie, Teams, SMS, and custom webhooks — with built-in deduplication, escalation policy, and maintenance-window suppression:

from spanforge.sdk import sf_alert

# Publish a CRITICAL drift alert
result = sf_alert.publish(
    "halluccheck.drift.red",
    {"model": "gpt-4o", "drift_score": 0.91},
    severity="critical",
    project_id="my-agent",
)
print(result.alert_id)    # UUID4
print(result.suppressed)  # True if deduplicated / maintenance window

# Acknowledge to cancel the 15-minute escalation timer
sf_alert.acknowledge(result.alert_id)

# Register a custom topic
sf_alert.register_topic(
    "myapp.pipeline.failed",
    "ML pipeline execution failure",
    "high",
    runbook_url="https://runbooks.example.com/pipeline",
)

Configure sinks via environment variables (zero code required):

export SPANFORGE_ALERT_TEAMS_WEBHOOK=https://xxx.webhook.office.com/...
export SPANFORGE_ALERT_OPSGENIE_KEY=og-key-...
export SPANFORGE_ALERT_DEDUP_SECONDS=300
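The deduplication window set by `SPANFORGE_ALERT_DEDUP_SECONDS` can be pictured as a per-topic timestamp check. The class below is an illustrative sketch only: the `AlertDeduplicator` name, the choice of `(topic, project_id)` as the dedup key, and the injectable clock are assumptions, not the sf_alert internals.

```python
import time


class AlertDeduplicator:
    """Suppress repeat alerts for the same key inside a sliding time window."""

    def __init__(self, window_seconds=300.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._last_sent = {}  # (topic, project_id) -> time of last delivered alert

    def should_suppress(self, topic, project_id):
        key = (topic, project_id)
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._window:
            return True           # still inside the window: drop the duplicate
        self._last_sent[key] = now
        return False              # first alert, or the window has elapsed


# A fake clock makes the behaviour easy to follow without sleeping.
fake_now = [0.0]
dedup = AlertDeduplicator(window_seconds=300, clock=lambda: fake_now[0])

first = dedup.should_suppress("halluccheck.drift.red", "my-agent")
fake_now[0] = 120.0
second = dedup.should_suppress("halluccheck.drift.red", "my-agent")
fake_now[0] = 301.0
third = dedup.should_suppress("halluccheck.drift.red", "my-agent")
print(first, second, third)  # False True False
```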

8. Enforce release gates with sf-gate (v2.0.7+)

Run YAML-declared quality gates before every release. Block on schema violations, secrets leaks, performance regressions, unsafe PRRI scores, and trust failures — all in a single pipeline command:

from spanforge.sdk import sf_gate

# Run a full YAML gate pipeline — blocks on any FAIL gate
result = sf_gate.run_pipeline("gates/ci-pipeline.yaml")
for g in result.gate_results:
    print(f"[{g.verdict.value}] {g.gate_id}")  # e.g. [PASS] schema-validation

# Evaluate a single gate programmatically
verdict = sf_gate.evaluate("schema-validation", event.to_dict())
print(verdict.verdict)   # GateVerdict.PASS

# Standalone PRRI evaluation
prri = sf_gate.evaluate_prri(prri_score=28.5)
print(prri.verdict)      # PRRIVerdict.GREEN

# Composite trust gate — checks HRI rate, PII, and secrets windows
trust = sf_gate.get_status()
print(trust.healthy)     # True if all thresholds are within bounds

Or from CI directly:

# Runs the pipeline, exits 1 if any blocking gate fails
spanforge gate run gates/ci-pipeline.yaml

# Enforce the composite trust gate as a deployment prerequisite
spanforge gate trust-gate --project-id my-agent

A minimal ci-pipeline.yaml:

version: "1.0"
gates:
  - id: schema-validation
    type: schema_validation
    on_fail: block
  - id: secrets-scan
    type: secrets_scan
    on_fail: block
  - id: prri-check
    type: halluccheck_prri
    params:
      red_threshold: 65
    on_fail: block
  - id: trust-gate
    type: halluccheck_trust
    on_fail: block

9. Unified config & local fallback (v2.0.8+)

Bootstrap all 8 services from a single .halluccheck.toml config block. When a remote service is unreachable, the SDK automatically falls back to a local-mode equivalent — no code changes required:

# .halluccheck.toml
[spanforge]
enabled    = true
project_id = "my-agent"
endpoint   = "https://api.spanforge.example.com"

[spanforge.services]
sf_pii     = true
sf_secrets = true
sf_audit   = true
sf_observe = true

[spanforge.local_fallback]
enabled     = true
max_retries = 3
timeout_ms  = 2000

from spanforge.sdk import load_config_file, validate_config

# Parse, validate, and apply env-var overrides in one call
config = load_config_file()                # auto-discovers .halluccheck.toml
errors = validate_config(config)           # [] when valid
print(config.services.sf_pii)             # True
print(config.local_fallback.timeout_ms)   # 2000

Validate from the CLI:

spanforge config validate                          # auto-discover
spanforge config validate --file .halluccheck.toml # explicit path

When a service is down, fallback activates automatically:

from spanforge.sdk import pii_fallback, secrets_fallback, audit_fallback

# Local regex PII scan (no remote service required)
result = pii_fallback("Contact alice@example.com")
print(result["entities"])  # [{"type": "EMAIL", ...}]

# Local secrets scan
result = secrets_fallback("AKIA1234567890ABCDEF")
print(result["clean"])     # False

# Local HMAC-chained JSONL audit
audit_fallback(
    {"score": 0.92, "model": "gpt-4o"},
    schema_key="halluccheck.score.v1",
)
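The retry-then-fall-back behaviour configured by `max_retries` can be sketched in a few lines. This is an illustration of the pattern, not the SDK's code: `call_with_fallback`, `local_pii_scan`, and the email regex are hypothetical, and treating `max_retries` as the total number of remote attempts is an assumption.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def local_pii_scan(text):
    """Very small regex-only stand-in for a local PII scan."""
    entities = [{"type": "EMAIL", "value": m} for m in EMAIL_RE.findall(text)]
    return {"clean": not entities, "entities": entities}


def call_with_fallback(remote, local, *args, max_retries=3):
    """Try the remote callable up to max_retries times, then use the local one."""
    for _ in range(max_retries):
        try:
            return remote(*args), "remote"
        except (ConnectionError, TimeoutError):
            continue
    return local(*args), "local"


attempts = []


def unreachable_remote(text):
    """Simulates a remote service that is down."""
    attempts.append(text)
    raise ConnectionError("service down")


result, source = call_with_fallback(
    unreachable_remote, local_pii_scan, "Contact alice@example.com"
)
print(source, len(attempts))         # local 3
print(result["entities"][0]["type"])  # EMAIL
```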

The ServiceRegistry tracks health for all services and re-checks every 60 s:

from spanforge.sdk import ServiceRegistry

reg = ServiceRegistry.get_instance()
status = reg.status_response()
# {"sf_pii": {"status": "up", "latency_ms": 45, "last_checked_at": "..."}, ...}

10. T.R.U.S.T. Scorecard & HallucCheck pipelines (v2.0.9+)

The T.R.U.S.T. scorecard aggregates five trust dimensions into a single weighted score with colour-band verdicts. Each pillar maps to existing audit telemetry:

Pillar	What it measures	Source
Transparency	Gate pass rate	`sf_gate` evaluations
Reliability	Hallucination rate	`halluccheck.score.v1` records
UserTrust	Bias disparity	`halluccheck.bias.v1` records
Security	PII + secrets hygiene	`sf_pii` / `sf_secrets` scans
Traceability	Compliance posture	Attestation coverage

Colour bands: green ≥ 80, amber ≥ 60, red < 60.

from spanforge.sdk import sf_trust

# Full scorecard with all five dimensions
scorecard = sf_trust.get_scorecard(project_id="my-agent")
print(scorecard.overall_score)   # 82.5
print(scorecard.colour_band)     # "green"
print(scorecard.reliability)     # TrustDimension(score=90.0, trend="up", ...)

# SVG badge for dashboards / README shields
badge = sf_trust.get_badge(project_id="my-agent")
with open("trust-badge.svg", "w") as f:
    f.write(badge.svg)

# Historical time-series (10 buckets)
history = sf_trust.get_history(project_id="my-agent", buckets=10)
for entry in history:
    print(entry.timestamp, entry.overall)
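As a worked example of the aggregation, the sketch below combines five pillar scores into a weighted average and applies the colour bands stated above (green ≥ 80, amber ≥ 60, red < 60). The function names, the pillar keys, and the equal default weights are assumptions; the actual weighting used by sf_trust is configurable and may differ.

```python
PILLARS = ("transparency", "reliability", "user_trust", "security", "traceability")


def overall_score(scores, weights=None):
    """Weighted average of the five pillar scores (equal weights by default)."""
    weights = weights or {p: 1.0 for p in PILLARS}
    total_weight = sum(weights[p] for p in PILLARS)
    return sum(scores[p] * weights[p] for p in PILLARS) / total_weight


def colour_band(score):
    """Map a 0-100 score to the documented colour bands."""
    if score >= 80:
        return "green"
    if score >= 60:
        return "amber"
    return "red"


scores = {
    "transparency": 80.0,
    "reliability": 90.0,
    "user_trust": 75.0,
    "security": 85.0,
    "traceability": 82.5,
}
score = overall_score(scores)
print(score, colour_band(score))  # 82.5 green
```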

Five HallucCheck pipeline integrations orchestrate cross-service workflows:

from spanforge.sdk.pipelines import (
    score_pipeline,
    bias_pipeline,
    monitor_pipeline,
    risk_pipeline,
    benchmark_pipeline,
)

# Score pipeline: PII scan → secrets scan → observe span → audit append
result = score_pipeline("The model output to check", model="gpt-4o")
print(result.audit_id, result.details)

# Risk pipeline: PRRI evaluation → alert if RED → gate block → CEC bundle
result = risk_pipeline(prri_score=75.0, project_id="my-agent")
print(result.details["verdict"])  # "RED"

From the CLI:

# T.R.U.S.T. scorecard (text table)
spanforge trust scorecard --project-id my-agent

# SVG badge to stdout
spanforge trust badge --project-id my-agent > trust.svg

# Composite trust gate (exit 1 = trust below threshold)
spanforge trust gate --project-id my-agent

11. Test with zero-network mocks (v2.0.11+)

Drop-in mock service clients for every SpanForge SDK service — no network, no configuration, no side-effects:

from spanforge.testing_mocks import mock_all_services

with mock_all_services():
    from spanforge.sdk import sf_pii, sf_audit, sf_gate

    # All calls are local, recorded, and return sensible defaults
    result = sf_pii.scan_text("Contact alice@example.com")
    assert result.clean  # mock returns clean=True by default

    sf_audit.append({"score": 0.92}, schema_key="halluccheck.score.v1")
    assert len(sf_audit.calls) == 1  # inspect recorded calls

    prri = sf_gate.evaluate_prri(prri_score=28.5)
    assert prri.allow  # GREEN by default

Override default returns per-method:

from spanforge.testing_mocks import MockSFPII

mock = MockSFPII()
mock.configure_response("scan_text", {"clean": False, "entities": ["EMAIL"]})
result = mock.scan_text("test")
assert not result["clean"]

Run spanforge doctor for a full environment diagnostic:

spanforge doctor
# ✅ Config valid
# ✅ All 11 services reachable
# ✅ API key not expired
# ✅ PII/secrets patterns loaded
# ✅ Gate YAML valid

Regulatory framework coverage

The ComplianceMappingEngine maps your telemetry events to specific regulatory clauses:

Framework	Clause	Mapped events	What it proves
GDPR	Art. 22	`consent.*`, `hitl.*`	Automated decisions have consent + human oversight
GDPR	Art. 25	`llm.redact.*`, `consent.*`	Privacy by design — PII handled before export
EU AI Act	Art. 13	`explanation.*`	AI decisions are transparent and explainable
EU AI Act	Art. 14	`hitl.*`, `consent.*`	Human oversight of high-risk AI
EU AI Act	Annex IV.5	`llm.guard.*`, `llm.audit.*`, `hitl.*`	Technical documentation — safety + oversight
SOC 2	CC6.1	`llm.audit.*`, `llm.trace.*`, `model_registry.*`	Logical access controls + model governance
NIST AI RMF	MAP 1.1	`llm.trace.*`, `llm.eval.*`, `model_registry.*`, `explanation.*`	Risk identification and mapping
HIPAA	§164.312	`llm.redact.*`, `llm.audit.*`	PHI access controls and audit
ISO 42001	A.5–A.10	Full event set	AI management system controls
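The table can be read as a set of wildcard patterns per clause. The sketch below matches observed event types against three of those rows with `fnmatch` and labels each clause SATISFIED, PARTIAL, or GAP. The rule used here (every pattern matched, some matched, none matched) is an assumption chosen for illustration; the ComplianceMappingEngine's actual scoring is not documented in this section.

```python
from fnmatch import fnmatchcase

# Three rows copied from the table above.
CLAUSE_PATTERNS = {
    ("GDPR", "Art. 22"): ["consent.*", "hitl.*"],
    ("EU AI Act", "Art. 13"): ["explanation.*"],
    ("EU AI Act", "Annex IV.5"): ["llm.guard.*", "llm.audit.*", "hitl.*"],
}


def clause_status(event_types):
    """Label each clause by how many of its patterns have at least one event."""
    status = {}
    for clause, patterns in CLAUSE_PATTERNS.items():
        matched = [
            p for p in patterns if any(fnmatchcase(e, p) for e in event_types)
        ]
        if len(matched) == len(patterns):
            status[clause] = "SATISFIED"
        elif matched:
            status[clause] = "PARTIAL"
        else:
            status[clause] = "GAP"
    return status


observed = ["consent.granted", "hitl.reviewed", "llm.guard.output.blocked"]
status = clause_status(observed)
for clause, verdict in status.items():
    print(clause, verdict)
# ('GDPR', 'Art. 22') SATISFIED
# ('EU AI Act', 'Art. 13') GAP
# ('EU AI Act', 'Annex IV.5') PARTIAL
```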

Compliance event types

spanforge defines purpose-built event types for AI governance — these aren't afterthought log messages, they are first-class compliance primitives:

Category	Event types	Purpose
Consent	`consent.granted`, `consent.revoked`, `consent.violation`	Track user consent for automated processing
Human-in-the-Loop	`hitl.queued`, `hitl.reviewed`, `hitl.escalated`, `hitl.timeout`	Prove human oversight of AI decisions
Model Registry	`model_registry.registered`, `model_registry.deprecated`, `model_registry.retired`	Govern model lifecycle and risk
Explainability	`explanation.generated`	Attach explanations to AI decisions
Guardrails	`llm.guard.*`	Safety classifier outputs and block decisions
PII	`llm.redact.*`	Audit trail of what PII was found and removed
Audit	`llm.audit.*`	Access logs and chain-of-custody records
Traces	`llm.trace.*`	Model calls, tokens, latency, cost

Core capabilities

Tamper-proof audit chains

Every event is HMAC-SHA256 signed and chained to its predecessor, so each signature depends on the one before it (the same principle as a hash chain). Alter one event and verification fails for that event and every event after it.

from spanforge.signing import AuditStream, verify_chain

stream = AuditStream(org_secret="your-secret")
for event in events:
    stream.append(event)

result = verify_chain(stream.events, org_secret="your-secret")
assert result.valid  # any tampering → False
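For intuition about why tampering is detectable, here is a minimal standard-library sketch of an HMAC chain. It is not the `spanforge.signing` implementation: the record layout, the `GENESIS` sentinel, and the way the previous signature is mixed into the MAC input are assumptions made for illustration.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed sentinel for "no previous event"


def sign_event(payload, prev_sig, key):
    """Sign the payload together with the previous signature."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    mac = hmac.new(key, f"{prev_sig}|{body}".encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "prev": prev_sig, "sig": mac}


def build_chain(payloads, key):
    chain, prev = [], GENESIS
    for payload in payloads:
        record = sign_event(payload, prev, key)
        chain.append(record)
        prev = record["sig"]
    return chain


def verify(chain, key):
    """Recompute every signature in order; any edit or reordering fails."""
    prev = GENESIS
    for record in chain:
        expected = sign_event(record["payload"], prev, key)["sig"]
        if record["prev"] != prev or not hmac.compare_digest(record["sig"], expected):
            return False
        prev = record["sig"]
    return True


key = b"your-secret"
chain = build_chain([{"score": 0.92}, {"score": 0.88}, {"score": 0.95}], key)
print(verify(chain, key))          # True

chain[0]["payload"]["score"] = 0.10  # tamper with the first event
print(verify(chain, key))          # False
```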

PII redaction

Strip personal data before events leave your application boundary. Deep scanning applies Luhn validation to credit card numbers and Verhoeff validation to Aadhaar numbers, along with SSN range validation (_is_valid_ssn), calendar validation for dates of birth (_is_valid_date), and built-in patterns for date_of_birth and street address.

from spanforge.redact import RedactionPolicy, Sensitivity

policy = RedactionPolicy(min_sensitivity=Sensitivity.PII, redacted_by="policy:gdpr-v1")
result = policy.apply(event)
# All PII fields → "[REDACTED by policy:gdpr-v1]"
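Checksum validation is what keeps a card-number detector from redacting every 16-digit order ID. The sketch below pairs a candidate regex with a Luhn check, written with the standard library only. The regex, the function names, and the reuse of the redaction label from the example above are illustrative assumptions, not the `spanforge.redact` internals.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return bool(digits) and total % 10 == 0


def redact_cards(text, label="[REDACTED by policy:gdpr-v1]"):
    """Replace only those candidates that pass the Luhn check."""
    return CARD_RE.sub(
        lambda m: label if luhn_valid(m.group()) else m.group(), text
    )


print(redact_cards("Card 4111 1111 1111 1111 on file"))
# Card [REDACTED by policy:gdpr-v1] on file
print(redact_cards("Order 1234 5678 9012 3456 shipped"))
# Order 1234 5678 9012 3456 shipped
```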

Model registry governance

Register models with ownership and risk metadata. Attestations automatically warn when models are deprecated, retired, or unregistered.

from spanforge.model_registry import ModelRegistry

registry = ModelRegistry()
registry.register("gpt-4o", owner="ml-platform", risk_tier="high")
registry.deprecate("gpt-3.5-turbo", reason="Successor available")

# Evidence packages now include:
#   model_owner: "ml-platform"
#   model_risk_tier: "high"
#   model_status: "active"
#   model_warnings: []  (or ["model 'gpt-3.5-turbo' is deprecated"])

Explainability tracking

Measure what percentage of your AI decisions have explanations attached:

from spanforge.explain import generate_explanation

explanation = generate_explanation(
    decision_event_id="evt_01HX...",
    method="feature_importance",
    content="Top factors: credit_score (0.42), income (0.31)...",
)
# explanation_coverage_pct in attestations = explained / total decisions
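The coverage figure in the comment above (explained / total decisions) is simple arithmetic. A small sketch follows, assuming each explanation record carries the `decision_event_id` it refers to; the function name, the plain-dict records, and returning 0.0 when there are no decisions are illustrative assumptions.

```python
def explanation_coverage_pct(decision_ids, explanations):
    """Percentage of decisions that have at least one explanation attached."""
    decisions = set(decision_ids)
    if not decisions:
        return 0.0  # assumed convention for "nothing to explain"
    explained = {e["decision_event_id"] for e in explanations} & decisions
    return 100.0 * len(explained) / len(decisions)


decision_ids = ["evt_a", "evt_b", "evt_c", "evt_d"]
explanations = [
    {"decision_event_id": "evt_a", "method": "feature_importance"},
    {"decision_event_id": "evt_b", "method": "feature_importance"},
    {"decision_event_id": "evt_b", "method": "feature_importance"},  # duplicate
    {"decision_event_id": "evt_c", "method": "feature_importance"},
]
print(explanation_coverage_pct(decision_ids, explanations))  # 75.0
```

Counting distinct decisions rather than explanation records keeps a decision with several explanations from inflating the percentage.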

GDPR subject erasure

Right-to-erasure with tombstone events that preserve audit chain integrity:

spanforge audit erase audit.jsonl --subject-id user123

Auto-instrumentation

Patch supported providers once — compliance data flows automatically:

# Instrument all installed providers in one call
import spanforge.auto
spanforge.auto.setup()

# Or patch individually
from spanforge.integrations import openai as sf_openai
sf_openai.patch()    # every OpenAI call → signed, redacted, compliant
sf_openai.unpatch()  # restore original behaviour

Supported providers: OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, Groq, Together AI

Supported frameworks: LangChain, LlamaIndex, CrewAI

Using SpanForge alongside OpenTelemetry

spanforge is not an OTel replacement. OTel handles performance monitoring. spanforge adds the compliance layer OTel cannot provide — audit chains, PII redaction, consent tracking, and regulator-ready attestations.

# Your existing OTel pipeline stays untouched
from opentelemetry.sdk.trace import TracerProvider
provider = TracerProvider()

# Add spanforge's compliance layer alongside it
import spanforge
spanforge.configure(mode="otel_passthrough")

# Dual-stream: OTel for monitoring, spanforge for compliance
spanforge.configure(exporters=["otel_passthrough", "jsonl"], endpoint="audit.jsonl")

Export

Ship compliance events to any backend:

from spanforge.stream import EventStream
from spanforge.export.jsonl import JSONLExporter
from spanforge.export.otlp import OTLPExporter
from spanforge.export.datadog import DatadogExporter
from spanforge.export.grafana import GrafanaLokiExporter
from spanforge.export.cloud import CloudExporter
from spanforge.export.siem_splunk import SplunkHECExporter
from spanforge.export.siem_syslog import SyslogExporter

import asyncio


async def main(events):
    # `await` is only valid inside a coroutine, so the drains live in main()
    stream = EventStream(events)

    await stream.drain(JSONLExporter("audit.jsonl"))                    # local file
    await stream.drain(OTLPExporter("http://collector:4318/v1/traces")) # OTel collector
    await stream.drain(DatadogExporter(service="my-app"))               # Datadog APM
    await stream.drain(GrafanaLokiExporter(url="http://loki:3100"))     # Grafana Loki
    await stream.drain(CloudExporter(api_key="sf_live_xxx"))            # spanforge Cloud
    await stream.drain(SplunkHECExporter())                             # Splunk HEC (env-var config)
    await stream.drain(SyslogExporter())                                # Syslog/CEF (env-var config)


asyncio.run(main(events))  # `events` is the list of events you collected

Fan-out routing for compliance alerting:

from spanforge.export.webhook import WebhookExporter

# Route guardrail violations to Slack
await stream.route(
    WebhookExporter("https://hooks.slack.com/your-webhook"),
    predicate=lambda e: e.event_type == "llm.guard.output.blocked",
)
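Predicate-based routing amounts to filtering events before they reach an exporter. The asyncio sketch below shows the pattern with an in-memory exporter. `ListExporter`, `route`, and the plain-dict events are stand-ins invented for this example and do not reflect the EventStream or WebhookExporter implementations.

```python
import asyncio


class ListExporter:
    """In-memory stand-in for a webhook exporter."""

    def __init__(self):
        self.received = []

    async def export(self, event):
        self.received.append(event)


async def route(events, exporter, predicate):
    """Send only the events that satisfy the predicate; return how many were sent."""
    sent = 0
    for event in events:
        if predicate(event):
            await exporter.export(event)
            sent += 1
    return sent


events = [
    {"event_type": "llm.trace.span.completed"},
    {"event_type": "llm.guard.output.blocked"},
    {"event_type": "consent.granted"},
]
slack = ListExporter()
sent = asyncio.run(
    route(events, slack, lambda e: e["event_type"] == "llm.guard.output.blocked")
)
print(sent, slack.received)  # 1 [{'event_type': 'llm.guard.output.blocked'}]
```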

CLI

38 commands — all CI-pipeline ready:

# Compliance
spanforge compliance generate --model-id gpt-4o --framework eu_ai_act \
  --from 2026-01-01 --to 2026-03-31 --events-file events.jsonl
spanforge compliance check --framework eu_ai_act \
  --from 2026-01-01 --to 2026-03-31 --events-file events.jsonl
spanforge compliance validate-attestation evidence.json
spanforge compliance status --events-file events.jsonl   # compliance summary JSON

# Audit chain
spanforge audit-chain events.jsonl             # verify chain integrity
spanforge audit erase events.jsonl --subject-id user123  # GDPR erasure
spanforge audit rotate-key events.jsonl        # key rotation
spanforge audit verify --input events.jsonl    # verify integrity
spanforge audit extract events.jsonl --type llm.trace.span.completed --since 2026-01-01  # filter & extract
spanforge audit cec generate --project-id my-agent --sign  # CEC compliance bundle ZIP
spanforge audit gap-finder events.jsonl --threshold-minutes 30  # detect time gaps + missing fields

# Privacy & Secrets
spanforge scan events.jsonl --fail-on-match    # CI-gate PII scan
spanforge secrets scan <file>                  # scan file for secrets (exit 0=clean, 1=found)
spanforge secrets scan <file> --format sarif   # SARIF output for GitHub Code Scanning
spanforge secrets scan <file> --redact         # print redacted version to stdout

# Event generation
spanforge event create --type llm.trace.span.completed --count 10 --format jsonl  # generate test events

# Validation
spanforge check                                # 9-step end-to-end health check (--verbose for timing)
spanforge check-compat events.json             # v2.0 compatibility
spanforge validate events.jsonl                # JSON Schema validation
spanforge validate events.jsonl --report detailed --format json  # detailed report
spanforge validate --dataset training.jsonl                    # scan JSONL training data for PII
spanforge validate --dataset training.jsonl --fail-on-violations  # exit 1 if PII/schema issues found
spanforge validate --dataset training.jsonl --required-fields prompt,response --format json  # required fields + JSON output

# Configuration
spanforge config validate                      # validate .halluccheck.toml (auto-discover)
spanforge config validate --file path/to.toml  # validate specific config file

# Analysis
spanforge stats events.jsonl                   # counts, tokens, cost
spanforge stats events.jsonl --group-by model --format json  # grouped stats, JSON output
spanforge inspect <EVENT_ID> events.jsonl      # pretty-print one event
spanforge inspect <EVENT_ID> events.jsonl --format csv  # CSV export
spanforge cost events.jsonl                    # token spend report
spanforge cost run --run-id <id> --input events.jsonl  # per-run cost report

# Evaluation
spanforge eval save --input events.jsonl --output dataset.jsonl  # extract eval dataset
spanforge eval run --file dataset.jsonl --scorers faithfulness,pii_leakage  # run scorers

# Migration
spanforge migrate events.jsonl --sign          # v1→v2 migration
spanforge migrate-langsmith export.jsonl       # LangSmith → SpanForge conversion
spanforge list-deprecated                      # deprecated event types
spanforge migration-roadmap                    # v2 migration plan
spanforge check-consumers                      # consumer compatibility

# CI/CD Gate Pipeline
spanforge gate run gates/ci-pipeline.yaml               # run YAML gate pipeline (exit 1 = blocking gate failed)
spanforge gate run gates/ci-pipeline.yaml --format json  # JSON output for CI dashboards
spanforge gate evaluate schema-validation --payload event.json  # evaluate single gate
spanforge gate trust-gate --project-id my-agent         # composite trust gate check
spanforge gate audit events.jsonl --fail-on-violation   # policy audit of gate records (CI gate)

# T.R.U.S.T. Scorecard
spanforge trust scorecard --project-id my-agent         # five-pillar trust scorecard (text table)
spanforge trust badge --project-id my-agent             # SVG badge to stdout
spanforge trust gate --project-id my-agent              # composite trust gate (exit 1 = below threshold)

# Enterprise (Phase 11)
spanforge enterprise status                            # enterprise subsystem status JSON
spanforge enterprise health                            # enterprise health check (all services)

# Security (Phase 11)
spanforge security owasp                               # OWASP API Security Top 10 audit
spanforge security scan                                # full security scan (deps + static + secrets-in-logs)
spanforge security threat-model                        # STRIDE threat model summary
spanforge security audit-logs --path /var/log/myapp/   # secrets-in-logs detection

# Developer Experience (Phase 12)
spanforge doctor                                       # environment diagnostics (config, services, keys, patterns)

# Viewer
spanforge serve                                # local SPA trace viewer
spanforge ui                                   # standalone HTML viewer
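The `audit gap-finder` command listed above is described as detecting time gaps between events. As a rough illustration of that idea (not spanforge's implementation), the sketch below reads JSONL lines and reports consecutive timestamps that are further apart than a threshold. The `timestamp` field name and the `find_time_gaps` helper are assumptions made for this example.

```python
import json
from datetime import datetime, timedelta


def find_time_gaps(lines, threshold_minutes=30, field="timestamp"):
    """Return (previous, current) timestamp pairs separated by more than the threshold.

    `lines` is an iterable of JSONL strings. The `timestamp` field name is an
    assumption for this sketch, not a documented spanforge schema detail.
    """
    threshold = timedelta(minutes=threshold_minutes)
    stamps = sorted(
        datetime.fromisoformat(json.loads(line)[field])
        for line in lines
        if line.strip()
    )
    # Compare each timestamp with its successor.
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > threshold]


# Hypothetical events; only the timestamps matter here.
sample = [
    '{"event_id": "e1", "timestamp": "2026-01-01T10:00:00+00:00"}',
    '{"event_id": "e2", "timestamp": "2026-01-01T10:10:00+00:00"}',
    '{"event_id": "e3", "timestamp": "2026-01-01T11:05:00+00:00"}',
]
gaps = find_time_gaps(sample, threshold_minutes=30)
for start, end in gaps:
    print(f"gap of {end - start} between {start.isoformat()} and {end.isoformat()}")
```

Sorting first keeps the check independent of the order in which events were written to the file.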

Event namespaces

Every event carries a typed payload. The built-in namespaces:

Prefix	Dataclass	What it records
`consent.*`	`ConsentPayload`	User consent grants, revocations, violations
`hitl.*`	`HITLPayload`	Human-in-the-loop review, escalation, timeout
`model_registry.*`	`ModelRegistryEntry`	Model registration, deprecation, retirement
`explanation.*`	`ExplainabilityRecord`	Explainability records for AI decisions
`llm.trace.*`	`SpanPayload`	Model calls — tokens, latency, cost (frozen v2)
`llm.guard.*`	`GuardPayload`	Safety classifier outputs, block decisions
`llm.redact.*`	`RedactPayload`	PII audit — what was found and removed
`llm.audit.*`	`AuditChainPayload`	Access logs and chain-of-custody
`llm.eval.*`	`EvalScenarioPayload`	Scores, labels, evaluator identity
`llm.cost.*`	`CostPayload`	Per-call cost in USD
`llm.cache.*`	`CachePayload`	Cache hit/miss, backend, TTL
`llm.prompt.*`	`PromptPayload`	Prompt template version, rendered text
`llm.fence.*`	`FencePayload`	Topic constraints, allow/block lists
`llm.diff.*`	`DiffPayload`	Prompt/response delta between events
`llm.template.*`	`TemplatePayload`	Template registry metadata
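
To make the prefix-to-dataclass mapping above concrete, here is a minimal sketch of how an event type string could be resolved to a payload class by longest matching prefix. The two dataclasses, their fields, and the event type `llm.cache.hit` are simplified stand-ins invented for this example; they are not spanforge's actual `CostPayload` or `CachePayload` definitions.

```python
from dataclasses import dataclass
from typing import Optional, Type


@dataclass(frozen=True)
class DemoCostPayload:
    """Hypothetical stand-in for a cost payload."""
    cost_usd: float


@dataclass(frozen=True)
class DemoCachePayload:
    """Hypothetical stand-in for a cache payload."""
    hit: bool
    ttl_seconds: int


# Prefix registry, mirroring the shape of the table above.
PAYLOAD_BY_PREFIX = {
    "llm.cost.": DemoCostPayload,
    "llm.cache.": DemoCachePayload,
}


def resolve_payload_class(event_type: str) -> Optional[Type]:
    """Pick the payload class whose prefix matches; the longest prefix wins."""
    matches = [p for p in PAYLOAD_BY_PREFIX if event_type.startswith(p)]
    if not matches:
        return None
    return PAYLOAD_BY_PREFIX[max(matches, key=len)]


cls = resolve_payload_class("llm.cache.hit")
payload = cls(hit=True, ttl_seconds=300)
print(type(payload).__name__, payload)
```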

Architecture

spanforge/
+-- core/
│   +-- compliance_mapping.py  — ComplianceMappingEngine, evidence packages, attestations
+-- compliance/                — Programmatic compliance test suite
+-- signing.py                 — HMAC audit chains, key management, multi-tenant KeyResolver
+-- redact.py                  — PII detection + redaction policies
+-- model_registry.py          — Model lifecycle governance
+-- explain.py                 — Explainability records
+-- consent.py                 — Consent boundary events
+-- hitl.py                    — Human-in-the-loop events
+-- governance.py              — Policy-based event gating
+-- event.py                   — Event envelope
+-- types.py                   — EventType enum (consent.*, hitl.*, model_registry.*, explanation.*, llm.*)
+-- config.py                  — configure() / get_config()
+-- _span.py                   — Span, AgentRun, AgentStep context managers
+-- _trace.py                  — Trace + start_trace()
+-- _tracer.py                 — Top-level tracing entry point
+-- _stream.py                 — Internal dispatch: sample → redact → sign → export
+-- _store.py                  — TraceStore ring buffer
+-- _hooks.py                  — HookRegistry (lifecycle hooks)
+-- _server.py                 — HTTP server (/traces, /compliance/summary)
+-- _cli.py                    ← 38 CLI sub-commands
+-- workflow.py                — Human-in-the-Loop Workflow Engine (CORE-15); WorkflowEngine, WorkflowType, state machine, SLA escalation
+-- cost.py                    — CostTracker, BudgetMonitor, @budget_alert
+-- cache.py                   — SemanticCache, @cached decorator
+-- retry.py                   — @retry, FallbackChain, CircuitBreaker
+-- toolsmith.py               — @tool, ToolRegistry
+-- http.py                    — Zero-dependency OpenAI-compatible HTTP client
+-- io.py                      — JSONL read/write/append utilities
+-- plugins.py                 — Entry-point plugin discovery
+-- schema.py                  — Lightweight zero-dependency JSON Schema validator
+-- regression.py              — Pass/fail regression detector
+-- stats.py                   — Percentile, latency summary utilities
+-- presidio_backend.py        — Optional Presidio-powered PII detection
+-- _ansi.py                   — ANSI color helpers (NO_COLOR aware)
+-- lint/                      — AST-based instrumentation linter (AO000–AO005)
+-- export/                    — JSONL, OTLP, Webhook, Datadog, Grafana Loki, Cloud, Redis, Splunk HEC, Syslog/CEF
+-- integrations/              — OpenAI, Anthropic, Gemini, Bedrock, LangChain, LlamaIndex, CrewAI, Ollama, Groq, Together
+-- namespaces/                — Typed payload dataclasses
+-- gate.py                    — GateRunner YAML pipeline engine, 6 gate executors, artifact store (Phase 8)
+-- sdk/                       — Service SDK clients (sf-identity, sf-pii, sf-secrets, sf-audit, sf-cec, sf-observe, sf-alert, sf-gate, sf-trust, sf-enterprise, sf-security)
│   +-- explain.py             —   SFExplainClient – ExplainModelType enum (LLM/RAG/MULTI_AGENT/CLASSIFIER/EMBEDDING), signed explanations, retry+timeout emit (Phase 1B)
│   +-- scope.py               —   SFScopeClient – ACTION_CATEGORIES (5 categories), circuit-breaker fail-secure, resolve_action_category() (Phase 1B)
│   +-- rbac.py                —   SFRBACClient – STANDARD_ROLE_MATRIX (10 actor types), register_actor_from_yaml(), register_actor_from_jwt() (Phase 1C)
│   +-- identity.py            —   SFIdentityClient – keys, JWT, TOTP, MFA, magic-link
│   +-- pii.py                 —   SFPIIClient – scan, redact, anonymize
│   +-- secrets.py             —   SFSecretsClient – 20-pattern secret scanning, SARIF output
│   +-- audit.py               —   SFAuditClient – HMAC-chained records, T.R.U.S.T. scorecard, Article 30, BYOS
│   +-- cec.py                 —   SFCECClient – signed CEC ZIP bundles, clause mapping, DPA generation (Phase 5)
│   +-- observe.py             —   SFObserveClient – span export, OTel GenAI attrs, W3C TraceContext, sampling (Phase 6)
│   +-- alert.py               —   SFAlertClient – topic-based routing, dedup, escalation policy, 6 sink integrations (Phase 7)
│   +-- gate.py                —   SFGateClient – YAML pipeline runner, evaluate(), evaluate_prri(), trust-gate, artifact management (Phase 8)
│   +-- config.py              —   .halluccheck.toml parser, SFConfigBlock, SFServiceToggles, SFLocalFallbackConfig, validate_config() (Phase 9)
│   +-- registry.py            —   ServiceRegistry singleton, health checks, background checker, status_response() (Phase 9)
│   +-- fallback.py            —   8 local fallback implementations: pii, secrets, audit, observe, alert, identity, gate, cec (Phase 9)
│   +-- trust.py               —   SFTrustClient – T.R.U.S.T. five-pillar scorecard, SVG badge, history time-series, configurable weights (Phase 10)
│   +-- pipelines.py           —   5 HallucCheck pipeline integrations: score, bias, monitor, risk, benchmark (Phase 10)
│   +-- enterprise.py          —   SFEnterpriseClient – multi-tenancy, encryption, air-gap, health probes (Phase 11)
│   +-- security.py            —   SFSecurityClient – OWASP audit, STRIDE threat model, dependency/static scanning, secrets-in-logs (Phase 11)
│   +-- testing_mocks.py       —   11 mock service clients, _MockBase, mock_all_services() context manager (Phase 12)
│   +-- _base.py               —   SFClientConfig, SFServiceClient, circuit breaker, sandbox mode (Phase 12)
│   +-- _types.py              —   SecretStr, APIKeyBundle, JWTClaims, BundleResult, ClauseMapEntry, ExportResult, Annotation, AlertSeverity, …
│   +-- _exceptions.py         —   SFError hierarchy (incl. SFConfigError, SFConfigValidationError, SFStartupError, SFServiceUnavailableError, SFTrustComputeError, SFPipelineError)
│   +-- __init__.py            —   sf_identity / sf_pii / sf_secrets / sf_audit / sf_cec / sf_observe / sf_alert / sf_gate / sf_trust / sf_enterprise / sf_security / sf_rag / sf_feedback singletons + configure()
+-- migrate.py                 — Schema migration (v1 → v2), LangSmith migration
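
The `_stream.py` entry above describes a dispatch order of sample, redact, sign, export, and `signing.py` mentions HMAC audit chains. The sketch below shows one way such a pipeline could look using only the standard library. It is an illustration under stated assumptions, not the library's code: the function names, the email-only redaction regex, and the record layout (`prev_sig`, `sig`) are all invented for this example.

```python
import hashlib
import hmac
import json
import random
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
GENESIS = "0" * 64  # placeholder "previous signature" for the first record


def sign(event, prev_sig, key):
    """HMAC-SHA256 over the previous signature plus the canonical event JSON."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev_sig + body).encode(), hashlib.sha256).hexdigest()


def dispatch(events, key, sample_rate=1.0, rng=random.random):
    """sample -> redact -> sign -> export (export here just returns a list)."""
    exported, prev_sig = [], GENESIS
    for event in events:
        if rng() >= sample_rate:  # sample: drop a fraction of events
            continue
        clean = {k: EMAIL_RE.sub("[REDACTED]", v) if isinstance(v, str) else v
                 for k, v in event.items()}  # redact email-like strings
        sig = sign(clean, prev_sig, key)  # sign, chained to the previous record
        exported.append({"event": clean, "prev_sig": prev_sig, "sig": sig})
        prev_sig = sig
    return exported


def verify_chain(records, key):
    """Recompute every signature; any edit or reordering breaks the chain."""
    prev_sig = GENESIS
    for rec in records:
        expected = sign(rec["event"], prev_sig, key)
        if rec["prev_sig"] != prev_sig or not hmac.compare_digest(rec["sig"], expected):
            return False
        prev_sig = rec["sig"]
    return True


key = b"demo-key-not-for-production"
chain = dispatch([{"prompt": "mail alice@example.com"}, {"prompt": "hello"}], key)
print(verify_chain(chain, key))            # chain as written
chain[0]["event"]["prompt"] = "tampered"   # simulate after-the-fact editing
print(verify_chain(chain, key))            # verification should now fail
```

Redacting before signing means the signature covers only the privacy-safe form of the event, so the chain can be verified later without access to the raw data.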

What is inside the box

Module	What it does	For whom
Compliance & Governance
`spanforge.compliance`	`ComplianceMappingEngine` maps telemetry to regulatory frameworks (EU AI Act, ISO 42001, NIST AI RMF, GDPR, SOC 2, HIPAA). Generates evidence packages with HMAC-signed attestations. Consent, HITL, model registry, and explainability events integrated into clause mappings. Attestations include model owner, risk tier, status, warnings, and `explanation_coverage_pct`. Also: programmatic v2.0 compatibility checks — no pytest required.	Compliance / legal / platform teams
`spanforge.signing`	HMAC-SHA256 event signing, tamper-evident audit chains, key strength validation, key expiry checks, environment-isolated key derivation, multi-tenant `KeyResolver` protocol, and `AsyncAuditStream`	Security / compliance teams
`spanforge.redact`	PII detection, sensitivity levels, redaction policies, deep `scan_payload()` with Luhn / Verhoeff / SSN-range / date-calendar validation, built-in `date_of_birth` and `address` patterns, and `contains_pii()` / `assert_redacted()` with raw string scanning	Data privacy / GDPR teams
`spanforge.secrets`	`SecretsScanner` — 20-pattern registry (7 spec-defined + 13 industry-standard), Shannon entropy scoring, three-tier confidence model, zero-tolerance auto-block for 10 high-risk types, `SecretsScanResult` with `to_dict()` and SARIF 2.1.0 output, span deduplication, configurable allowlist	Security / DevSecOps teams
`spanforge.governance`	Policy-based event gating — block prohibited types, warn on deprecated usage, enforce custom rules	Platform / compliance teams
Instrumentation & Tracing
`spanforge.event`	The core `Event` envelope — the one structure all tools share	Everyone
`spanforge.types`	All built-in event types — compliance events (`consent.*`, `hitl.*`, `model_registry.*`, `explanation.*`) and telemetry events (`llm.trace.*`, `llm.guard.*`, etc.)	Everyone
`spanforge._span`	Span, AgentRun, AgentStep context managers. `contextvars`-based async/thread-safe propagation. `async with`, `span.add_event()`, `span.set_timeout_deadline()`	App developers
`spanforge._trace`	`Trace` + `start_trace()` — high-level tracing entry point; accumulates child spans	App developers
`spanforge.config`	`configure()` and `get_config()` — signing key, redaction policy, exporters, sample rate	Everyone
Export & Integration
`spanforge.export`	Ship events to JSONL, HTTP webhooks, OTLP collectors, Datadog APM, Grafana Loki, Splunk HEC, Syslog/CEF, Redis, or spanforge Cloud	Infra / compliance teams
`spanforge.export.siem_splunk`	`SplunkHECExporter` — thread-safe batched Splunk HTTP Event Collector exporter; env-var config; HEC token never logged; `SplunkHECError` on delivery failure	Security / compliance teams
`spanforge.export.siem_syslog`	`SyslogExporter` — RFC 5424 and ArcSight CEF exporter over UDP or TCP; severity derived from event type; CEF extension values properly escaped; `SyslogExporterError` on socket failure	Security / compliance teams
`spanforge.stream`	Fan-out router — one `drain()` call reaches multiple backends; Kafka source	Platform engineers
`spanforge.integrations`	Auto-instrumentation for OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI, Groq, Ollama, Together	App developers
`spanforge.auto`	`setup()` auto-patches all installed LLM integrations; `teardown()` cleanly unpatches	App developers
Developer Tools
`spanforge.cost`	`CostTracker`, `BudgetMonitor`, `@budget_alert` — track and alert on token spend	App developers / FinOps
`spanforge.cache`	`SemanticCache` + `@cached` — deduplicate LLM calls via cosine similarity; `InMemoryBackend`, `SQLiteBackend`, `RedisBackend`	App developers / FinOps
`spanforge.retry`	`@retry`, `FallbackChain`, `CircuitBreaker`, `CostAwareRouter` — resilient LLM routing with compliance events	App developers / SREs
`spanforge.toolsmith`	`@tool` + `ToolRegistry` — register functions as typed tools; render JSON schemas for function-calling APIs	App developers
`spanforge.lint`	AST-based instrumentation linter; AO001–AO005 codes; flake8 plugin; CLI	All teams / CI
Utilities (v2.0.2+)
`spanforge.http`	`chat_completion()` — zero-dependency, synchronous OpenAI-compatible HTTP client with retry and back-off	App developers
`spanforge.io`	`read_jsonl()`, `write_jsonl()`, `append_jsonl()`, `write_events()`, `read_events()` — JSONL I/O utilities	Everyone
`spanforge.schema`	Lightweight zero-dependency JSON Schema validator — `validate()`, `validate_strict()`	Tool authors / CI
`spanforge.regression`	`RegressionDetector` — per-case pass/fail regression detection between baseline and current eval runs	ML / eval teams
`spanforge.stats`	`percentile()`, `latency_summary()` — statistical utilities for eval and performance analysis	Analytics engineers
`spanforge.plugins`	`discover(group)` — entry-point plugin discovery across Python 3.9–3.12+	Plugin authors
`spanforge.presidio_backend`	Optional Presidio-powered PII detection backend — `presidio_scan_payload()` with standard `PIIScanResult`	Data privacy teams
`spanforge.eval`	Built-in scorers: `FaithfulnessScorer`, `RefusalDetectionScorer`, `PIILeakageScorer`, `BehaviourScorer` base class	ML / eval teams
`spanforge.debug`	`print_tree()`, `summary()`, `visualize()` — terminal tree, stats dict, HTML Gantt timeline	App developers
`spanforge.metrics`	`aggregate()` — success rates, latency percentiles, token totals, cost breakdowns	Analytics engineers
`spanforge.testing`	`MockExporter`, `capture_events()`, `assert_event_schema_valid()`, `trace_store()`	Test authors
`spanforge.testing_mocks`	11 drop-in mock service clients (`MockSFIdentity`, `MockSFPII`, `MockSFSecrets`, `MockSFAudit`, `MockSFObserve`, `MockSFGate`, `MockSFCEC`, `MockSFAlert`, `MockSFTrust`, `MockSFEnterprise`, `MockSFSecurity`). `mock_all_services()` context manager patches all 11 singletons. `_MockBase` with `.calls` recording and `.configure_response()`. 100% test coverage. (Phase 12, v2.0.11+)	Test authors / all teams
`spanforge.validate`	JSON Schema validation against the published v2.0 schema	All teams
`spanforge.namespaces`	Typed payload dataclasses for all built-in event namespaces	Tool authors
`spanforge.models`	Optional Pydantic v2 models for validated schemas	API / backend teams
`spanforge.consumer`	Declare schema-namespace dependencies; fail fast at startup if version requirements are not met	Platform teams
`spanforge.deprecations`	Per-event-type deprecation notices at runtime	Library maintainers
`spanforge._hooks`	Lifecycle hooks: `@hooks.on_llm_call`, `@hooks.on_tool_call`, `@hooks.on_agent_start` (sync + async)	App developers / platform
`spanforge._store`	`TraceStore` ring buffer — `get_trace()`, `list_tool_calls()`, `list_llm_calls()`	Platform / tooling engineers
`spanforge._cli`	CLI sub-commands including eval, compliance status, migrate-langsmith, cost run, and more	DevOps / CI teams
Service SDK (v2.0.3+)
`spanforge.sdk.identity`	`SFIdentityClient` — API key lifecycle (`issue_api_key`, `rotate_api_key`, `revoke_api_key`), session JWT (HS256 stdlib / RS256 remote), magic-link issuance + single-use exchange, TOTP enrolment + verification (RFC 6238, 6-digit, 30 s), backup codes, per-key IP allowlist, sliding-window rate limiting, brute-force lockout. Fully local-mode capable — no external service required.	Security / platform teams
`spanforge.sdk.pii`	`SFPIIClient` — `scan_text()`, `anonymise()`, `scan_batch()`, `apply_pipeline_action()`, `get_status()`, `erase_subject()` (GDPR Art. 17), `export_subject_data()` (CCPA DSAR), `safe_harbor_deidentify()` (HIPAA 18-PHI), `audit_training_data()` (EU AI Act Art. 10), `get_pii_stats()`. PIPL patterns for Chinese national ID / mobile / bank card. Pipeline action routing (`flag` / `redact` / `block`) with confidence threshold gate. Scan results never include raw PII — only type labels, field paths, and SHA-256 hashes. Runs locally or delegates to a remote sf-pii service.	Data privacy / GDPR teams
`spanforge.sdk.secrets`	`SFSecretsClient` — `scan(text)` → `SecretsScanResult`, `scan_batch(texts)` with asyncio parallel execution. 20-pattern registry covering all spec-required types plus 13 industry-standard additions. Three-tier confidence model (0.75 / 0.90 / 0.97). Zero-tolerance auto-block for 10 high-risk secret types. SARIF 2.1.0 output. Runs fully locally — no external service required.	Security / DevSecOps teams
`spanforge.sdk.audit`	`SFAuditClient` — `append(record, schema_key)` with HMAC-SHA256 chaining, `query()` SQLite index with full-text and date-range filters, `verify_chain()` tamper detection, `get_trust_scorecard()` T.R.U.S.T. dimensions (hallucination · PII hygiene · secrets hygiene · gate pass-rate · compliance posture), `generate_article30_record()` GDPR Article 30 RoPA, `export()` JSONL/CSV/compressed, `sign()`, `get_status()`. BYOS routing via `SPANFORGE_AUDIT_BYOS_PROVIDER` (S3 / Azure / GCS / R2). Strict-schema mode, configurable retention years, optional SQLite persistence. 123 tests, 85 % coverage, mypy strict clean.	Compliance / security / audit teams
`spanforge.sdk.observe`	`SFObserveClient` — `emit_span(name, attributes)` builds OTel-compliant spans with W3C traceparent / baggage injection and OTel GenAI semantic attributes; `export_spans(spans, receiver_config=...)` routes to `local` / `otlp` / `datadog` / `grafana` / `splunk` / `elastic`; `add_annotation(event_type, payload)` / `get_annotations(event_type, from_dt, to_dt)` annotation store; `get_status()`, `healthy`, `last_export_at` health probes. Sampling via `SPANFORGE_OBSERVE_SAMPLER` (`always_on` / `always_off` / `parent_based` / `trace_id_ratio`). 139 tests, 97% coverage, mypy strict + bandit clean. (Phase 6, v2.0.5+)	Platform / MLOps / observability teams
`spanforge.sdk.alert`	`SFAlertClient` — `publish(topic, payload, *, severity, project_id) → PublishResult` routes to all configured sinks with deduplication, rate-limiting, alert grouping, and maintenance-window suppression; `acknowledge(alert_id)` cancels CRITICAL escalation; `register_topic()` custom topic registry; `set_maintenance_window()` / `remove_maintenance_windows()`; `get_alert_history()` with filtering; `get_status()` / `healthy` health probes. Built-in sinks: `WebhookAlerter` (HMAC), `OpsGenieAlerter`, `VictorOpsAlerter`, `IncidentIOAlerter`, `SMSAlerter` (Twilio), `TeamsAdaptiveCardAlerter`. Auto-discovery from `SPANFORGE_ALERT_*` env vars. Per-sink circuit breakers. 95 tests, mypy strict + bandit clean. (Phase 7, v2.0.6+)	Platform / SRE / on-call teams
`spanforge.sdk.gate`	`SFGateClient` — `evaluate(gate_id, payload) → GateEvaluationResult`, `evaluate_prri(prri_score) → PRRIResult`, `run_pipeline(gate_config_path) → GateRunResult`, `get_artifact(gate_id)`, `list_artifacts()`, `purge_artifacts(older_than_days)`, `get_status() → GateStatusInfo`, `configure(config)`. Six built-in gate executors: `schema_validation`, `dependency_security`, `secrets_scan`, `performance_regression`, `halluccheck_prri`, `halluccheck_trust`. PRRI three-tier verdict (`GREEN`/`AMBER`/`RED`), `GateArtifact` store with configurable retention, composite trust gate (HRI rate + PII window + secrets window), five exception types. 174 tests, mypy strict + bandit clean. (Phase 8, v2.0.7+)	DevOps / CI / platform teams
`spanforge.sdk.config`	`load_config_file(path?)` — auto-discovers `.halluccheck.toml` or falls back to env-var defaults. `validate_config(block)` / `validate_config_strict(block)` schema validation. `SFConfigBlock`, `SFServiceToggles`, `SFLocalFallbackConfig`, `SFPIIConfig`, `SFSecretsConfig` typed dataclasses. Env-var overrides: `SPANFORGE_ENDPOINT`, `SPANFORGE_API_KEY`, `SPANFORGE_PROJECT_ID`, `SPANFORGE_PII_THRESHOLD`, `SPANFORGE_SECRETS_AUTO_BLOCK`, `SPANFORGE_LOCAL_TOKEN`, `SPANFORGE_FALLBACK_TIMEOUT_MS`. (Phase 9, v2.0.8+)	All teams / platform engineers
`spanforge.sdk.registry`	`ServiceRegistry.get_instance()` — thread-safe singleton holding all 11 service clients. `run_startup_check()` pings all enabled services (status: up / degraded / down). `status_response()` returns per-service `{status, latency_ms, last_checked_at}`. `start_background_checker()` launches a daemon thread re-checking every 60 s. `ServiceHealth`, `ServiceStatus` typed enums. (Phase 9, v2.0.8+)	Platform / SRE teams
`spanforge.sdk.fallback`	8 local-mode fallback implementations: `pii_fallback()` (regex scan), `secrets_fallback()` (regex scan), `audit_fallback()` (HMAC-chained JSONL), `observe_fallback()` (OTLP JSON to stdout), `alert_fallback()` (log to stderr), `identity_fallback()` (trust local token), `gate_fallback()` (local gate engine), `cec_fallback()` (local JSONL). All emit WARNING when active. (Phase 9, v2.0.8+)	All teams (automatic)
`spanforge.sdk.trust`	`SFTrustClient` — `get_scorecard(project_id, *, from_dt, to_dt, weights) → TrustScorecardResponse` aggregates five T.R.U.S.T. dimensions (Transparency · Reliability · UserTrust · Security · Traceability) with configurable weights. `get_badge(project_id) → TrustBadgeResult` generates an SVG badge with colour-band (green ≥ 80, amber ≥ 60, red < 60). `get_history(project_id, *, buckets) → list[TrustHistoryEntry]` returns time-series snapshots. `get_status()` health probe. Reads from sf-audit trust records. 28 tests, mypy strict + bandit clean. (Phase 10, v2.0.9+)	Compliance / platform / ML teams
`spanforge.sdk.pipelines`	5 HallucCheck ↔ SpanForge pipeline integrations: `score_pipeline(text)` (PII → secrets → observe → audit), `bias_pipeline(report)` (PII → audit → alert → anonymise), `monitor_pipeline(event)` (observe → alert → OTel export), `risk_pipeline(prri_score)` (PRRI → alert → gate → CEC), `benchmark_pipeline(results)` (audit → alert → anonymise). Each returns `PipelineResult` with audit trail. (Phase 10, v2.0.9+)	ML / eval / platform teams
`spanforge.sdk.cec`	`SFCECClient` — `build_bundle(project_id, date_range, frameworks)` assembles a signed ZIP with `manifest.json`, `clause_map.json`, `chain_proof.json`, `attestation.json`, `rfc3161_timestamp.tsr`, and 6 NDJSON evidence directories. HMAC-SHA256 manifest signing, BYOS detection. `verify_bundle(zip_path)` re-verifies HMAC + chain + timestamp. `generate_dpa(project_id, controller_details, processor_details)` produces a GDPR Article 28 Data Processing Agreement. `get_status()` returns bundle count, BYOS provider, and last bundle timestamp. Supports all 5 frameworks: `eu_ai_act`, `iso_42001`, `nist_ai_rmf`, `iso27001`, `soc2`. 148 tests, 87% coverage, mypy strict + bandit clean. (Phase 5, v2.0.4+)	Compliance / legal / audit teams
`spanforge.sdk`	Pre-built `sf_identity`, `sf_pii`, `sf_secrets`, `sf_audit`, `sf_cec`, `sf_observe`, `sf_alert`, `sf_gate`, `sf_trust`, `sf_enterprise`, `sf_security`, `sf_rag`, and `sf_feedback` singletons loaded from env vars on first import. `SFClientConfig`, `SecretStr`, full exception hierarchy (`SFAuthError`, `SFBruteForceLockedError`, `SFPIINotRedactedError`, `SFPIIBlockedError`, `SFPIIDPDPConsentMissingError`, `SFSecretsBlockedError`, `SFAuditSchemaError`, `SFAuditAppendError`, `SFAuditQueryError`, `SFCECError`, `SFCECBuildError`, `SFCECVerifyError`, `SFCECExportError`, `SFObserveError`, `SFObserveExportError`, `SFObserveEmitError`, `SFObserveAnnotationError`, `SFAlertError`, `SFAlertPublishError`, `SFAlertRateLimitedError`, `SFAlertQueueFullError`, `SFGateError`, `SFGateEvaluationError`, `SFGatePipelineError`, `SFGateTrustFailedError`, `SFGateSchemaError`, `SFConfigError`, `SFConfigValidationError`, `SFStartupError`, `SFServiceUnavailableError`, `SFTrustComputeError`, `SFPipelineError`, `SFEnterpriseError`, `SFIsolationError`, `SFDataResidencyError`, `SFEncryptionError`, `SFFIPSError`, `SFAirGapError`, `SFSecurityScanError`, `SFSecretsInLogsError`, …), and all value-object types exported from the top-level package. `load_config_file()`, `validate_config()`, `validate_config_strict()`, `ServiceRegistry`, and 8 fallback functions re-exported for convenience.	All teams
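
The `spanforge.sdk.trust` row above describes a weighted aggregate of five dimensions and a badge colour band (green ≥ 80, amber ≥ 60, red < 60). A minimal sketch of that arithmetic follows. The weighted-mean formula, the equal default weights, and the 0 to 100 score scale are assumptions; only the dimension names and colour thresholds come from the table.

```python
DIMENSIONS = ("transparency", "reliability", "user_trust", "security", "traceability")


def overall_score(scores, weights=None):
    """Weighted mean of the five dimension scores (assumed to be on a 0-100 scale)."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


def colour_band(score):
    """Map a score to the badge colour thresholds quoted in the table."""
    if score >= 80:
        return "green"
    if score >= 60:
        return "amber"
    return "red"


# Hypothetical per-dimension scores for one project.
scores = {
    "transparency": 90,
    "reliability": 70,
    "user_trust": 80,
    "security": 60,
    "traceability": 100,
}
overall = overall_score(scores)
print(overall, colour_band(overall))

# Weighting security more heavily pulls the aggregate towards its lower score.
security_heavy = {d: 1.0 for d in DIMENSIONS}
security_heavy["security"] = 5.0
print(colour_band(overall_score(scores, security_heavy)))
```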

Quality

5 863 tests passing (14 skipped) — unit, integration, property-based (Hypothesis), performance benchmarks
≥ 91% line and branch coverage — 90% minimum enforced in CI
Zero required dependencies — entire core runs on Python stdlib
Typed — full py.typed marker; mypy + pyright clean
Frozen v2 trace schema — llm.trace.* payload fields never break between minor releases
Async-safe — contextvars-based context propagation across asyncio, threads, and executors
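
The async-safety point above rests on `contextvars`, where each asyncio task works on its own copy of the context. The sketch below demonstrates that isolation with a hypothetical `current_span_id` variable. It uses only the standard library and does not call spanforge.

```python
import asyncio
import contextvars

# Hypothetical context variable standing in for "the active span".
current_span_id = contextvars.ContextVar("current_span_id", default=None)


async def handle_request(span_id, delay):
    """Set the span for this task, yield to other tasks, then read it back."""
    current_span_id.set(span_id)
    await asyncio.sleep(delay)  # other tasks run while this one sleeps
    return span_id, current_span_id.get()


async def main():
    # gather() wraps each coroutine in its own task with a copied context.
    return await asyncio.gather(
        handle_request("span-a", 0.02),
        handle_request("span-b", 0.01),
    )


results = asyncio.run(main())
print(results)
```

Each task reads back the value it set, even though the two tasks interleave, and the value in the calling context is left unchanged.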

Development

git clone https://github.com/veerarag1973/spanforge.git
cd spanforge
python -m venv .venv && .venv\Scripts\activate   # Windows; on macOS/Linux use: source .venv/bin/activate
pip install -e ".[dev]"
pytest                      # 5 351 tests

Code quality

ruff check . && ruff format .
mypy spanforge
pytest --cov                # >=90% required

Build docs

pip install -e ".[docs]"
cd docs && sphinx-build -b html . _build/html

Versioning

spanforge implements RFC-0001 (AI Compliance Standard for Agentic AI Systems). Current schema version: 2.0.

This project follows Semantic Versioning. The llm.trace.* namespace is additionally frozen at v2 — even major releases won't remove fields from SpanPayload, AgentRunPayload, or AgentStepPayload.

See docs/changelog.md for the full version history.

Contributing

Contributions welcome — see the Contributing Guide. All new code must maintain ≥ 90% coverage. Run ruff and mypy before submitting.

Community

Discussions — questions, ideas, show-and-tell
Issues — bug reports and feature requests
SECURITY.md — responsible disclosure process
Code of Conduct — Contributor Covenant v2.1

Topics: ai-compliance ai-governance eu-ai-act gdpr soc2 audit-trail pii-redaction hmac-signing llm-governance python

License

PolyForm Noncommercial License 1.0.0

✅ Free for personal use, research, education, open-source projects, and non-profit organisations.
❌ Commercial use (running as a paid service, internal business use, SaaS integration) requires a commercial license.

To obtain a commercial license: sriram@getspanforge.com | getspanforge.com/pricing

Enterprise features (SSO, air-gapped deployment, dedicated support, SLAs) are available in SpanForge Enterprise — a separate commercial product.

Built for teams that take AI governance seriously.
Docs — Runtime Governance — Quickstart — API Reference — Discussions — Report a bug

Release history

Version	Released
1.0.4	May 19, 2026
1.0.3	May 10, 2026
1.0.2	May 10, 2026
1.0.1 (this version)	May 2, 2026
1.0.0	May 1, 2026

Download files

Source Distribution

spanforge-1.0.1.tar.gz (1.5 MB)

Uploaded May 2, 2026 Source

Built Distribution

spanforge-1.0.1-py3-none-any.whl (747.3 kB)

Uploaded May 2, 2026 Python 3

File details

Details for the file spanforge-1.0.1.tar.gz.

File metadata

Download URL: spanforge-1.0.1.tar.gz
Upload date: May 2, 2026
Size: 1.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for spanforge-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`6e1a6bcf0c48d357602a9f117e5b32b0e71b2e3be5d5f832228017a455387c74`
MD5	`40dbdb2141a9339da40a21663d88b615`
BLAKE2b-256	`d136940b5d34b98e9286216b90b9754d822df54abcfab0d6df3065f0d96fa913`

File details

Details for the file spanforge-1.0.1-py3-none-any.whl.

File metadata

Download URL: spanforge-1.0.1-py3-none-any.whl
Upload date: May 2, 2026
Size: 747.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for spanforge-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cf505ce3877cfb5427cfeab794263aebacca3e5b806bd672fcfc46e1e3935dd9`
MD5	`00da540bc24b94d551454d77e79bd41a`
BLAKE2b-256	`346c30bb3156c8dfe27b2dc7dc67931c010d92c7a606b1d3927aba44bf2acf19`
