SpanForge — AI lifecycle and governance platform (RFC-0001 SPANFORGE)
spanforge
The AI Compliance Platform for Agentic Systems.
Ship AI applications that are auditable, regulator-ready, and privacy-safe — from day one.
Built on RFC-0001 — the SpanForge AI Compliance Standard for agentic AI systems.
The problem
You're building AI applications in a world where regulators are catching up fast. The EU AI Act is in force. GDPR applies to every LLM that touches personal data. SOC 2 auditors want evidence that your AI systems are governed. And your team is stitching together ad-hoc logs, hoping they'll hold up in an audit.
spanforge solves this. It is a compliance-first platform — not a monitoring add-on — that gives every AI action in your stack a cryptographically signed, privacy-safe, regulator-ready record.
If you're a solo developer or early-stage startup
You might think compliance is a later problem — something to worry about when you have a legal team. Here's why it isn't:
- You'll hit it sooner than you think. The first B2B customer, the first SaaS sign-up from an EU user, the first healthcare or fintech pilot — they'll ask "how do you govern your AI?" If you have no answer, you lose the deal.
- Retrofitting is expensive. Adding audit trails, PII scrubbing, and signed evidence chains to an existing system takes weeks. Adding them with spanforge from day one takes minutes.
- It's zero-cost to start. The entire SDK is free for noncommercial use, zero dependencies, and works in-memory with no infrastructure. You don't pay anything until you need hosted storage.
- It de-risks you personally. GDPR fines apply to individuals running services, not just corporations. PII redaction and tamper-proof logs are your protection too.
In short: spanforge is the logging import you should have added on day one — except it also signs your audit trail and maps it to the regulations that will eventually matter to you.
pip install spanforge # free for noncommercial use, zero deps
import spanforge
spanforge.configure() # that's it — you're now compliant-by-default
What spanforge does
Compliance & Regulatory Mapping
Privacy & Audit Infrastructure
- Secrets scanning: 20-pattern registry detects API keys, tokens, private keys; SARIF output; pre-commit hook
- PII redaction: detect and strip sensitive data before it leaves your app. Includes an optional Presidio NLP backend.
Governance & Controls
Developer Experience
How it compares
spanforge is the only open-standard, zero-dependency AI compliance platform. Other tools are monitoring platforms that bolt on compliance as an afterthought. spanforge is compliance infrastructure that happens to capture the telemetry needed to prove it.
| Capability | spanforge | LangSmith | Langfuse | OpenLLMetry | Arize Phoenix |
|---|---|---|---|---|---|
| Regulatory framework mapping (EU AI Act, GDPR, SOC 2…) | ✅ | ❌ | ❌ | ❌ | ❌ |
| HMAC-signed evidence packages & attestations | ✅ | ❌ | ❌ | ❌ | ❌ |
| Consent boundary monitoring | ✅ | ❌ | ❌ | ❌ | ❌ |
| Human-in-the-loop compliance events | ✅ | ❌ | ❌ | ❌ | ❌ |
| Model registry with risk-tier governance | ✅ | ❌ | ❌ | ❌ | ❌ |
| Explainability coverage metrics | ✅ | ❌ | ❌ | ❌ | ❌ |
| Built-in PII redaction | ✅ | ❌ | ❌ | ❌ | ❌ |
| Tamper-proof audit chain | ✅ | ❌ | ❌ | ❌ | ❌ |
| GDPR subject erasure (right-to-erasure) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Works fully offline / air-gapped | ✅ | ❌ | Self-host | Partial | Self-host |
| Open schema standard (RFC-driven) | ✅ | ❌ | ❌ | Partial | ❌ |
| Zero required dependencies | ✅ | ❌ | ❌ | ❌ | ❌ |
| OTLP export (any OTel backend) | ✅ | ❌ | ✅ | ✅ | ✅ |
| Source-available, no call-home | ✅ | Partial | ✅ | ✅ | ✅ |
| CI/CD release quality gates (schema, secrets, PRRI, trust gate) | ✅ | ❌ | ❌ | ❌ | ❌ |
Bottom line: Others help you watch your AI. spanforge helps you govern it.
Install
pip install spanforge
Requires Python 3.9+. Zero mandatory dependencies.
CI also checks the documented CLI entry points, version consistency, and known stale-documentation patterns, so drift in the public docs fails fast during PR validation.
Optional extras
pip install "spanforge[openai]" # OpenAI auto-instrumentation
pip install "spanforge[langchain]" # LangChain callback handler
pip install "spanforge[crewai]" # CrewAI callback handler
pip install "spanforge[http]" # Webhook + OTLP export
pip install "spanforge[datadog]" # Datadog APM + metrics
pip install "spanforge[kafka]" # Kafka EventStream source
pip install "spanforge[pydantic]" # Pydantic v2 model layer
pip install "spanforge[otel]" # OpenTelemetry SDK integration
pip install "spanforge[jsonschema]" # Strict JSON Schema validation
pip install "spanforge[llamaindex]" # LlamaIndex event handler
pip install "spanforge[gemini]" # Google Gemini auto-instrumentation
pip install "spanforge[bedrock]" # AWS Bedrock Converse API
pip install "spanforge[presidio]" # Presidio-powered PII detection
pip install "spanforge[all]" # everything above
Runtime Governance GA Surface
The GA implementation spine is the runtime-governance control plane:
- sf_explain for signed runtime explanations, now with ExplainModelType classification (LLM, RAG, MULTI_AGENT, CLASSIFIER, EMBEDDING), configurable retry, and fail-safe emit
- sf_scope for agent capability enforcement, now with circuit-breaker fail-secure mode and an ACTION_CATEGORIES dictionary
- sf_rbac for role enforcement on sensitive actions, now with STANDARD_ROLE_MATRIX (10 canonical actor types), YAML manifest loading, and JWT claim extraction
- sf_rag for grounding evidence and thresholds
- sf_lineage for provenance capture
- sf_policy for policy activation, replay, simulation, and review
- sf_operator for trace inspection and signed operator exports
- sf_enterprise for deployment posture and enterprise evidence packaging
Start here if you want the end-to-end story instead of the full product surface:
- docs/runtime-governance.md
- docs/runtime-governance-contracts.md
- docs/replay-simulation.md
- docs/evidence-export.md
- docs/enterprise-integrations.md
- docs/competitor-comparison.md
- docs/ga-release-notes.md
- docs/demos/runtime-governance-demo.md
- docs/demos/enterprise-evidence-demo.md
Quick start — compliance in 5 minutes
1. Configure and instrument
import spanforge
spanforge.configure(
    service_name="my-agent",
    signing_key="your-org-secret",  # HMAC audit chain, tamper-proof
    redaction_policy="gdpr",  # PII stripped before export
    exporter="jsonl",
    endpoint="audit.jsonl",
)
Every event your app emits is now signed, PII-redacted, and stored — with zero per-call boilerplate.
2. Trace AI decisions
with spanforge.start_trace("loan-approval-agent") as trace:
    with trace.llm_call("gpt-4o", temperature=0.2) as span:
        decision = call_llm(prompt)
        span.set_token_usage(input=512, output=200, total=712)
        span.set_status("ok")
3. Generate compliance evidence
from spanforge.core.compliance_mapping import ComplianceMappingEngine
engine = ComplianceMappingEngine()
package = engine.generate_evidence_package(
    model_id="gpt-4o",
    framework="eu_ai_act",
    from_date="2026-01-01",
    to_date="2026-03-31",
    audit_events=events,
)
print(package.attestation.coverage_pct) # e.g. 87.5%
print(package.attestation.explanation_coverage_pct) # e.g. 75.0%
print(package.attestation.model_risk_tier) # e.g. "high"
print(package.gap_report) # what's missing
Or from the CLI:
spanforge compliance generate \
--model-id gpt-4o \
--framework eu_ai_act \
--from 2026-01-01 --to 2026-03-31 \
--events-file audit.jsonl
4. Hand to your auditor
The evidence package contains:
- Clause mappings — which telemetry events satisfy which regulatory clauses
- Gap analysis — which clauses lack evidence and need attention
- HMAC-signed attestation — cryptographic proof the evidence hasn't been tampered with
- Model governance metadata — owner, risk tier, status, warnings for deprecated/retired models
- Explanation coverage — percentage of AI decisions with explainability records
5. Package for auditors with sf-cec (v2.0.4+)
Bundle your audit records into a regulator-ready, HMAC-signed ZIP:
from spanforge.sdk import sf_cec
# Build a compliance evidence bundle for Q1 2026
result = sf_cec.build_bundle(
    project_id="my-agent",
    date_range=("2026-01-01", "2026-03-31"),
    frameworks=["eu_ai_act", "iso_42001", "soc2"],
)
print(result.bundle_id) # sfcec_my-agent_20260401T000000Z_abc123
print(result.zip_path) # /tmp/sfcec/halluccheck_cec_my-agent_2026-01-01_2026-03-31.zip
print(result.hmac_manifest) # hmac-sha256:a3f9…
print(result.record_counts) # {"halluccheck.score.v1": 214, "halluccheck.bias.v1": 87, …}
# Verify bundle integrity before sharing
verify = sf_cec.verify_bundle(result.zip_path)
assert verify.overall_valid
# Generate a GDPR Art. 28 Data Processing Agreement
dpa = sf_cec.generate_dpa(
    project_id="my-agent",
    controller_details={"name": "Acme Corp", "contact": "dpo@acme.com"},
    processor_details={"name": "ML Platform Team"},
)
print(dpa.document_id) # sfcec-dpa-my-agent-20260401
The ZIP bundle contains:
- manifest.json: record inventory with HMAC-SHA256 signature
- clause_map.json: per-framework clause satisfaction (SATISFIED / PARTIAL / GAP)
- chain_proof.json: audit chain verification result
- attestation.json: HMAC-signed attestation metadata
- rfc3161_timestamp.tsr: trusted timestamp stub (RFC 3161)
- score_records/, bias_reports/, prri_records/, drift_events/, pii_detections/, gate_evaluations/: NDJSON evidence per schema key
6. Observe spans with sf-observe (v2.0.5+)
Export spans to any OTLP-compatible backend, emit structured annotations, and trace LLM calls with OTel GenAI semantic conventions:
from spanforge.sdk import sf_observe
# Emit a span for an LLM call — W3C traceparent + OTel GenAI attrs added automatically
span_id = sf_observe.emit_span(
    "chat.completion",
    {
        "gen_ai.system": "openai",
        "gen_ai.request.model": "gpt-4o",
        "gen_ai.usage.input_tokens": 512,
        "gen_ai.usage.output_tokens": 64,
    },
)
print(span_id) # "a3f1b2c4d5e6f708"
# Mark a model deployment
annotation_id = sf_observe.add_annotation(
    "model_deployed",
    {"model": "gpt-4o", "environment": "production"},
    project_id="my-agent",
)
# Health probe
print(sf_observe.healthy) # True
print(sf_observe.last_export_at) # ISO-8601 or None
# Export to any OTLP endpoint per-call
from spanforge.sdk import ReceiverConfig
result = sf_observe.export_spans(
    my_spans,
    receiver_config=ReceiverConfig(
        endpoint="https://otel.collector.example.com/v1/traces",
        headers={"Authorization": "Bearer tok"},
    ),
)
print(result.exported_count, result.backend)
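The example above mentions that a W3C traceparent is attached to each span. As a rough illustration of what that header looks like, here is a standard-library sketch that builds and parses one. The helper names `make_traceparent` and `parse_traceparent` are hypothetical and are not part of the spanforge API; only the header layout (version, 32-hex trace ID, 16-hex span ID, 2-hex flags) comes from the W3C Trace Context format.

```python
import re
import secrets

# version "00", 16-byte trace ID, 8-byte span ID, 1-byte flags, all lowercase hex
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(sampled: bool = True) -> str:
    """Build a version-00 traceparent header with random IDs (hypothetical helper)."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str) -> dict:
    """Split a traceparent header into its fields, rejecting malformed values."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    trace_id, span_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        raise ValueError("all-zero trace or span IDs are invalid")
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


header = make_traceparent()
fields = parse_traceparent(header)
print(header)
print(fields["sampled"])
```

The 16-character span ID matches the shape of the `span_id` value printed in the example above.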
Select backend and sampler via environment:
export SPANFORGE_OBSERVE_BACKEND=otlp # otlp | datadog | grafana | splunk | elastic | local
export SPANFORGE_OBSERVE_SAMPLER=trace_id_ratio
export SPANFORGE_OBSERVE_SAMPLE_RATE=0.25
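To make the `trace_id_ratio` setting concrete, the sketch below shows the general idea behind ratio sampling on trace IDs: the decision is a deterministic function of the trace ID, so every span in a trace gets the same answer. This follows the approach used by OpenTelemetry's ratio-based sampler; whether spanforge implements it the same way is an assumption, and `should_sample` is a hypothetical name.

```python
import secrets

TRACE_ID_LIMIT = 1 << 64  # the decision uses the low 64 bits of the trace ID


def should_sample(trace_id: int, rate: float) -> bool:
    """Keep a trace when its low 64 bits fall below rate * 2**64."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    bound = round(rate * TRACE_ID_LIMIT)
    return (trace_id & (TRACE_ID_LIMIT - 1)) < bound


# With a rate of 0.25, roughly a quarter of random trace IDs are kept.
trace_ids = [secrets.randbits(128) for _ in range(10_000)]
kept = sum(should_sample(t, 0.25) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

Because the decision depends only on the trace ID and the rate, two services configured with the same rate agree on which traces to keep without coordinating.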
7. Route alerts with sf-alert (v2.0.6+)
Publish topic-based alerts to Slack, PagerDuty, OpsGenie, Teams, SMS, and custom webhooks — with built-in deduplication, escalation policy, and maintenance-window suppression:
from spanforge.sdk import sf_alert
# Publish a CRITICAL drift alert
result = sf_alert.publish(
    "halluccheck.drift.red",
    {"model": "gpt-4o", "drift_score": 0.91},
    severity="critical",
    project_id="my-agent",
)
print(result.alert_id) # UUID4
print(result.suppressed) # True if deduplicated / maintenance window
# Acknowledge to cancel the 15-minute escalation timer
sf_alert.acknowledge(result.alert_id)
# Register a custom topic
sf_alert.register_topic(
    "myapp.pipeline.failed",
    "ML pipeline execution failure",
    "high",
    runbook_url="https://runbooks.example.com/pipeline",
)
Configure sinks via environment variables (zero code required):
export SPANFORGE_ALERT_TEAMS_WEBHOOK=https://xxx.webhook.office.com/...
export SPANFORGE_ALERT_OPSGENIE_KEY=og-key-...
export SPANFORGE_ALERT_DEDUP_SECONDS=300
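As a rough picture of what a 300-second deduplication window does, here is a minimal sketch. The `DedupWindow` class and the choice of `(topic, project_id)` as the dedup key are assumptions made for illustration; the actual sf-alert keying and storage may differ.

```python
import time
from typing import Dict, Optional, Tuple


class DedupWindow:
    """Suppress repeats of the same (topic, project) key inside a time window."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def should_suppress(
        self, topic: str, project_id: str, now: Optional[float] = None
    ) -> bool:
        now = time.monotonic() if now is None else now
        key = (topic, project_id)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            return True  # duplicate inside the window: do not deliver again
        self._last_sent[key] = now  # first alert, or the window has expired
        return False


dedup = DedupWindow(window_seconds=300)
first = dedup.should_suppress("halluccheck.drift.red", "my-agent", now=0.0)
repeat = dedup.should_suppress("halluccheck.drift.red", "my-agent", now=120.0)
later = dedup.should_suppress("halluccheck.drift.red", "my-agent", now=301.0)
print(first, repeat, later)
```

In this sketch a suppressed alert does not extend the window, so a persistent problem still produces one delivery per window rather than going silent.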
8. Enforce release gates with sf-gate (v2.0.7+)
Run YAML-declared quality gates before every release. Block on schema violations, secrets leaks, performance regressions, unsafe PRRI scores, and trust failures — all in a single pipeline command:
from spanforge.sdk import sf_gate
# Run a full YAML gate pipeline — blocks on any FAIL gate
result = sf_gate.run_pipeline("gates/ci-pipeline.yaml")
for g in result.gate_results:
    print(f"[{g.verdict.value}] {g.gate_id}")  # e.g. [PASS] schema-validation
# Evaluate a single gate programmatically
verdict = sf_gate.evaluate("schema-validation", event.to_dict())
print(verdict.verdict) # GateVerdict.PASS
# Standalone PRRI evaluation
prri = sf_gate.evaluate_prri(prri_score=28.5)
print(prri.verdict) # PRRIVerdict.GREEN
# Composite trust gate — checks HRI rate, PII, and secrets windows
trust = sf_gate.get_status()
print(trust.healthy) # True if all thresholds are within bounds
Or from CI directly:
# Runs the pipeline, exits 1 if any blocking gate fails
spanforge gate run gates/ci-pipeline.yaml
# Enforce the composite trust gate as a deployment prerequisite
spanforge gate trust-gate --project-id my-agent
A minimal ci-pipeline.yaml:
version: "1.0"
gates:
  - id: schema-validation
    type: schema_validation
    on_fail: block
  - id: secrets-scan
    type: secrets_scan
    on_fail: block
  - id: prri-check
    type: halluccheck_prri
    params:
      red_threshold: 65
    on_fail: block
  - id: trust-gate
    type: halluccheck_trust
    on_fail: block
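The blocking behaviour declared in the pipeline file above can be pictured with a small sketch. It assumes the YAML has already been parsed into dictionaries, that a gate blocks the release only when `on_fail` is `block`, and that a PRRI score passes when it is below `red_threshold`. `GateResult`, `run_gates`, `exit_code`, and the stand-in checks are hypothetical and do not reproduce the real gate executors.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GateResult:
    gate_id: str
    passed: bool
    blocking: bool


def run_gates(
    gates: List[dict], checks: Dict[str, Callable[[dict], bool]]
) -> List[GateResult]:
    """Run each gate's check and record whether a failure would block."""
    results = []
    for gate in gates:
        check = checks[gate["type"]]
        passed = check(gate.get("params", {}))
        results.append(GateResult(gate["id"], passed, gate.get("on_fail") == "block"))
    return results


def exit_code(results: List[GateResult]) -> int:
    """Return 1 if any blocking gate failed, otherwise 0."""
    return 1 if any(r.blocking and not r.passed for r in results) else 0


# Gate definitions mirroring part of the YAML, already parsed into dicts.
gates = [
    {"id": "schema-validation", "type": "schema_validation", "on_fail": "block"},
    {"id": "prri-check", "type": "halluccheck_prri",
     "params": {"red_threshold": 65}, "on_fail": "block"},
]

observed_prri = 75.0  # hypothetical score for this release candidate

# Stand-in checks; the real executors inspect events, files, and scores.
checks = {
    "schema_validation": lambda params: True,
    "halluccheck_prri": lambda params: observed_prri < params["red_threshold"],
}

results = run_gates(gates, checks)
for r in results:
    print(f"[{'PASS' if r.passed else 'FAIL'}] {r.gate_id}")
print("exit code:", exit_code(results))
```

A non-zero exit code is what lets a CI job stop the release when a blocking gate fails.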
9. Unified config & local fallback (v2.0.8+)
Bootstrap all 8 services from a single .halluccheck.toml config block.
When a remote service is unreachable, the SDK automatically falls back to
a local-mode equivalent — no code changes required:
# .halluccheck.toml
[spanforge]
enabled = true
project_id = "my-agent"
endpoint = "https://api.spanforge.example.com"
[spanforge.services]
sf_pii = true
sf_secrets = true
sf_audit = true
sf_observe = true
[spanforge.local_fallback]
enabled = true
max_retries = 3
timeout_ms = 2000
from spanforge.sdk import load_config_file, validate_config
# Parse, validate, and apply env-var overrides in one call
config = load_config_file() # auto-discovers .halluccheck.toml
errors = validate_config(config) # [] when valid
print(config.services.sf_pii) # True
print(config.local_fallback.timeout_ms) # 2000
Validate from the CLI:
spanforge config validate # auto-discover
spanforge config validate --file .halluccheck.toml # explicit path
When a service is down, fallback activates automatically:
from spanforge.sdk import pii_fallback, secrets_fallback, audit_fallback
# Local regex PII scan (no remote service required)
result = pii_fallback("Contact alice@example.com")
print(result["entities"]) # [{"type": "EMAIL", ...}]
# Local secrets scan
result = secrets_fallback("AKIA1234567890ABCDEF")
print(result["clean"]) # False
# Local HMAC-chained JSONL audit
audit_fallback(
    {"score": 0.92, "model": "gpt-4o"},
    schema_key="halluccheck.score.v1",
)
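The retry-then-fall-back behaviour configured by `max_retries` can be sketched as follows. This is only an illustration of the pattern: `call_with_fallback`, `local_pii_scan`, and the email regex are hypothetical, and the SDK's own fallback logic (including its timeout handling) is not shown here.

```python
import re
from typing import Callable, Dict

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def local_pii_scan(text: str) -> Dict:
    """Hypothetical local fallback: regex-only email detection."""
    entities = [{"type": "EMAIL", "value": m.group()} for m in EMAIL_RE.finditer(text)]
    return {"entities": entities, "clean": not entities}


def call_with_fallback(
    remote: Callable[[], Dict], local: Callable[[], Dict], max_retries: int = 3
) -> Dict:
    """Try the remote call up to max_retries times, then use the local version."""
    for _ in range(max_retries):
        try:
            return remote()
        except (ConnectionError, TimeoutError):
            continue  # transient failure: try again
    return local()


attempts = {"count": 0}


def unreachable_remote_scan() -> Dict:
    """Simulates a remote PII service that is down."""
    attempts["count"] += 1
    raise ConnectionError("remote PII service unreachable")


text = "Contact alice@example.com"
result = call_with_fallback(unreachable_remote_scan, lambda: local_pii_scan(text))
print(attempts["count"], result["clean"], result["entities"])
```

The caller receives the same result shape either way, which is what makes the fallback transparent to application code.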
The ServiceRegistry tracks health for all services and re-checks every 60 s:
from spanforge.sdk import ServiceRegistry
reg = ServiceRegistry.get_instance()
status = reg.status_response()
# {"sf_pii": {"status": "up", "latency_ms": 45, "last_checked_at": "..."}, ...}
10. T.R.U.S.T. Scorecard & HallucCheck pipelines (v2.0.9+)
The T.R.U.S.T. scorecard aggregates five trust dimensions into a single weighted score with colour-band verdicts. Each pillar maps to existing audit telemetry:
| Pillar | What it measures | Source |
|---|---|---|
| Transparency | Gate pass rate | sf_gate evaluations |
| Reliability | Hallucination rate | halluccheck.score.v1 records |
| UserTrust | Bias disparity | halluccheck.bias.v1 records |
| Security | PII + secrets hygiene | sf_pii / sf_secrets scans |
| Traceability | Compliance posture | Attestation coverage |
Colour bands: green ≥ 80, amber ≥ 60, red < 60.
from spanforge.sdk import sf_trust
# Full scorecard with all five dimensions
scorecard = sf_trust.get_scorecard(project_id="my-agent")
print(scorecard.overall_score) # 82.5
print(scorecard.colour_band) # "green"
print(scorecard.reliability) # TrustDimension(score=90.0, trend="up", ...)
# SVG badge for dashboards / README shields
badge = sf_trust.get_badge(project_id="my-agent")
with open("trust-badge.svg", "w") as f:
    f.write(badge.svg)
# Historical time-series (10 buckets)
history = sf_trust.get_history(project_id="my-agent", buckets=10)
for entry in history:
    print(entry.timestamp, entry.overall)
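The weighted score and colour bands described above can be illustrated with a short sketch. The band thresholds (green at 80 and above, amber at 60 and above, red otherwise) come from the text; the equal default weights, the pillar scores, and the function names are assumptions, since the actual weighting is not stated here.

```python
from typing import Dict, Optional

PILLARS = ("transparency", "reliability", "user_trust", "security", "traceability")


def overall_score(
    scores: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted mean of the five pillar scores (equal weights assumed by default)."""
    weights = weights or {pillar: 1.0 for pillar in PILLARS}
    total_weight = sum(weights[p] for p in PILLARS)
    return sum(scores[p] * weights[p] for p in PILLARS) / total_weight


def colour_band(score: float) -> str:
    """Map a 0-100 score to the documented colour bands."""
    if score >= 80:
        return "green"
    if score >= 60:
        return "amber"
    return "red"


# Hypothetical pillar scores for one project.
scores = {
    "transparency": 85.0,
    "reliability": 90.0,
    "user_trust": 75.0,
    "security": 80.0,
    "traceability": 82.5,
}

score = overall_score(scores)
print(score, colour_band(score))
```

Passing a custom `weights` dictionary shows how emphasising one pillar, such as security, shifts the overall band.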
Five HallucCheck pipeline integrations orchestrate cross-service workflows:
from spanforge.sdk.pipelines import (
    score_pipeline,
    bias_pipeline,
    monitor_pipeline,
    risk_pipeline,
    benchmark_pipeline,
)
# Score pipeline: PII scan → secrets scan → observe span → audit append
result = score_pipeline("The model output to check", model="gpt-4o")
print(result.audit_id, result.details)
# Risk pipeline: PRRI evaluation → alert if RED → gate block → CEC bundle
result = risk_pipeline(prri_score=75.0, project_id="my-agent")
print(result.details["verdict"]) # "RED"
From the CLI:
# T.R.U.S.T. scorecard (text table)
spanforge trust scorecard --project-id my-agent
# SVG badge to stdout
spanforge trust badge --project-id my-agent > trust.svg
# Composite trust gate (exit 1 = trust below threshold)
spanforge trust gate --project-id my-agent
11. Test with zero-network mocks (v2.0.11+)
Drop-in mock service clients for every SpanForge SDK service — no network, no configuration, no side-effects:
from spanforge.testing_mocks import mock_all_services
with mock_all_services():
    from spanforge.sdk import sf_pii, sf_audit, sf_gate
    # All calls are local, recorded, and return sensible defaults
    result = sf_pii.scan_text("Contact alice@example.com")
    assert result.clean  # mock returns clean=True by default
    sf_audit.append({"score": 0.92}, schema_key="halluccheck.score.v1")
    assert len(sf_audit.calls) == 1  # inspect recorded calls
    prri = sf_gate.evaluate_prri(prri_score=28.5)
    assert prri.allow  # GREEN by default
Override default returns per-method:
from spanforge.testing_mocks import MockSFPII
mock = MockSFPII()
mock.configure_response("scan_text", {"clean": False, "entities": ["EMAIL"]})
result = mock.scan_text("test")
assert not result["clean"]
Run spanforge doctor for a full environment diagnostic:
spanforge doctor
# ✅ Config valid
# ✅ All 11 services reachable
# ✅ API key not expired
# ✅ PII/secrets patterns loaded
# ✅ Gate YAML valid
Regulatory framework coverage
The ComplianceMappingEngine maps your telemetry events to specific regulatory clauses:
| Framework | Clause | Mapped events | What it proves |
|---|---|---|---|
| GDPR | Art. 22 | consent.*, hitl.* | Automated decisions have consent + human oversight |
| GDPR | Art. 25 | llm.redact.*, consent.* | Privacy by design: PII handled before export |
| EU AI Act | Art. 13 | explanation.* | AI decisions are transparent and explainable |
| EU AI Act | Art. 14 | hitl.*, consent.* | Human oversight of high-risk AI |
| EU AI Act | Annex IV.5 | llm.guard.*, llm.audit.*, hitl.* | Technical documentation: safety + oversight |
| SOC 2 | CC6.1 | llm.audit.*, llm.trace.*, model_registry.* | Logical access controls + model governance |
| NIST AI RMF | MAP 1.1 | llm.trace.*, llm.eval.*, model_registry.*, explanation.* | Risk identification and mapping |
| HIPAA | §164.312 | llm.redact.*, llm.audit.* | PHI access controls and audit |
| ISO 42001 | A.5–A.10 | Full event set | AI management system controls |
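To show how a mapping like the one in this table can turn into a coverage figure and a gap list, here is a minimal sketch using glob-style patterns. It assumes a clause counts as covered when at least one observed event type matches any of its patterns; the real ComplianceMappingEngine may apply stricter rules. `CLAUSE_MAP` copies four rows from the table, and `clause_report` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Tuple

Clause = Tuple[str, str]

# A subset of the table above: (framework, clause) -> event-type patterns.
CLAUSE_MAP: Dict[Clause, List[str]] = {
    ("GDPR", "Art. 22"): ["consent.*", "hitl.*"],
    ("GDPR", "Art. 25"): ["llm.redact.*", "consent.*"],
    ("EU AI Act", "Art. 13"): ["explanation.*"],
    ("EU AI Act", "Art. 14"): ["hitl.*", "consent.*"],
}


def clause_report(event_types: List[str]) -> Tuple[float, List[Clause]]:
    """Return (coverage percentage, clauses with no matching evidence)."""
    satisfied: List[Clause] = []
    gaps: List[Clause] = []
    for clause, patterns in CLAUSE_MAP.items():
        hit = any(fnmatchcase(et, p) for et in event_types for p in patterns)
        (satisfied if hit else gaps).append(clause)
    coverage_pct = 100.0 * len(satisfied) / len(CLAUSE_MAP)
    return coverage_pct, gaps


observed = ["consent.granted", "hitl.reviewed", "llm.trace.span.completed"]
coverage, gaps = clause_report(observed)
print(coverage)
print(gaps)
```

In this example no `explanation.*` event was observed, so EU AI Act Art. 13 is reported as a gap.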
Compliance event types
spanforge defines purpose-built event types for AI governance — these aren't afterthought log messages, they are first-class compliance primitives:
| Category | Event types | Purpose |
|---|---|---|
| Consent | consent.granted, consent.revoked, consent.violation | Track user consent for automated processing |
| Human-in-the-Loop | hitl.queued, hitl.reviewed, hitl.escalated, hitl.timeout | Prove human oversight of AI decisions |
| Model Registry | model_registry.registered, model_registry.deprecated, model_registry.retired | Govern model lifecycle and risk |
| Explainability | explanation.generated | Attach explanations to AI decisions |
| Guardrails | llm.guard.* | Safety classifier outputs and block decisions |
| PII | llm.redact.* | Audit trail of what PII was found and removed |
| Audit | llm.audit.* | Access logs and chain-of-custody records |
| Traces | llm.trace.* | Model calls, tokens, latency, cost |
Core capabilities
Tamper-proof audit chains
Every event is HMAC-SHA256 signed and chained to its predecessor, the same principle as certificate chains. Alter one event and verification of the chain fails.
from spanforge.signing import AuditStream, verify_chain
stream = AuditStream(org_secret="your-secret")
for event in events:
    stream.append(event)
result = verify_chain(stream.events, org_secret="your-secret")
assert result.valid # any tampering → False
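For readers who want to see why chaining makes tampering detectable, the following standard-library sketch signs each record over its payload plus the previous signature. It is not the spanforge implementation: the record layout, the JSON canonicalisation, and the function names are assumptions chosen for illustration.

```python
import copy
import hashlib
import hmac
import json
from typing import Dict, List


def sign_event(payload: Dict, prev_signature: str, secret: bytes) -> str:
    """HMAC-SHA256 over the previous signature plus the canonical JSON payload."""
    message = prev_signature.encode() + json.dumps(payload, sort_keys=True).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def build_chain(payloads: List[Dict], secret: bytes) -> List[Dict]:
    """Sign each payload, linking it to the signature of the one before."""
    chain, prev = [], ""
    for payload in payloads:
        signature = sign_event(payload, prev, secret)
        chain.append({"payload": payload, "prev": prev, "signature": signature})
        prev = signature
    return chain


def verify(chain: List[Dict], secret: bytes) -> bool:
    """Recompute every signature in order; any edit, removal, or reorder fails."""
    prev = ""
    for record in chain:
        expected = sign_event(record["payload"], prev, secret)
        if record["prev"] != prev or not hmac.compare_digest(expected, record["signature"]):
            return False
        prev = record["signature"]
    return True


secret = b"your-secret"
chain = build_chain(
    [{"id": 1, "decision": "approve"}, {"id": 2, "decision": "deny"}, {"id": 3, "decision": "approve"}],
    secret,
)
tampered = copy.deepcopy(chain)
tampered[1]["payload"]["decision"] = "approve"  # edit one record after signing

print(verify(chain, secret), verify(tampered, secret))
```

Removing a record from the middle fails verification as well, because the next record's `prev` no longer matches.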
PII redaction
Strip personal data before events leave your application boundary. Deep scanning applies Luhn validation to credit card numbers and Verhoeff validation to Aadhaar numbers, SSN range validation (_is_valid_ssn), and calendar validation for dates of birth (_is_valid_date), and ships built-in patterns for date_of_birth and street address.
from spanforge.redact import RedactionPolicy, Sensitivity
policy = RedactionPolicy(min_sensitivity=Sensitivity.PII, redacted_by="policy:gdpr-v1")
result = policy.apply(event)
# All PII fields → "[REDACTED by policy:gdpr-v1]"
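The Luhn validation mentioned above is what lets a scanner tell a plausible card number from an arbitrary run of digits. Below is a minimal standalone sketch of the checksum; `luhn_valid` is a hypothetical helper and not the function spanforge uses internally.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdecimal()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # a widely used test card number
print(luhn_valid("4111 1111 1111 1112"))  # last digit changed
```

Running the checksum before redacting reduces false positives on order numbers, timestamps, and other long digit strings.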
Model registry governance
Register models with ownership and risk metadata. Attestations automatically warn when models are deprecated, retired, or unregistered.
from spanforge.model_registry import ModelRegistry
registry = ModelRegistry()
registry.register("gpt-4o", owner="ml-platform", risk_tier="high")
registry.deprecate("gpt-3.5-turbo", reason="Successor available")
# Evidence packages now include:
# model_owner: "ml-platform"
# model_risk_tier: "high"
# model_status: "active"
# model_warnings: [] (or ["model 'gpt-3.5-turbo' is deprecated"])
Explainability tracking
Measure what percentage of your AI decisions have explanations attached:
from spanforge.explain import generate_explanation
explanation = generate_explanation(
    decision_event_id="evt_01HX...",
    method="feature_importance",
    content="Top factors: credit_score (0.42), income (0.31)...",
)
# explanation_coverage_pct in attestations = explained / total decisions
GDPR subject erasure
Right-to-erasure with tombstone events that preserve audit chain integrity:
spanforge audit erase audit.jsonl --subject-id user123
Auto-instrumentation
Patch supported providers once — compliance data flows automatically:
# Instrument all installed providers in one call
import spanforge.auto
spanforge.auto.setup()
# Or patch individually
from spanforge.integrations import openai as sf_openai
sf_openai.patch() # every OpenAI call → signed, redacted, compliant
sf_openai.unpatch() # restore original behaviour
Supported providers: OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, Groq, Together AI
Supported frameworks: LangChain, LlamaIndex, CrewAI
Using SpanForge alongside OpenTelemetry
spanforge is not an OTel replacement. OTel handles performance monitoring. spanforge adds the compliance layer OTel cannot provide — audit chains, PII redaction, consent tracking, and regulator-ready attestations.
# Your existing OTel pipeline stays untouched
from opentelemetry.sdk.trace import TracerProvider
provider = TracerProvider()
# Add spanforge's compliance layer alongside it
import spanforge
spanforge.configure(mode="otel_passthrough")
# Dual-stream: OTel for monitoring, spanforge for compliance
spanforge.configure(exporters=["otel_passthrough", "jsonl"], endpoint="audit.jsonl")
Export
Ship compliance events to any backend:
from spanforge.stream import EventStream
from spanforge.export.jsonl import JSONLExporter
from spanforge.export.otlp import OTLPExporter
from spanforge.export.datadog import DatadogExporter
from spanforge.export.grafana import GrafanaLokiExporter
from spanforge.export.cloud import CloudExporter
from spanforge.export.siem_splunk import SplunkHECExporter
from spanforge.export.siem_syslog import SyslogExporter
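# drain() is awaited, so run the lines below inside an async function (for example via asyncio.run)
stream = EventStream(events)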
await stream.drain(JSONLExporter("audit.jsonl")) # local file
await stream.drain(OTLPExporter("http://collector:4318/v1/traces")) # OTel collector
await stream.drain(DatadogExporter(service="my-app")) # Datadog APM
await stream.drain(GrafanaLokiExporter(url="http://loki:3100")) # Grafana Loki
await stream.drain(CloudExporter(api_key="sf_live_xxx")) # spanforge Cloud
await stream.drain(SplunkHECExporter()) # Splunk HEC (env-var config)
await stream.drain(SyslogExporter()) # Syslog/CEF (env-var config)
Fan-out routing for compliance alerting:
from spanforge.export.webhook import WebhookExporter
# Route guardrail violations to Slack
await stream.route(
    WebhookExporter("https://hooks.slack.com/your-webhook"),
    predicate=lambda e: e.event_type == "llm.guard.output.blocked",
)
CLI
38 commands — all CI-pipeline ready:
# Compliance
spanforge compliance generate --model-id gpt-4o --framework eu_ai_act \
--from 2026-01-01 --to 2026-03-31 --events-file events.jsonl
spanforge compliance check --framework eu_ai_act \
--from 2026-01-01 --to 2026-03-31 --events-file events.jsonl
spanforge compliance validate-attestation evidence.json
spanforge compliance status --events-file events.jsonl # compliance summary JSON
# Audit chain
spanforge audit-chain events.jsonl # verify chain integrity
spanforge audit erase events.jsonl --subject-id user123 # GDPR erasure
spanforge audit rotate-key events.jsonl # key rotation
spanforge audit verify --input events.jsonl # verify integrity
spanforge audit extract events.jsonl --type llm.trace.span.completed --since 2026-01-01 # filter & extract
spanforge audit cec generate --project-id my-agent --sign # CEC compliance bundle ZIP
spanforge audit gap-finder events.jsonl --threshold-minutes 30 # detect time gaps + missing fields
# Privacy & Secrets
spanforge scan events.jsonl --fail-on-match # CI-gate PII scan
spanforge secrets scan <file> # scan file for secrets (exit 0=clean, 1=found)
spanforge secrets scan <file> --format sarif # SARIF output for GitHub Code Scanning
spanforge secrets scan <file> --redact # print redacted version to stdout
# Event generation
spanforge event create --type llm.trace.span.completed --count 10 --format jsonl # generate test events
# Validation
spanforge check # 9-step end-to-end health check (--verbose for timing)
spanforge check-compat events.json # v2.0 compatibility
spanforge validate events.jsonl # JSON Schema validation
spanforge validate events.jsonl --report detailed --format json # detailed report
spanforge validate --dataset training.jsonl # scan JSONL training data for PII
spanforge validate --dataset training.jsonl --fail-on-violations # exit 1 if PII/schema issues found
spanforge validate --dataset training.jsonl --required-fields prompt,response --format json # required fields + JSON output
# Configuration
spanforge config validate # validate .halluccheck.toml (auto-discover)
spanforge config validate --file path/to.toml # validate specific config file
# Analysis
spanforge stats events.jsonl # counts, tokens, cost
spanforge stats events.jsonl --group-by model --format json # grouped stats, JSON output
spanforge inspect <EVENT_ID> events.jsonl # pretty-print one event
spanforge inspect <EVENT_ID> events.jsonl --format csv # CSV export
spanforge cost events.jsonl # token spend report
spanforge cost run --run-id <id> --input events.jsonl # per-run cost report
# Evaluation
spanforge eval save --input events.jsonl --output dataset.jsonl # extract eval dataset
spanforge eval run --file dataset.jsonl --scorers faithfulness,pii_leakage # run scorers
# Migration
spanforge migrate events.jsonl --sign # v1→v2 migration
spanforge migrate-langsmith export.jsonl # LangSmith → SpanForge conversion
spanforge list-deprecated # deprecated event types
spanforge migration-roadmap # v2 migration plan
spanforge check-consumers # consumer compatibility
# CI/CD Gate Pipeline
spanforge gate run gates/ci-pipeline.yaml # run YAML gate pipeline (exit 1 = blocking gate failed)
spanforge gate run gates/ci-pipeline.yaml --format json # JSON output for CI dashboards
spanforge gate evaluate schema-validation --payload event.json # evaluate single gate
spanforge gate trust-gate --project-id my-agent # composite trust gate check
spanforge gate audit events.jsonl --fail-on-violation # policy audit of gate records (CI gate)
# T.R.U.S.T. Scorecard
spanforge trust scorecard --project-id my-agent # five-pillar trust scorecard (text table)
spanforge trust badge --project-id my-agent # SVG badge to stdout
spanforge trust gate --project-id my-agent # composite trust gate (exit 1 = below threshold)
# Enterprise (Phase 11)
spanforge enterprise status # enterprise subsystem status JSON
spanforge enterprise health # enterprise health check (all services)
# Security (Phase 11)
spanforge security owasp # OWASP API Security Top 10 audit
spanforge security scan # full security scan (deps + static + secrets-in-logs)
spanforge security threat-model # STRIDE threat model summary
spanforge security audit-logs --path /var/log/myapp/ # secrets-in-logs detection
# Developer Experience (Phase 12)
spanforge doctor # environment diagnostics (config, services, keys, patterns)
# Viewer
spanforge serve # local SPA trace viewer
spanforge ui # standalone HTML viewer
Event namespaces
Every event carries a typed payload. The built-in namespaces:
| Prefix | Dataclass | What it records |
|---|---|---|
| consent.* | ConsentPayload | User consent grants, revocations, violations |
| hitl.* | HITLPayload | Human-in-the-loop review, escalation, timeout |
| model_registry.* | ModelRegistryEntry | Model registration, deprecation, retirement |
| explanation.* | ExplainabilityRecord | Explainability records for AI decisions |
| llm.trace.* | SpanPayload | Model calls: tokens, latency, cost (frozen v2) |
| llm.guard.* | GuardPayload | Safety classifier outputs, block decisions |
| llm.redact.* | RedactPayload | PII audit: what was found and removed |
| llm.audit.* | AuditChainPayload | Access logs and chain-of-custody |
| llm.eval.* | EvalScenarioPayload | Scores, labels, evaluator identity |
| llm.cost.* | CostPayload | Per-call cost in USD |
| llm.cache.* | CachePayload | Cache hit/miss, backend, TTL |
| llm.prompt.* | PromptPayload | Prompt template version, rendered text |
| llm.fence.* | FencePayload | Topic constraints, allow/block lists |
| llm.diff.* | DiffPayload | Prompt/response delta between events |
| llm.template.* | TemplatePayload | Template registry metadata |
Architecture
spanforge/
+-- core/
│ +-- compliance_mapping.py — ComplianceMappingEngine, evidence packages, attestations
+-- compliance/ — Programmatic compliance test suite
+-- signing.py — HMAC audit chains, key management, multi-tenant KeyResolver
+-- redact.py — PII detection + redaction policies
+-- model_registry.py — Model lifecycle governance
+-- explain.py — Explainability records
+-- consent.py — Consent boundary events
+-- hitl.py — Human-in-the-loop events
+-- governance.py — Policy-based event gating
+-- event.py — Event envelope
+-- types.py — EventType enum (consent.*, hitl.*, model_registry.*, explanation.*, llm.*)
+-- config.py — configure() / get_config()
+-- _span.py — Span, AgentRun, AgentStep context managers
+-- _trace.py — Trace + start_trace()
+-- _tracer.py — Top-level tracing entry point
+-- _stream.py — Internal dispatch: sample — redact — sign — export
+-- _store.py — TraceStore ring buffer
+-- _hooks.py — HookRegistry (lifecycle hooks)
+-- _server.py — HTTP server (/traces, /compliance/summary)
+-- _cli.py ← 38 CLI sub-commands
+-- workflow.py — Human-in-the-Loop Workflow Engine (CORE-15); WorkflowEngine, WorkflowType, state machine, SLA escalation
+-- cost.py — CostTracker, BudgetMonitor, @budget_alert
+-- cache.py — SemanticCache, @cached decorator
+-- retry.py — @retry, FallbackChain, CircuitBreaker
+-- toolsmith.py — @tool, ToolRegistry
+-- http.py — Zero-dependency OpenAI-compatible HTTP client
+-- io.py — JSONL read/write/append utilities
+-- plugins.py — Entry-point plugin discovery
+-- schema.py — Lightweight zero-dependency JSON Schema validator
+-- regression.py — Pass/fail regression detector
+-- stats.py — Percentile, latency summary utilities
+-- presidio_backend.py — Optional Presidio-powered PII detection
+-- _ansi.py — ANSI color helpers (NO_COLOR aware)
+-- lint/ — AST-based instrumentation linter (AO000–AO005)
+-- export/ — JSONL, OTLP, Webhook, Datadog, Grafana Loki, Cloud, Redis, Splunk HEC, Syslog/CEF
+-- integrations/ — OpenAI, Anthropic, Gemini, Bedrock, LangChain, LlamaIndex, CrewAI, Ollama, Groq, Together
+-- namespaces/ — Typed payload dataclasses
+-- gate.py — GateRunner YAML pipeline engine, 6 gate executors, artifact store (Phase 8)
+-- sdk/ — Service SDK clients (sf-identity, sf-pii, sf-secrets, sf-audit, sf-cec, sf-observe, sf-alert, sf-gate, sf-trust, sf-enterprise, sf-security)
│ +-- explain.py — SFExplainClient – ExplainModelType enum (LLM/RAG/MULTI_AGENT/CLASSIFIER/EMBEDDING), signed explanations, retry+timeout emit (Phase 1B)
│ +-- scope.py — SFScopeClient – ACTION_CATEGORIES (5 categories), circuit-breaker fail-secure, resolve_action_category() (Phase 1B)
│ +-- rbac.py — SFRBACClient – STANDARD_ROLE_MATRIX (10 actor types), register_actor_from_yaml(), register_actor_from_jwt() (Phase 1C)
│ +-- identity.py — SFIdentityClient – keys, JWT, TOTP, MFA, magic-link
│ +-- pii.py — SFPIIClient – scan, redact, anonymize
│ +-- secrets.py — SFSecretsClient – 20-pattern secret scanning, SARIF output
│ +-- audit.py — SFAuditClient – HMAC-chained records, T.R.U.S.T. scorecard, Article 30, BYOS
│ +-- cec.py — SFCECClient – signed CEC ZIP bundles, clause mapping, DPA generation (Phase 5)
│ +-- observe.py — SFObserveClient – span export, OTel GenAI attrs, W3C TraceContext, sampling (Phase 6)
│ +-- alert.py — SFAlertClient – topic-based routing, dedup, escalation policy, 6 sink integrations (Phase 7)
│ +-- gate.py — SFGateClient – YAML pipeline runner, evaluate(), evaluate_prri(), trust-gate, artifact management (Phase 8)
│ +-- config.py — .halluccheck.toml parser, SFConfigBlock, SFServiceToggles, SFLocalFallbackConfig, validate_config() (Phase 9)
│ +-- registry.py — ServiceRegistry singleton, health checks, background checker, status_response() (Phase 9)
│ +-- fallback.py — 8 local fallback implementations: pii, secrets, audit, observe, alert, identity, gate, cec (Phase 9)
│ +-- trust.py — SFTrustClient – T.R.U.S.T. five-pillar scorecard, SVG badge, history time-series, configurable weights (Phase 10)
│ +-- pipelines.py — 5 HallucCheck pipeline integrations: score, bias, monitor, risk, benchmark (Phase 10)
│ +-- enterprise.py — SFEnterpriseClient – multi-tenancy, encryption, air-gap, health probes (Phase 11)
│ +-- security.py — SFSecurityClient – OWASP audit, STRIDE threat model, dependency/static scanning, secrets-in-logs (Phase 11)
│ +-- testing_mocks.py — 11 mock service clients, _MockBase, mock_all_services() context manager (Phase 12)
│ +-- _base.py — SFClientConfig, SFServiceClient, circuit breaker, sandbox mode (Phase 12)
│ +-- _types.py — SecretStr, APIKeyBundle, JWTClaims, BundleResult, ClauseMapEntry, ExportResult, Annotation, AlertSeverity, …
│ +-- _exceptions.py — SFError hierarchy (incl. SFConfigError, SFConfigValidationError, SFStartupError, SFServiceUnavailableError, SFTrustComputeError, SFPipelineError)
│ +-- __init__.py — sf_identity / sf_pii / sf_secrets / sf_audit / sf_cec / sf_observe / sf_alert / sf_gate / sf_trust / sf_enterprise / sf_security / sf_rag / sf_feedback singletons + configure()
+-- migrate.py — Schema migration (v1 → v2), LangSmith migration
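The `_stream.py` entry in the tree gives the internal dispatch order as sample, redact, sign, export. The following is a minimal standard-library sketch of that ordering, not spanforge's implementation: the function names, the e-mail-only redaction rule, the demo key, and the use of a bounded deque as the export target are all assumptions made for illustration.

```python
import hashlib
import hmac
import json
import random
import re
from collections import deque

# Hypothetical demo key; a real deployment would load this from secure config.
DEMO_KEY = b"demo-signing-key"
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(value):
    """Recursively mask e-mail addresses in strings, dicts, and lists."""
    if isinstance(value, str):
        return EMAIL_RE.sub("[REDACTED:email]", value)
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def sign(event, key=DEMO_KEY):
    """Attach an HMAC-SHA256 signature over the canonical JSON of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return {**event, "signature": hmac.new(key, body, hashlib.sha256).hexdigest()}


def dispatch(event, store, sample_rate=1.0, rng=random.random):
    """Run one event through sample -> redact -> sign -> export.

    Returns the exported record, or None if the event was sampled out.
    """
    if rng() >= sample_rate:   # 1. sample
        return None
    clean = redact(event)      # 2. redact before anything is signed or stored
    record = sign(clean)       # 3. sign the redacted form
    store.append(record)       # 4. export (here: an in-memory ring buffer)
    return record


# A bounded deque behaves like a ring buffer: the oldest record is dropped first.
store = deque(maxlen=2)
for i in range(3):
    dispatch(
        {"event_type": "llm.trace.span", "seq": i, "prompt": "mail bob@example.com"},
        store,
    )
```

Redacting before signing means the signature covers exactly what is stored, and the raw value never reaches the export step.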
What is inside the box
| Module | What it does | For whom |
|---|---|---|
| Compliance & Governance | ||
| spanforge.compliance | ComplianceMappingEngine maps telemetry to regulatory frameworks (EU AI Act, ISO 42001, NIST AI RMF, GDPR, SOC 2, HIPAA). Generates evidence packages with HMAC-signed attestations. Consent, HITL, model registry, and explainability events integrated into clause mappings. Attestations include model owner, risk tier, status, warnings, and explanation_coverage_pct. Also: programmatic v2.0 compatibility checks — no pytest required. | Compliance / legal / platform teams |
| spanforge.signing | HMAC-SHA256 event signing, tamper-evident audit chains, key strength validation, key expiry checks, environment-isolated key derivation, multi-tenant KeyResolver protocol, and AsyncAuditStream | Security / compliance teams |
| spanforge.redact | PII detection, sensitivity levels, redaction policies, deep scan_payload() with Luhn / Verhoeff / SSN-range / date-calendar validation, built-in date_of_birth and address patterns, and contains_pii() / assert_redacted() with raw string scanning | Data privacy / GDPR teams |
| spanforge.secrets | SecretsScanner — 20-pattern registry (7 spec-defined + 13 industry-standard), Shannon entropy scoring, three-tier confidence model, zero-tolerance auto-block for 10 high-risk types, SecretsScanResult with to_dict() and SARIF 2.1.0 output, span deduplication, configurable allowlist | Security / DevSecOps teams |
| spanforge.governance | Policy-based event gating — block prohibited types, warn on deprecated usage, enforce custom rules | Platform / compliance teams |
| Instrumentation & Tracing | ||
| spanforge.event | The core Event envelope — the one structure all tools share | Everyone |
| spanforge.types | All built-in event types — compliance events (consent.*, hitl.*, model_registry.*, explanation.*) and telemetry events (llm.trace.*, llm.guard.*, etc.) | Everyone |
| spanforge._span | Span, AgentRun, AgentStep context managers. contextvars-based async/thread-safe propagation. async with, span.add_event(), span.set_timeout_deadline() | App developers |
| spanforge._trace | Trace + start_trace() — high-level tracing entry point; accumulates child spans | App developers |
| spanforge.config | configure() and get_config() — signing key, redaction policy, exporters, sample rate | Everyone |
| Export & Integration | ||
| spanforge.export | Ship events to JSONL, HTTP webhooks, OTLP collectors, Datadog APM, Grafana Loki, Splunk HEC, Syslog/CEF, Redis, or spanforge Cloud | Infra / compliance teams |
| spanforge.export.siem_splunk | SplunkHECExporter — thread-safe batched Splunk HTTP Event Collector exporter; env-var config; HEC token never logged; SplunkHECError on delivery failure | Security / compliance teams |
| spanforge.export.siem_syslog | SyslogExporter — RFC 5424 and ArcSight CEF exporter over UDP or TCP; severity derived from event type; CEF extension values properly escaped; SyslogExporterError on socket failure | Security / compliance teams |
| spanforge.stream | Fan-out router — one drain() call reaches multiple backends; Kafka source | Platform engineers |
| spanforge.integrations | Auto-instrumentation for OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI, Groq, Ollama, Together | App developers |
| spanforge.auto | setup() auto-patches all installed LLM integrations; teardown() cleanly unpatches | App developers |
| Developer Tools | ||
| spanforge.cost | CostTracker, BudgetMonitor, @budget_alert — track and alert on token spend | App developers / FinOps |
| spanforge.cache | SemanticCache + @cached — deduplicate LLM calls via cosine similarity; InMemoryBackend, SQLiteBackend, RedisBackend | App developers / FinOps |
| spanforge.retry | @retry, FallbackChain, CircuitBreaker, CostAwareRouter — resilient LLM routing with compliance events | App developers / SREs |
| spanforge.toolsmith | @tool + ToolRegistry — register functions as typed tools; render JSON schemas for function-calling APIs | App developers |
| spanforge.lint | AST-based instrumentation linter; AO001–AO005 codes; flake8 plugin; CLI | All teams / CI |
| Utilities (v2.0.2+) | ||
| spanforge.http | chat_completion() — zero-dependency, synchronous OpenAI-compatible HTTP client with retry and back-off | App developers |
| spanforge.io | read_jsonl(), write_jsonl(), append_jsonl(), write_events(), read_events() — JSONL I/O utilities | Everyone |
| spanforge.schema | Lightweight zero-dependency JSON Schema validator — validate(), validate_strict() | Tool authors / CI |
| spanforge.regression | RegressionDetector — per-case pass/fail regression detection between baseline and current eval runs | ML / eval teams |
| spanforge.stats | percentile(), latency_summary() — statistical utilities for eval and performance analysis | Analytics engineers |
| spanforge.plugins | discover(group) — entry-point plugin discovery across Python 3.9–3.12+ | Plugin authors |
| spanforge.presidio_backend | Optional Presidio-powered PII detection backend — presidio_scan_payload() with standard PIIScanResult | Data privacy teams |
| spanforge.eval | Built-in scorers: FaithfulnessScorer, RefusalDetectionScorer, PIILeakageScorer, BehaviourScorer base class | ML / eval teams |
| spanforge.debug | print_tree(), summary(), visualize() — terminal tree, stats dict, HTML Gantt timeline | App developers |
| spanforge.metrics | aggregate() — success rates, latency percentiles, token totals, cost breakdowns | Analytics engineers |
| spanforge.testing | MockExporter, capture_events(), assert_event_schema_valid(), trace_store() | Test authors |
| spanforge.testing_mocks | 11 drop-in mock service clients (MockSFIdentity, MockSFPII, MockSFSecrets, MockSFAudit, MockSFObserve, MockSFGate, MockSFCEC, MockSFAlert, MockSFTrust, MockSFEnterprise, MockSFSecurity). mock_all_services() context manager patches all 11 singletons. _MockBase with .calls recording and .configure_response(). 100% test coverage. (Phase 12, v2.0.11+) | Test authors / all teams |
| spanforge.validate | JSON Schema validation against the published v2.0 schema | All teams |
| spanforge.namespaces | Typed payload dataclasses for all built-in event namespaces | Tool authors |
| spanforge.models | Optional Pydantic v2 models for validated schemas | API / backend teams |
| spanforge.consumer | Declare schema-namespace dependencies; fail fast at startup if version requirements are not met | Platform teams |
| spanforge.deprecations | Per-event-type deprecation notices at runtime | Library maintainers |
| spanforge._hooks | Lifecycle hooks: @hooks.on_llm_call, @hooks.on_tool_call, @hooks.on_agent_start (sync + async) | App developers / platform |
| spanforge._store | TraceStore ring buffer — get_trace(), list_tool_calls(), list_llm_calls() | Platform / tooling engineers |
| spanforge._cli | CLI sub-commands including eval, compliance status, migrate-langsmith, cost run, and more | DevOps / CI teams |
| Service SDK (v2.0.3+) | ||
| spanforge.sdk.identity | SFIdentityClient — API key lifecycle (issue_api_key, rotate_api_key, revoke_api_key), session JWT (HS256 stdlib / RS256 remote), magic-link issuance + single-use exchange, TOTP enrolment + verification (RFC 6238, 6-digit, 30 s), backup codes, per-key IP allowlist, sliding-window rate limiting, brute-force lockout. Fully local-mode capable — no external service required. | Security / platform teams |
| spanforge.sdk.pii | SFPIIClient — scan_text(), anonymise(), scan_batch(), apply_pipeline_action(), get_status(), erase_subject() (GDPR Art. 17), export_subject_data() (CCPA DSAR), safe_harbor_deidentify() (HIPAA 18-PHI), audit_training_data() (EU AI Act Art. 10), get_pii_stats(). PIPL patterns for Chinese national ID / mobile / bank card. Pipeline action routing (flag / redact / block) with confidence threshold gate. Scan results never include raw PII — only type labels, field paths, and SHA-256 hashes. Runs locally or delegates to a remote sf-pii service. | Data privacy / GDPR teams |
| spanforge.sdk.secrets | SFSecretsClient — scan(text) → SecretsScanResult, scan_batch(texts) with asyncio parallel execution. 20-pattern registry covering all spec-required types plus 13 industry-standard additions. Three-tier confidence model (0.75 / 0.90 / 0.97). Zero-tolerance auto-block for 10 high-risk secret types. SARIF 2.1.0 output. Runs fully locally — no external service required. | Security / DevSecOps teams |
| spanforge.sdk.audit | SFAuditClient — append(record, schema_key) with HMAC-SHA256 chaining, query() SQLite index with full-text and date-range filters, verify_chain() tamper detection, get_trust_scorecard() T.R.U.S.T. dimensions (hallucination · PII hygiene · secrets hygiene · gate pass-rate · compliance posture), generate_article30_record() GDPR Article 30 RoPA, export() JSONL/CSV/compressed, sign(), get_status(). BYOS routing via SPANFORGE_AUDIT_BYOS_PROVIDER (S3 / Azure / GCS / R2). Strict-schema mode, configurable retention years, optional SQLite persistence. 123 tests, 85% coverage, mypy strict clean. | Compliance / security / audit teams |
| spanforge.sdk.observe | SFObserveClient — emit_span(name, attributes) builds OTel-compliant spans with W3C traceparent / baggage injection and OTel GenAI semantic attributes; export_spans(spans, receiver_config=...) routes to local / otlp / datadog / grafana / splunk / elastic; add_annotation(event_type, payload) / get_annotations(event_type, from_dt, to_dt) annotation store; get_status(), healthy, last_export_at health probes. Sampling via SPANFORGE_OBSERVE_SAMPLER (always_on / always_off / parent_based / trace_id_ratio). 139 tests, 97% coverage, mypy strict + bandit clean. (Phase 6, v2.0.5+) | Platform / MLOps / observability teams |
| spanforge.sdk.alert | SFAlertClient — publish(topic, payload, *, severity, project_id) → PublishResult routes to all configured sinks with deduplication, rate-limiting, alert grouping, and maintenance-window suppression; acknowledge(alert_id) cancels CRITICAL escalation; register_topic() custom topic registry; set_maintenance_window() / remove_maintenance_windows(); get_alert_history() with filtering; get_status() / healthy health probes. Built-in sinks: WebhookAlerter (HMAC), OpsGenieAlerter, VictorOpsAlerter, IncidentIOAlerter, SMSAlerter (Twilio), TeamsAdaptiveCardAlerter. Auto-discovery from SPANFORGE_ALERT_* env vars. Per-sink circuit breakers. 95 tests, mypy strict + bandit clean. (Phase 7, v2.0.6+) | Platform / SRE / on-call teams |
| spanforge.sdk.gate | SFGateClient — evaluate(gate_id, payload) → GateEvaluationResult, evaluate_prri(prri_score) → PRRIResult, run_pipeline(gate_config_path) → GateRunResult, get_artifact(gate_id), list_artifacts(), purge_artifacts(older_than_days), get_status() → GateStatusInfo, configure(config). Six built-in gate executors: schema_validation, dependency_security, secrets_scan, performance_regression, halluccheck_prri, halluccheck_trust. PRRI three-tier verdict (GREEN/AMBER/RED), GateArtifact store with configurable retention, composite trust gate (HRI rate + PII window + secrets window), five exception types. 174 tests, mypy strict + bandit clean. (Phase 8, v2.0.7+) | DevOps / CI / platform teams |
| spanforge.sdk.config | load_config_file(path?) — auto-discovers .halluccheck.toml or falls back to env-var defaults. validate_config(block) / validate_config_strict(block) schema validation. SFConfigBlock, SFServiceToggles, SFLocalFallbackConfig, SFPIIConfig, SFSecretsConfig typed dataclasses. Env-var overrides: SPANFORGE_ENDPOINT, SPANFORGE_API_KEY, SPANFORGE_PROJECT_ID, SPANFORGE_PII_THRESHOLD, SPANFORGE_SECRETS_AUTO_BLOCK, SPANFORGE_LOCAL_TOKEN, SPANFORGE_FALLBACK_TIMEOUT_MS. (Phase 9, v2.0.8+) | All teams / platform engineers |
| spanforge.sdk.registry | ServiceRegistry.get_instance() — thread-safe singleton holding all 11 service clients. run_startup_check() pings all enabled services (status: up / degraded / down). status_response() returns per-service {status, latency_ms, last_checked_at}. start_background_checker() launches a daemon thread re-checking every 60 s. ServiceHealth, ServiceStatus typed enums. (Phase 9, v2.0.8+) | Platform / SRE teams |
| spanforge.sdk.fallback | 8 local-mode fallback implementations: pii_fallback() (regex scan), secrets_fallback() (regex scan), audit_fallback() (HMAC-chained JSONL), observe_fallback() (OTLP JSON to stdout), alert_fallback() (log to stderr), identity_fallback() (trust local token), gate_fallback() (local gate engine), cec_fallback() (local JSONL). All emit WARNING when active. (Phase 9, v2.0.8+) | All teams (automatic) |
| spanforge.sdk.trust | SFTrustClient — get_scorecard(project_id, *, from_dt, to_dt, weights) → TrustScorecardResponse aggregates five T.R.U.S.T. dimensions (Transparency · Reliability · UserTrust · Security · Traceability) with configurable weights. get_badge(project_id) → TrustBadgeResult generates an SVG badge with colour-band (green ≥ 80, amber ≥ 60, red < 60). get_history(project_id, *, buckets) → list[TrustHistoryEntry] returns time-series snapshots. get_status() health probe. Reads from sf-audit trust records. 28 tests, mypy strict + bandit clean. (Phase 10, v2.0.9+) | Compliance / platform / ML teams |
| spanforge.sdk.pipelines | 5 HallucCheck ↔ SpanForge pipeline integrations: score_pipeline(text) (PII → secrets → observe → audit), bias_pipeline(report) (PII → audit → alert → anonymise), monitor_pipeline(event) (observe → alert → OTel export), risk_pipeline(prri_score) (PRRI → alert → gate → CEC), benchmark_pipeline(results) (audit → alert → anonymise). Each returns PipelineResult with audit trail. (Phase 10, v2.0.9+) | ML / eval / platform teams |
| spanforge.sdk.cec | SFCECClient — build_bundle(project_id, date_range, frameworks) assembles a signed ZIP with manifest.json, clause_map.json, chain_proof.json, attestation.json, rfc3161_timestamp.tsr, and 6 NDJSON evidence directories. HMAC-SHA256 manifest signing, BYOS detection. verify_bundle(zip_path) re-verifies HMAC + chain + timestamp. generate_dpa(project_id, controller_details, processor_details) produces a GDPR Article 28 Data Processing Agreement. get_status() returns bundle count, BYOS provider, and last bundle timestamp. Supports all 5 frameworks: eu_ai_act, iso_42001, nist_ai_rmf, iso27001, soc2. 148 tests, 87% coverage, mypy strict + bandit clean. (Phase 5, v2.0.4+) | Compliance / legal / audit teams |
| spanforge.sdk | Pre-built sf_identity, sf_pii, sf_secrets, sf_audit, sf_cec, sf_observe, sf_alert, sf_gate, sf_trust, sf_enterprise, sf_security, sf_rag, and sf_feedback singletons loaded from env vars on first import. SFClientConfig, SecretStr, full exception hierarchy (SFAuthError, SFBruteForceLockedError, SFPIINotRedactedError, SFPIIBlockedError, SFPIIDPDPConsentMissingError, SFSecretsBlockedError, SFAuditSchemaError, SFAuditAppendError, SFAuditQueryError, SFCECError, SFCECBuildError, SFCECVerifyError, SFCECExportError, SFObserveError, SFObserveExportError, SFObserveEmitError, SFObserveAnnotationError, SFAlertError, SFAlertPublishError, SFAlertRateLimitedError, SFAlertQueueFullError, SFGateError, SFGateEvaluationError, SFGatePipelineError, SFGateTrustFailedError, SFGateSchemaError, SFConfigError, SFConfigValidationError, SFStartupError, SFServiceUnavailableError, SFTrustComputeError, SFPipelineError, SFEnterpriseError, SFIsolationError, SFDataResidencyError, SFEncryptionError, SFFIPSError, SFAirGapError, SFSecurityScanError, SFSecretsInLogsError, …), and all value-object types exported from the top-level package. load_config_file(), validate_config(), validate_config_strict(), ServiceRegistry, and 8 fallback functions re-exported for convenience. | All teams |
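Several rows above mention HMAC-SHA256 chaining and tamper detection through a verify_chain() call. As a rough illustration of why a chained MAC makes edits detectable, here is a standard-library sketch. It is not SFAuditClient's code: the record layout (`prev`, `payload`, `mac`), the genesis value, the function signatures, and the sample event names are assumptions.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder meaning "no previous record"


def _mac(key, prev, payload):
    """HMAC-SHA256 over the previous MAC plus the canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev + body).encode(), hashlib.sha256).hexdigest()


def append(chain, payload, key):
    """Append a record whose MAC also commits to the preceding record's MAC."""
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append({"prev": prev, "payload": payload, "mac": _mac(key, prev, payload)})


def verify_chain(chain, key):
    """Return the index of the first invalid record, or None if the chain is intact."""
    prev = GENESIS
    for i, rec in enumerate(chain):
        expected = _mac(key, prev, rec["payload"])
        if rec["prev"] != prev or not hmac.compare_digest(rec["mac"], expected):
            return i
        prev = rec["mac"]
    return None


key = b"demo-key-not-for-production"
chain = []
# Event names below are illustrative, not taken from the spanforge schema.
append(chain, {"event": "consent.granted", "subject": "u1"}, key)
append(chain, {"event": "hitl.review", "decision": "approve"}, key)
append(chain, {"event": "llm.trace.span", "tokens": 42}, key)

intact = verify_chain(chain, key)

# Editing an earlier payload invalidates that record's MAC.
chain[1]["payload"]["decision"] = "reject"
tampered_at = verify_chain(chain, key)
```

Because each MAC covers the previous one, removing or reordering records breaks the `prev` link as well, so verification fails at the first record that no longer lines up.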
Quality
- 5 863 tests passing (14 skipped) — unit, integration, property-based (Hypothesis), performance benchmarks
- ≥ 91% line and branch coverage — 90% minimum enforced in CI
- Zero required dependencies — entire core runs on Python stdlib
- Typed — full py.typed marker; mypy + pyright clean
- Frozen v2 trace schema — llm.trace.* payload fields never break between minor releases
- Async-safe — contextvars-based context propagation across asyncio, threads, and executors
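The async-safety point in the list above rests on Python's `contextvars` module. The sketch below shows the underlying behaviour with the standard library only: two concurrent tasks each set a "current span" value and neither sees the other's. The variable name and the `handle` coroutine are hypothetical, and this does not reproduce spanforge's own span classes.

```python
import asyncio
import contextvars

# Hypothetical "current span" slot; the name is illustrative only.
current_span = contextvars.ContextVar("current_span", default=None)


async def handle(name):
    """Set a span name, yield to the event loop, then read it back."""
    current_span.set(name)
    await asyncio.sleep(0)     # let the other task run in between
    return current_span.get()  # still this task's own value


async def main():
    # asyncio.gather wraps each coroutine in a Task, and each Task runs in
    # its own copy of the current context.
    results = await asyncio.gather(handle("span-a"), handle("span-b"))
    return results, current_span.get()


results, outer = asyncio.run(main())
```

The value set inside each task does not leak back to the caller, which is why nested spans can be tracked without passing a span object through every function.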
Development
git clone https://github.com/veerarag1973/spanforge.git
cd spanforge
python -m venv .venv && .venv\Scripts\activate # Windows; on macOS/Linux use: source .venv/bin/activate
pip install -e ".[dev]"
pytest # run the full test suite
Code quality
ruff check . && ruff format .
mypy spanforge
pytest --cov # >=90% required
Build docs
pip install -e ".[docs]"
cd docs && sphinx-build -b html . _build/html
Versioning
spanforge implements RFC-0001 (AI Compliance Standard for Agentic AI Systems). Current schema version: 2.0.
This project follows Semantic Versioning. The llm.trace.* namespace is additionally frozen at v2 — even major releases won't remove fields from SpanPayload, AgentRunPayload, or AgentStepPayload.
See docs/changelog.md for the full version history.
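The freeze rule above (fields are never removed from the llm.trace.* payloads) is the kind of guarantee that can be checked mechanically. A minimal sketch of such a check follows, assuming the payloads are dataclasses. The two classes and their fields are invented for the example and are not the real SpanPayload definition.

```python
from dataclasses import dataclass, fields


@dataclass
class SpanPayloadV2_0:  # hypothetical baseline shape
    span_id: str
    name: str
    duration_ms: float


@dataclass
class SpanPayloadV2_1:  # hypothetical later release: one optional field added
    span_id: str
    name: str
    duration_ms: float
    model: str = ""


def removed_fields(old, new):
    """Return names present in `old` but missing from `new`, sorted."""
    old_names = {f.name for f in fields(old)}
    new_names = {f.name for f in fields(new)}
    return sorted(old_names - new_names)
```

A release check could fail the build whenever `removed_fields(previous, current)` is non-empty, while still allowing additive changes.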
Contributing
Contributions welcome — see the Contributing Guide. All new code must maintain ≥ 90% coverage. Run ruff and mypy before submitting.
Community
- Discussions — questions, ideas, show-and-tell
- Issues — bug reports and feature requests
- SECURITY.md — responsible disclosure process
- Code of Conduct — Contributor Covenant v2.1
Topics: ai-compliance, ai-governance, eu-ai-act, gdpr, soc2, audit-trail, pii-redaction, hmac-signing, llm-governance, python
License
PolyForm Noncommercial License 1.0.0
- ✅ Free for personal use, research, education, open-source projects, and non-profit organisations.
- ❌ Commercial use (running as a paid service, internal business use, SaaS integration) requires a commercial license.
To obtain a commercial license: sriram@getspanforge.com | getspanforge.com/pricing
Enterprise features (SSO, air-gapped deployment, dedicated support, SLAs) are available in SpanForge Enterprise — a separate commercial product.
Built for teams that take AI governance seriously.
Docs · Runtime Governance · Quickstart · API Reference · Discussions · Report a bug