# Razi — Deterministic Governance Harness for AI Workflows
The model proposes. The harness authorizes.
Razi wraps AI-driven reasoning inside a constraint-governed runtime that enforces schema compliance, policy rules, evidence traceability, and deterministic replay — for any workflow where the output must be trusted.
Standard LLM wrappers have no governance layer. Prompt engineering alone cannot guarantee that outputs are schema-valid, policy-compliant, and traceable. When AI makes a consequential decision, there is no audit trail and no way to prove what happened six months later.
Razi solves this by separating three concerns:

- **Specification** → what must be true (`.aispec`)
- **Synthesis** → what the model proposes (`synth.json` operator)
- **Authority** → what is allowed to execute (policy engine)
## Example Use Cases
Razi is a general-purpose governance harness. It is not limited to a single workflow type. Any AI task that requires deterministic, auditable, policy-enforced output is a good fit.
| Workflow | What Razi Governs |
|---|---|
| Support Escalation | SLA compliance, severity bounds, no internal note leakage, evidence-backed decisions |
| Zendesk Troubleshooting | Schema-valid troubleshooting plans with cited evidence from ticket history |
| Customer Sentiment Analysis | Structured, schema-validated sentiment output with retry on malformed responses |
| Clinical Summarization (planned) | HIPAA-compliant output with no PII disclosure, evidence-required citations |
| Financial Decisions (planned) | SOX-controlled decision outputs with full audit trail |
## Quick Start

```bash
pip install razi
export OPENAI_API_KEY=sk-...

# Clone and run the reference escalation workflow
git clone https://github.com/razi-sh/razi
cd razi
razi validate examples/escalation.aispec
razi build examples/escalation.aispec
razi run examples/escalation.aispec --input examples/inputs/sla_breach.json

# Replay the run offline (no model call)
razi replay escalation_qualification__<run_id>
```
## How It Works

### 1. You write a `.aispec`

Declare your policy rules, model binding, evidence fields, and output schema. One file describes the entire governed workflow.

### 2. `razi build` compiles it deterministically

Produces an immutable IR + DAG + lockfile. The lockfile hashes the spec and template so every run is tied to an exact compiled state.
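The lockfile idea can be sketched in plain Python: hash the spec and the template together so that changing either one yields a different build identity. The function and field names here are illustrative, not Razi's actual lockfile format.

```python
import hashlib

def build_lockfile(spec_text: str, template_text: str) -> dict:
    """Hash the spec and template so every run is tied to an exact compiled state."""
    spec_hash = hashlib.sha256(spec_text.encode("utf-8")).hexdigest()
    template_hash = hashlib.sha256(template_text.encode("utf-8")).hexdigest()
    # The combined hash changes if either input changes.
    build_hash = hashlib.sha256((spec_hash + template_hash).encode("utf-8")).hexdigest()
    return {
        "spec_sha256": spec_hash,
        "template_sha256": template_hash,
        "build_sha256": build_hash,
    }

# Same inputs always produce the same lockfile; any edit produces a new one.
lock = build_lockfile("name: demo\n", "Answer using only the evidence.\n")
assert lock == build_lockfile("name: demo\n", "Answer using only the evidence.\n")
assert lock != build_lockfile("name: demo2\n", "Answer using only the evidence.\n")
```

Because the hash is a pure function of the inputs, two builds from identical sources are provably identical, which is what makes offline replay meaningful.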
### 3. `razi run` enforces policy on every execution

- Builds a deterministic evidence index (every claim gets an evidence ID)
- Calls the model and validates the JSON output against your schema
- Checks every configured policy rule
- If any rule fails: constructs a targeted reprompt and retries
- If attempts are exhausted: fails safely — no non-compliant output passes
- Overwrites `policy_compliant` and `violations` — the model's self-assessment is never trusted
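The enforcement loop above can be sketched as follows. This is a simplified illustration, not Razi's implementation: `call_model` stands in for the provider call, JSON parsing stands in for full schema validation, and rules are plain `(name, predicate)` pairs.

```python
import json

def run_governed(call_model, prompt, rules, max_attempts=3):
    """Sketch of the loop: validate output, check rules, reprompt, fail safe."""
    violations = []
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            output = json.loads(raw)  # stands in for full JSON-schema validation
        except json.JSONDecodeError:
            prompt += "\nYour last reply was not valid JSON. Return only JSON."
            continue
        violations = [name for name, check in rules if not check(output)]
        # The runtime, never the model, writes the compliance fields.
        output["policy_compliant"] = not violations
        output["violations"] = violations
        if not violations:
            return output
        prompt += f"\nFix these policy violations and retry: {violations}"
    # Attempts exhausted: no non-compliant output is ever returned.
    raise RuntimeError(f"non-compliant after {max_attempts} attempts: {violations}")
```

Note that `policy_compliant` is assigned unconditionally after the rule checks, so whatever the model claimed about its own compliance is discarded.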
### 4. `razi replay` proves determinism

Re-evaluates schema, evidence, and policy offline, without hitting the model. This is your audit proof — run it against any historical `run_id`.
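Conceptually, a replay check loads the stored output and re-runs the same deterministic rule checks with no model call involved. This is a sketch under the assumption of a `final_output.json` artifact and `(name, predicate)` rule pairs, not Razi's actual replay code.

```python
import json
from pathlib import Path

def replay(run_dir: str, rules) -> bool:
    """Re-check policy on a stored run's final output: deterministic, offline."""
    output = json.loads(Path(run_dir, "final_output.json").read_text())
    recomputed = [name for name, check in rules if not check(output)]
    # Replay passes when re-evaluation matches the recorded verdict and nothing fails.
    return recomputed == output.get("violations", []) and not recomputed
```

Because the rule checks are pure functions of the stored artifacts, this can be run six months later and must produce the same verdict.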
## Writing a Spec

A `.aispec` file describes your governed workflow in YAML:
```yaml
name: zendesk_troubleshooter
version: 1.0.0
policy: enterprise_support_v1

model:
  provider: openai
  id: gpt-4o-mini
  temperature: 0.1

input_schema: examples/schemas/zendesk_input.json
output_schema: examples/schemas/zendesk_output.json

operators:
  - type: evidence.index
    id: evidence
    fields:
      - key: subject
        source: input.subject
      - key: description
        source: input.description
      - key: comments
        source: input.injected_comments

  - type: synth.json
    id: synthesis
    evidence_ref: evidence
    template: examples/templates/zendesk_troubleshoot.txt
    max_attempts: 3
    confidence_threshold: 0.7
```
See docs/spec_reference.md for the full format reference.
## Policy Presets

Razi governs AI output through policy presets — named rule bundles declared in one line of your `.aispec`:

```yaml
policy: enterprise_support_v1
```

The runtime enforces rules after the model responds. `policy_compliant` and `violations` are always written by the runtime — the model cannot self-certify. Each rule within a preset can be individually tuned per workflow.

v1 ships with one preset: `enterprise_support_v1`. It covers six rules applicable to any workflow where AI decisions must be evidence-backed, bounded by severity constraints, and free of internal data leakage:
| Rule | What It Enforces |
|---|---|
| `evidence_required` | At least one valid evidence ID must be cited |
| `no_hallucinated_evidence` | Every cited ID must exist in the evidence index |
| `min_confidence` | AI confidence must meet the configured threshold (default: 0.6) |
| `no_internal_disclosure` | Output cannot contain content from configured internal-only fields |
| `sla_escalation` | SLA breach → must recommend S1 or S2 (support workflows) |
| `severity_downgrade_protection` | S1 cannot be downgraded to S3/S4 (incident management) |
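The first two rules reduce to membership checks against the evidence index. Here is a rough sketch; the field name `evidence_ids` is illustrative, not Razi's actual output schema.

```python
def evidence_required(output: dict) -> bool:
    """At least one evidence ID must be cited."""
    return len(output.get("evidence_ids", [])) >= 1

def no_hallucinated_evidence(output: dict, evidence_index: set) -> bool:
    """Every cited ID must exist in the evidence index."""
    return all(eid in evidence_index for eid in output.get("evidence_ids", []))

index = {"E1", "E2", "E3"}
assert evidence_required({"evidence_ids": ["E1", "E3"]})
assert not no_hallucinated_evidence({"evidence_ids": ["E1", "E9"]}, index)  # E9 was never indexed
```

Because each rule is a pure predicate over the output and the evidence index, the same checks can run identically at execution time and again during offline replay.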
See docs/policy_reference.md for per-rule documentation and how to add new presets.
## CLI Reference

```text
razi validate <spec>           Validate .aispec syntax and schema links
razi build <spec>              Compile to immutable build artifact
razi run <spec> --input <f>    Execute with governance enforcement
razi replay <run_id>           Replay offline — no model call
```

Run artifacts are written to:

```text
runs/<name>__<timestamp>__<hash>/
  evidence_index.json
  attempts/
    attempt_1/    {prompt.txt, model_raw.txt, parsed_model_output.json, policy_eval.json}
    attempt_2/    ...
  final_output.json
  trace.jsonl
  replay_report.json
```
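Since every artifact is plain JSON or JSONL, run history can be inspected with nothing but the standard library. The event fields in the test data below are hypothetical; only the one-object-per-line layout is assumed.

```python
import json
from pathlib import Path

def load_trace(run_dir: str) -> list:
    """Read trace.jsonl: one JSON object per line, one event per governance step."""
    text = Path(run_dir, "trace.jsonl").read_text()
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Keeping artifacts in open formats means audit tooling needs no dependency on Razi itself.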
## Design Principles

**The runtime is the authority, not the model.** `policy_compliant` and `violations` are always set by the runtime. The model's self-assessment is overwritten. This is not optional.

**Determinism over flexibility.** v1 has exactly two operators, one policy preset, and one model provider. Generalization is explicitly deferred. Razi is a wedge, not a platform.

**Governance through structure, not prompting.** Rules are not embedded in prompts and hoped to be followed. They are compiled into the runtime and enforced structurally on every execution.

**Replay is not optional.** Every run is replayable offline. If your AI makes a consequential decision, you must be able to prove what happened six months later without calling the model again.
## What Razi Is Not (v1 Scope)
Razi v1 is intentionally minimal. The following are explicitly out of scope:
- Python SDK / programmatic API (Python is the substrate, not the product)
- UI / dashboard
- Branching, map/reduce, or multi-step DAG logic
- Multiple model providers
- Plugin system or general-purpose operators
- Streaming outputs
If you need these things, Razi is not the right tool yet. If you need deterministic governance on any AI workflow, it is.
## Roadmap

### v1.0 (Now)

- `evidence.index` + `synth.json` operators
- `enterprise_support_v1` policy preset
- `razi build` / `validate` / `run` / `replay` CLI
- Deterministic run artifacts and offline replay
- Reference specs: escalation qualification, Zendesk troubleshooting, sentiment analysis
### v1.1
- Additional example workflows
- Improved reprompt failure diagnostics
- Policy rule unit test harness
### v1.2

- `hipaa_clinical_summary_v1` policy preset (in development)
- Second model provider (Anthropic)
### v2.0
- Managed governance service (invite-only beta)
- Central spec registry + cross-team replay index
PRs welcome. See CONTRIBUTING.md.