CFA — Governed execution for AI agents and data systems
CFA v1.0.0
Instead of asking "which agent or skill should act?", CFA asks "which state transition is being requested, under which constraints, and can it be executed safely?" and produces a cryptographically verifiable decision.
Status: alpha (0.1.x). APIs may shift between minor versions. Not yet recommended for unsupervised production use.
Quick Start
pip install cfa-kernel
# or: pip install git+https://github.com/marquesantero/cfa.git
cfa init
cfa evaluate "Join NFe with Clientes and persist to Silver" --catalog .cfa/catalog.json
What CFA does
| Step | What happens |
|---|---|
| Formalize | Natural language or JSON → typed StateSignature contract |
| Govern | Policy Engine evaluates PII, cost, schema, partition constraints |
| Generate | Execution planner + deterministic code generation (PySpark, SQL, dbt) |
| Execute | Pluggable sandbox with metrics collection + runtime validation |
| Validate | State projection, SHA-256 audit trail, lifecycle indices |
Surfaces
All interfaces are backend-agnostic. CFA evaluates a StateSignature contract regardless of how it was produced.
| Surface | For | Example |
|---|---|---|
| cfa CLI | Everyone | cfa policy check --signature sig.json |
| cfa catalog CLI | Data platform teams | cfa catalog validate catalog.json |
| cfa policy CLI | Security/compliance | cfa policy validate policies/prod.yaml |
| cfa storage CLI | Operations | cfa storage stats --db cfa.db |
| cfa lifecycle CLI | Platform teams | cfa lifecycle evaluate --db cfa.db |
| cfa signature CLI | External systems | cfa signature validate request.json |
| cfa.testing | CI/CD | evaluate("intent", catalog=catalog) with pytest |
| cfa.runtime | Production | RuntimeGate as decorator/context-manager |
| cfa.mcp | AI agents | MCP server for any MCP-compatible client |
| cfa.adapters | AI frameworks | LangGraph, OpenAI Agents, CrewAI, AutoGen, DSPy |
Architecture
CLI / MCP / Adapter / API
        │
        ▼
┌─ Formalize ───┐  NL / JSON / Tool call → typed StateSignature contract
├─ Govern ──────┤  Policy check + REPLAN cycle (approve / replan / block)
├─ Generate ────┤  Plan + code (PySpark / SQL / dbt) + static validation
├─ Execute ─────┤  Pluggable sandbox + runtime validation
└─ Validate ────┘  State projection + SHA-256 audit + lifecycle indices
        │
        ▼
Decision JSON / Audit Trail / OTel / Prometheus
Capabilities
| Capability | What it gives you |
|---|---|
| SHA-256 audit trail | Tamper-evident chain of decisions, verifiable offline (cfa audit verify) |
| State projection | Each execution carries the typed state of the prior one — no implicit globals |
| Lifecycle indices (IFo/IFs/IFg/IDI) | Quantifies how often an intent recurs, stabilizes, and qualifies for promotion to a reusable skill |
| REPLAN cycle | Failed policy checks emit a structured remediation, not a hard stop |
| Backend-agnostic codegen | Same signature compiles to PySpark, ANSI SQL, or dbt — pluggable via BackendRegistry |
| Artifact hashing | Catalog, policy bundle, and signature are content-hashed and bound to every decision |
| MCP protocol | Any MCP-compatible agent can call CFA as a governance tool |
| SQLite + JSONL storage | First-class persistence with stats, retention cleanup, and vacuum |
| Config auto-discovery | cfa.yaml walked up the tree; all CLI commands respect it |
| Zero core dependencies | Optional extras for yaml, otel, mcp, llm — none required for the kernel |
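To make the "tamper-evident chain" idea concrete, here is a minimal sketch of a SHA-256 hash chain over decision records. It is not CFA's implementation: the field names (`prev_hash`, `entry_hash`, `catalog_hash`), the all-zero genesis value, and the canonical-JSON encoding are assumptions chosen for illustration only.

```python
import hashlib
import json


def content_hash(obj) -> str:
    """SHA-256 of a canonical JSON encoding (sorted keys, compact separators)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def append_entry(chain: list, decision: dict) -> dict:
    """Append a decision, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS
    body = {"prev_hash": prev_hash, "decision": decision}
    entry = {**body, "entry_hash": content_hash(body)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each link; needs nothing but the chain itself."""
    prev_hash = GENESIS
    for entry in chain:
        body = {"prev_hash": entry["prev_hash"], "decision": entry["decision"]}
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != content_hash(body):
            return False
        prev_hash = entry["entry_hash"]
    return True


chain = []
# Binding an artifact hash (here a toy catalog) into the decision record.
catalog_hash = content_hash({"datasets": ["nfe", "clientes"]})
append_entry(chain, {"intent": "join", "action": "approve", "catalog_hash": catalog_hash})
append_entry(chain, {"intent": "persist", "action": "block"})
print(verify_chain(chain))  # True

chain[0]["decision"]["action"] = "block"  # tamper with an earlier entry
print(verify_chain(chain))  # False
```

Because each entry hashes the previous entry's hash, editing any record invalidates that record and every later link, which is what allows verification offline from the log file alone.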
CLI
# Governance & evaluation
cfa evaluate "intent" --catalog catalog.json --strict
cfa policy check --signature signature.json --policy-bundle policies/prod.yaml
cfa policy check --signature sig.json --catalog cat.json --strict --audit-log audit.jsonl
# Validation (CI-ready with JSON output and exit codes)
cfa catalog validate catalog.json --require-datasets --format json
cfa signature validate signature.json --format json
cfa policy validate policies/prod.yaml --format json
# Audit & verification
cfa audit show --id INTENT_ID --file audit.jsonl --format json
cfa audit verify --file audit.jsonl
# Policy rules
cfa rules list
cfa rules explain FAULT_CODE
# Storage management
cfa storage stats --db cfa.db --format json
cfa storage cleanup --db cfa.db --retention 90
cfa storage vacuum --db cfa.db
# Lifecycle management
cfa lifecycle evaluate --db cfa.db --window 30
cfa lifecycle list --db cfa.db
# Project health
cfa status --format json
# Bootstrap
cfa init
# Backends
cfa backend list
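The validation commands above are described as CI-ready with exit codes. A minimal sketch of a CI gate built on them follows; it assumes that exit code 0 means the check passed (the usual convention, not confirmed here), and the helper names `validation_passed` and `run_checks` are hypothetical. The file paths are the placeholders used in the listing.

```python
import subprocess


def validation_passed(args, runner=subprocess.run) -> bool:
    """Run one validation command; treat exit code 0 as success (assumption)."""
    completed = runner(args, capture_output=True, text=True)
    return completed.returncode == 0


# Commands copied from the CLI listing; paths are placeholders.
CHECKS = [
    ["cfa", "catalog", "validate", "catalog.json", "--format", "json"],
    ["cfa", "signature", "validate", "signature.json", "--format", "json"],
    ["cfa", "policy", "validate", "policies/prod.yaml", "--format", "json"],
]


def run_checks(checks=CHECKS, runner=subprocess.run) -> list:
    """Return the commands that failed, so a CI job can fail on a non-empty list."""
    return [" ".join(c) for c in checks if not validation_passed(c, runner=runner)]
```

A CI step could call `run_checks()` and exit non-zero when the returned list is not empty. The `runner` parameter exists only so the logic can be exercised without the `cfa` executable installed.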
From Python
from cfa.testing import evaluate, assert_passed

result = evaluate(
    "Join NFe with Clientes and persist to Silver",
    catalog=MY_CATALOG,
    policy_rules=my_rules,
    backend="pyspark",
)
assert_passed(result)
Policy check with audit
from cfa.policy.engine import PolicyEngine
from cfa.types import StateSignature
signature = StateSignature.from_dict(signature_dict)
engine = PolicyEngine(policy_bundle_version="prod-v1.0")
result = engine.evaluate(signature)
# result.action → approve / replan / block
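The three outcomes suggest a small dispatch on the caller's side. The sketch below is self-contained and does not import CFA: `Decision` is a hypothetical stand-in for the engine result, of which only the `action` values come from the snippet above. The `remediation` field and the retry budget are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """Hypothetical stand-in for the engine result (not CFA's actual type)."""
    action: str  # "approve", "replan", or "block"
    remediation: list = field(default_factory=list)  # assumed field


def next_step(decision: Decision, attempts: int, max_replans: int = 2) -> str:
    """Map a decision to what the caller does next."""
    if decision.action == "approve":
        return "execute"
    if decision.action == "replan" and attempts < max_replans:
        # Apply the suggested remediation and submit the request again.
        return "retry_with_remediation"
    return "stop"  # block, or the replan budget is exhausted


print(next_step(Decision("replan", ["Apply sha256 on PII columns"]), attempts=0))
# retry_with_remediation
```

Capping the number of replans keeps a request that can never satisfy policy from looping indefinitely.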
Runtime gate
from cfa.runtime import RuntimeGate, GateConfig

gate = RuntimeGate(
    config=GateConfig(policy_bundle="prod_v1.0", sandbox="mock"),
    catalog=PROD_CATALOG,
)

@gate.guard("aggregate sales with PII protected")
def my_pipeline():
    ...
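For readers unfamiliar with objects that work both as a decorator and as a context manager, the pattern can be sketched with the standard library's `contextlib.ContextDecorator`. This is a generic illustration, not how RuntimeGate is implemented: `SketchGate`, `GateBlocked`, and the toy check function are hypothetical names.

```python
from contextlib import ContextDecorator


class GateBlocked(RuntimeError):
    """Raised when the (hypothetical) governance check refuses the intent."""


class SketchGate(ContextDecorator):
    """Illustrative gate: runs a check before the guarded body executes."""

    def __init__(self, intent: str, check):
        self.intent = intent
        self.check = check  # callable: intent -> bool (True means allowed)

    def __enter__(self):
        if not self.check(self.intent):
            raise GateBlocked(f"blocked: {self.intent}")
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions from the guarded body


def allow_unless_raw_pii(intent: str) -> bool:
    """Toy policy: refuse any intent that mentions raw PII."""
    return "raw pii" not in intent.lower()


@SketchGate("aggregate sales with PII protected", allow_unless_raw_pii)
def my_pipeline():
    return "ran"


print(my_pipeline())  # ran

try:
    with SketchGate("export raw PII", allow_unless_raw_pii):
        print("unreachable")
except GateBlocked as err:
    print(err)  # blocked: export raw PII
```

In both forms the check runs before any guarded code, so a blocked intent never reaches the pipeline body.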
SQLite storage
from cfa.storage import SqliteStorage
store = SqliteStorage("cfa.db")
store.ensure_schema()
# Audit
store.audit_append(event)
# Execution records (lifecycle)
store.execution_append(record_dict)
# Lifecycle skills
store.skill_upsert("hash_a", skill_data)
Policy Bundles
Declarative YAML policy rules keep governance separate from code:
# policies/prod-v1.yaml
policy_bundle:
  version: "prod-v1.0"
  rules:
    - name: forbid_raw_pii
      condition: pii_in_protected_layer
      action: block
      fault_code: GOVERNANCE_RAW_PII
      severity: critical
      message: "PII in protected layer without anonymization."
      remediation:
        - "Apply sha256 on PII columns before the operation"
Bundles are validated at load time: unknown conditions, duplicate fault codes, and invalid enums are caught immediately.
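The three load-time checks named above can be sketched over already-parsed rules. This is an illustration under stated assumptions, not CFA's validator: the allowed sets below are guesses (only `pii_in_protected_layer`, `block`, and `critical` appear in the example bundle), and `validate_rules` is a hypothetical name.

```python
# Assumed vocabularies; the real sets are defined by CFA, not by this sketch.
KNOWN_CONDITIONS = {"pii_in_protected_layer"}
VALID_ACTIONS = {"approve", "replan", "block"}
VALID_SEVERITIES = {"low", "medium", "high", "critical"}


def validate_rules(rules: list) -> list:
    """Return human-readable problems; an empty list means the rules look valid."""
    errors = []
    seen_codes = set()
    for index, rule in enumerate(rules):
        label = rule.get("name", f"rule #{index}")
        if rule.get("condition") not in KNOWN_CONDITIONS:
            errors.append(f"{label}: unknown condition {rule.get('condition')!r}")
        if rule.get("action") not in VALID_ACTIONS:
            errors.append(f"{label}: invalid action {rule.get('action')!r}")
        if rule.get("severity") not in VALID_SEVERITIES:
            errors.append(f"{label}: invalid severity {rule.get('severity')!r}")
        code = rule.get("fault_code")
        if code in seen_codes:
            errors.append(f"{label}: duplicate fault code {code!r}")
        seen_codes.add(code)
    return errors


good = {"name": "forbid_raw_pii", "condition": "pii_in_protected_layer",
        "action": "block", "fault_code": "GOVERNANCE_RAW_PII", "severity": "critical"}
bad = {**good, "name": "copy", "action": "deny"}  # same fault code, bad action

print(validate_rules([good]))       # []
print(validate_rules([good, bad]))
```

Collecting every problem instead of stopping at the first one lets a bundle author fix a file in one pass.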
Config File
# cfa.yaml (auto-discovered by all commands)
version: "1.0"
storage:
  backend: sqlite
  path: cfa.db
  retention_days: 90
defaults:
  catalog: .cfa/catalog.json
  policy_bundle: .cfa/policies/prod-v1.yaml
  backend: pyspark
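Auto-discovery of this kind is usually a walk from the working directory up through its parents until the file is found. A minimal sketch of that walk follows; `find_config` is a hypothetical helper, and CFA's own lookup may differ in details such as stop conditions or alternate file names.

```python
from pathlib import Path
from typing import Optional


def find_config(start: Path, name: str = "cfa.yaml") -> Optional[Path]:
    """Return the nearest `name` in `start` or any of its parents, else None."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


# Search from the current working directory, as a CLI command would.
print(find_config(Path.cwd()))
```

The nearest file wins, so a subproject can carry its own `cfa.yaml` that takes precedence over one at the repository root.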
Backends
Three governed code generation backends, all pluggable via BackendRegistry:
| Backend | Language | Features |
|---|---|---|
| pyspark | PySpark + Delta Lake | Merge, partition overwrite, PII anonymization |
| sql | ANSI SQL | MERGE INTO, INSERT OVERWRITE, partition clauses |
| dbt | dbt models + schema.yml | Config blocks, refs, not_null/unique tests, PII annotations |
Each backend declares its own forbidden tokens for static validation.
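A forbidden-token check of this kind can be sketched as a scan of the generated code against a per-backend list. The token lists and the `find_forbidden` helper below are hypothetical; the actual lists live in each backend and are not reproduced here.

```python
import re

# Hypothetical token lists for illustration; real backends declare their own.
FORBIDDEN_TOKENS = {
    "sql": ["DROP TABLE", "TRUNCATE", "GRANT"],
    "pyspark": ["os.system", "subprocess", "eval("],
}


def find_forbidden(code: str, backend: str) -> list:
    """Return the forbidden tokens of `backend` that occur in `code` (case-insensitive)."""
    hits = []
    for token in FORBIDDEN_TOKENS[backend]:
        # Allow any run of whitespace between words so "DROP   TABLE" is still caught.
        pattern = r"\s+".join(re.escape(part) for part in token.split())
        if re.search(pattern, code, flags=re.IGNORECASE):
            hits.append(token)
    return hits


generated = "MERGE INTO silver.sales t USING staged s ON t.id = s.id; drop  table tmp;"
print(find_forbidden(generated, "sql"))  # ['DROP TABLE']
```

Plain substring matching is deliberately naive: it would also flag a column named `granted`, so a production check would more likely tokenize or parse the generated code first.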
MCP Server
Expose CFA governance to any AI agent via Model Context Protocol:
{
  "mcpServers": {
    "cfa": {
      "command": "python",
      "args": ["-m", "cfa.mcp"]
    }
  }
}
The server exposes five tools: cfa_evaluate_signature, cfa_describe_rules, cfa_explain_fault, cfa_audit_check, cfa_list_backends.
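MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, sent over stdio as one JSON message per line after the protocol's initialization handshake. The sketch below only builds such a message; the empty argument object passed to `cfa_list_backends` is an assumption, and `tools_call_request` is a hypothetical helper.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Argument shapes are assumptions; each tool's input schema is reported by the
# MCP `tools/list` method and should be checked before relying on this.
line = tools_call_request(1, "cfa_list_backends", {})
print(line)
```

In practice an MCP client library handles the handshake and framing, so hand-built messages like this are mainly useful for debugging the server.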
Repository
src/cfa/
├── core/           Kernel, Planner, CodeGen, Conditions, Phases
├── policy/         PolicyEngine, PolicyBundle, Catalog validation
├── governance/     Standalone governance API (no LLM, no execution required)
├── validation/     Static, Runtime, Signature validation
├── resolution/     Intent → StateSignature resolver (LLM or rule-based backend)
├── normalizer/     Rule-based normalizer, LLM normalizer
├── behavior/       BehaviorSpec + Systematizer (human intent → policy rules)
├── audit/          AuditTrail, Context, Hashing
├── observability/  Metrics, OTel, Notify, Indices, Promotion
├── lifecycle/      IFo/IFs/IFg/IDI indices + Promotion/Demotion engine
├── execution/      Partial execution, State projection
├── adapters/       LangGraph, OpenAI, CrewAI, AutoGen, DSPy
├── backends/       PySpark, SQL, dbt (pluggable)
├── sandbox/        Pluggable sandbox backend + registry + executor
├── cli/            CLI commands by family (core/, governance/, reporting/, project/, infrastructure/)
├── storage/        SQLite + JSONL backends (stats, cleanup, vacuum)
├── mcp/            MCP server (JSON-RPC over stdio)
├── reporting/      HTML reports
├── runtime/        Production governance gate
├── testing/        pytest-native evaluate() + fixtures
├── config.py       CFA config (discovery, defaults)
├── types.py        StateSignature, Fault, KernelResult
└── _lazy.py        Reusable lazy loader for package __init__
Docs
All documentation is at marquesantero.github.io/cfa.
Demos
Two complete notebooks, tested on Databricks with CFA v1.0.0, 0 errors:
| File | Format | Description |
|---|---|---|
| demos/cfa_demo_complete | .dbc / .py | Rule-based governance: APPROVE, REPLAN, BLOCK, codegen, audit, storage |
| demos/cfa_llm_demo_complete | .dbc / .py | LLM-powered: semantic normalizer, systematizer, strict mode, compare |
Import the .dbc into Databricks or run the .py files anywhere.
Contributing
See CONTRIBUTING.md for development setup, test conventions, and the PR checklist. By participating, you agree to the Code of Conduct. Security issues: see SECURITY.md.
License