CFA — Governed execution for AI agents and data systems

CFA v1.0.0

Governed execution for AI agents and data systems.

Instead of asking "which agent or skill should act?", CFA asks "which state transition is being requested, under which constraints, and can it be executed safely?" and produces a cryptographically verifiable decision.

Status: alpha (0.1.x). APIs may shift between minor versions. Not yet recommended for unsupervised production use.

Quick Start

pip install cfa-kernel
# or: pip install git+https://github.com/marquesantero/cfa.git
cfa init
cfa evaluate "Join NFe with Clientes and persist to Silver" --catalog .cfa/catalog.json

What CFA does

Step	What happens
Formalize	Natural language or JSON → typed `StateSignature` contract
Govern	Policy Engine evaluates PII, cost, schema, partition constraints
Generate	Execution planner + deterministic code generation (PySpark, SQL, dbt)
Execute	Pluggable sandbox with metrics collection + runtime validation
Validate	State projection, SHA-256 audit trail, lifecycle indices

Surfaces

All interfaces are backend-agnostic: CFA evaluates a StateSignature contract regardless of how it was produced.

Surface	For	Example
`cfa` CLI	Everyone	`cfa policy check --signature sig.json`
`cfa catalog` CLI	Data platform teams	`cfa catalog validate catalog.json`
`cfa policy` CLI	Security/compliance	`cfa policy validate policies/prod.yaml`
`cfa storage` CLI	Operations	`cfa storage stats --db cfa.db`
`cfa lifecycle` CLI	Platform teams	`cfa lifecycle evaluate --db cfa.db`
`cfa signature` CLI	External systems	`cfa signature validate request.json`
`cfa.testing`	CI/CD	`evaluate("intent", catalog=catalog)` with pytest
`cfa.runtime`	Production	`RuntimeGate` as decorator/context-manager
`cfa.mcp`	AI agents	MCP server for any MCP-compatible client
`cfa.adapters`	AI frameworks	LangGraph, OpenAI Agents, CrewAI, AutoGen, DSPy

Architecture

CLI / MCP / Adapter / API
        │
        ▼
   ┌─ Formalize ──┐   NL / JSON / Tool call → typed StateSignature contract
   ├─ Govern ──────┤   Policy check + REPLAN cycle (approve / replan / block)
   ├─ Generate ────┤   Plan + code (PySpark / SQL / dbt) + static validation
   ├─ Execute ─────┤   Pluggable sandbox + runtime validation
   └─ Validate ────┘   State projection + SHA-256 audit + lifecycle indices
                           │
                           ▼
            Decision JSON / Audit Trail / OTel / Prometheus
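
The approve / replan / block outcome of the Govern phase can be pictured as a bounded loop. The sketch below is not CFA's implementation: `Decision`, `check_policy`, `apply_remediation`, `govern`, and the single PII rule are hypothetical stand-ins, and the request is a plain dict rather than a real StateSignature.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    action: str                                   # "approve", "replan", or "block"
    remediation: list = field(default_factory=list)


def check_policy(request: dict) -> Decision:
    """Hypothetical rule: PII columns must be anonymized before they are written."""
    if request.get("pii_columns") and not request.get("anonymized"):
        if request.get("can_anonymize", True):
            return Decision("replan", ["anonymize_pii"])
        return Decision("block")
    return Decision("approve")


def apply_remediation(request: dict, steps: list) -> dict:
    """Return a revised copy of the request with each remediation step applied."""
    revised = dict(request)
    if "anonymize_pii" in steps:
        revised["anonymized"] = True
    return revised


def govern(request: dict, max_replans: int = 2):
    """Re-evaluate after each remediation until approve, block, or the budget runs out."""
    for _ in range(max_replans + 1):
        decision = check_policy(request)
        if decision.action != "replan":
            return decision.action, request
        request = apply_remediation(request, decision.remediation)
    return "block", request


action, final_request = govern({"pii_columns": ["cpf"], "anonymized": False})
```

Bounding the number of replans keeps a rule that can never be satisfied from looping forever; in this sketch an exhausted budget is treated as a block.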

Capabilities

Capability	What it gives you
SHA-256 audit trail	Tamper-evident chain of decisions, verifiable offline (`cfa audit verify`)
State projection	Each execution carries the typed state of the prior one — no implicit globals
Lifecycle indices (IFo/IFs/IFg/IDI)	Quantifies how often an intent recurs, stabilizes, and qualifies for promotion to a reusable skill
REPLAN cycle	Failed policy checks emit a structured remediation, not a hard stop
Backend-agnostic codegen	Same signature compiles to PySpark, ANSI SQL, or dbt — pluggable via `BackendRegistry`
Artifact hashing	Catalog, policy bundle, and signature are content-hashed and bound to every decision
MCP protocol	Any MCP-compatible agent can call CFA as a governance tool
SQLite + JSONL storage	First-class persistence with stats, retention cleanup, and vacuum
Config auto-discovery	`cfa.yaml` walked up the tree; all CLI commands respect it
Zero core dependencies	Optional extras for `yaml`, `otel`, `mcp`, `llm` — none required for the kernel
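
To make "tamper-evident chain" concrete, here is a minimal hash-chain sketch using only the standard library. It illustrates the general technique, not CFA's on-disk format: the entry layout, the `GENESIS` value, and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append an entry that is bound to the hash of the entry before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every hash in order; any edited payload breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


chain = []
append(chain, {"intent": "join_nfe_clientes", "action": "approve"})
append(chain, {"intent": "export_raw_pii", "action": "block"})
intact = verify(chain)

chain[0]["payload"]["action"] = "block"   # simulate tampering with a stored decision
tampered_detected = not verify(chain)
```

Because each hash covers the previous one, verification needs nothing but the log itself, which is what makes offline checking possible.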

CLI

# Governance & evaluation
cfa evaluate "intent" --catalog catalog.json --strict
cfa policy check --signature signature.json --policy-bundle policies/prod.yaml
cfa policy check --signature sig.json --catalog cat.json --strict --audit-log audit.jsonl

# Validation (CI-ready with JSON output and exit codes)
cfa catalog validate catalog.json --require-datasets --format json
cfa signature validate signature.json --format json
cfa policy validate policies/prod.yaml --format json

# Audit & verification
cfa audit show --id INTENT_ID --file audit.jsonl --format json
cfa audit verify --file audit.jsonl

# Policy rules
cfa rules list
cfa rules explain FAULT_CODE

# Storage management
cfa storage stats --db cfa.db --format json
cfa storage cleanup --db cfa.db --retention 90
cfa storage vacuum --db cfa.db

# Lifecycle management
cfa lifecycle evaluate --db cfa.db --window 30
cfa lifecycle list --db cfa.db

# Project health
cfa status --format json

# Bootstrap
cfa init

# Backends
cfa backend list

From Python

from cfa.testing import evaluate, assert_passed

# MY_CATALOG and my_rules are placeholders for your own catalog and policy rules.
result = evaluate(
    "Join NFe with Clientes and persist to Silver",
    catalog=MY_CATALOG,
    policy_rules=my_rules,
    backend="pyspark",
)
assert_passed(result)

Policy check with audit

from cfa.policy.engine import PolicyEngine
from cfa.types import StateSignature

# signature_dict is a placeholder for a signature you have already loaded (for example, from JSON).
signature = StateSignature.from_dict(signature_dict)
engine = PolicyEngine(policy_bundle_version="prod-v1.0")
result = engine.evaluate(signature)
# result.action is one of: approve / replan / block

Runtime gate

from cfa.runtime import RuntimeGate, GateConfig

gate = RuntimeGate(
    config=GateConfig(policy_bundle="prod-v1.0", sandbox="mock"),
    catalog=PROD_CATALOG,
)

@gate.guard("aggregate sales with PII protected")
def my_pipeline():
    ...
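
For readers curious how one object can serve as both a decorator and a context manager, the following is a generic sketch of that pattern. `SketchGate`, `GateBlocked`, and the string-matching decision function are hypothetical; the example does not show how `RuntimeGate` is built, and whether `guard()` itself doubles as the context manager in CFA is an assumption.

```python
import functools


class GateBlocked(RuntimeError):
    """Raised when the (hypothetical) policy check refuses the intent."""


class SketchGate:
    def __init__(self, decide):
        self._decide = decide  # callable: intent string -> "approve" or "block"

    def guard(self, intent: str):
        gate = self

        class _Guard:
            def __enter__(self):
                # The check runs every time the guarded block or function is entered.
                if gate._decide(intent) != "approve":
                    raise GateBlocked(intent)
                return self

            def __exit__(self, exc_type, exc, tb):
                return False  # never swallow exceptions from the guarded body

            def __call__(self, func):
                @functools.wraps(func)
                def wrapper(*args, **kwargs):
                    with self:
                        return func(*args, **kwargs)
                return wrapper

        return _Guard()


# Toy decision function standing in for a real policy evaluation.
gate = SketchGate(lambda intent: "block" if "raw pii" in intent.lower() else "approve")


@gate.guard("aggregate sales with PII protected")
def my_pipeline():
    return "ran"


outcome = my_pipeline()

try:
    with gate.guard("export raw PII"):
        blocked = False
except GateBlocked:
    blocked = True
```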

SQLite storage

from cfa.storage import SqliteStorage

store = SqliteStorage("cfa.db")
store.ensure_schema()

# Audit
store.audit_append(event)

# Execution records (lifecycle)
store.execution_append(record_dict)

# Lifecycle skills
store.skill_upsert("hash_a", skill_data)
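
The retention cleanup and vacuum operations mentioned for storage can be illustrated with the standard `sqlite3` module. This is a sketch under assumptions: the `audit` table, its columns, and the `cleanup` helper are invented for the example and do not reflect the schema that `SqliteStorage` actually creates.

```python
import os
import sqlite3
import tempfile
import time

DAY = 86400  # seconds in a day


def cleanup(conn: sqlite3.Connection, retention_days: int, now: float) -> int:
    """Delete audit rows older than the retention window; return how many were removed."""
    cutoff = now - retention_days * DAY
    cur = conn.execute("DELETE FROM audit WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


with tempfile.TemporaryDirectory() as tmp:
    conn = sqlite3.connect(os.path.join(tmp, "sketch.db"))
    conn.execute(
        "CREATE TABLE audit ("
        "id INTEGER PRIMARY KEY, created_at REAL NOT NULL, event TEXT NOT NULL)"
    )
    now = time.time()
    conn.executemany(
        "INSERT INTO audit (created_at, event) VALUES (?, ?)",
        [(now - 120 * DAY, "old decision"), (now - 10 * DAY, "recent decision")],
    )
    conn.commit()

    deleted = cleanup(conn, retention_days=90, now=now)
    conn.execute("VACUUM")  # must run outside a transaction; reclaims freed pages
    remaining = [row[0] for row in conn.execute("SELECT event FROM audit")]
    conn.close()
```

Deleting rows only marks pages as free, so a separate vacuum step is what actually shrinks the database file.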

Policy Bundles

Declarative YAML policy rules — separate governance from code:

# policies/prod-v1.yaml
policy_bundle:
  version: "prod-v1.0"
  rules:
    - name: forbid_raw_pii
      condition: pii_in_protected_layer
      action: block
      fault_code: GOVERNANCE_RAW_PII
      severity: critical
      message: "PII in protected layer without anonymization."
      remediation:
        - "Apply sha256 on PII columns before the operation"

Validated at load time — unknown conditions, duplicate fault codes, and invalid enums are caught immediately.
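
A rough idea of what load-time validation involves, written against a plain dict shaped like the bundle above. The allowed sets are assumptions: only `pii_in_protected_layer`, `block`, and `critical` appear in this README, and `validate_bundle` is a hypothetical helper, not CFA's loader.

```python
# Assumed vocabularies; the real lists live in CFA and may differ.
KNOWN_CONDITIONS = {"pii_in_protected_layer"}
VALID_ACTIONS = {"approve", "replan", "block"}
VALID_SEVERITIES = {"low", "medium", "high", "critical"}


def validate_bundle(bundle: dict) -> list:
    """Return a list of human-readable problems; an empty list means the bundle is valid."""
    errors = []
    seen_codes = set()
    for i, rule in enumerate(bundle.get("rules", [])):
        label = rule.get("name", f"rule[{i}]")
        if rule.get("condition") not in KNOWN_CONDITIONS:
            errors.append(f"{label}: unknown condition {rule.get('condition')!r}")
        if rule.get("action") not in VALID_ACTIONS:
            errors.append(f"{label}: invalid action {rule.get('action')!r}")
        if rule.get("severity") not in VALID_SEVERITIES:
            errors.append(f"{label}: invalid severity {rule.get('severity')!r}")
        code = rule.get("fault_code")
        if code in seen_codes:
            errors.append(f"{label}: duplicate fault_code {code!r}")
        seen_codes.add(code)
    return errors


good_rule = {
    "name": "forbid_raw_pii",
    "condition": "pii_in_protected_layer",
    "action": "block",
    "fault_code": "GOVERNANCE_RAW_PII",
    "severity": "critical",
}
bad_rule = {
    "name": "broken_rule",
    "condition": "no_such_condition",
    "action": "explode",
    "fault_code": "GOVERNANCE_RAW_PII",   # same code as good_rule
    "severity": "critical",
}

ok_errors = validate_bundle({"version": "prod-v1.0", "rules": [good_rule]})
bad_errors = validate_bundle({"version": "prod-v1.0", "rules": [good_rule, bad_rule]})
```

Collecting every problem before reporting, rather than stopping at the first, tends to suit CI use, where one run should surface all issues in a bundle.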

Config File

# cfa.yaml (auto-discovered by all commands)
version: "1.0"
storage:
  backend: sqlite
  path: cfa.db
  retention_days: 90
defaults:
  catalog: .cfa/catalog.json
  policy_bundle: .cfa/policies/prod-v1.yaml
  backend: pyspark
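
Auto-discovery of this kind usually means walking from the working directory towards the filesystem root until the file is found. A minimal sketch with `pathlib` follows; `find_config` is a hypothetical helper, and CFA's actual lookup order may differ.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_config(start: Path, name: str = "cfa.yaml") -> Optional[Path]:
    """Return the nearest `name` in `start` or any of its parents, or None."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


# Demonstration in a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "cfa.yaml").write_text('version: "1.0"\n')
    nested = root / "pipelines" / "silver"
    nested.mkdir(parents=True)

    found = find_config(nested)
    found_from_nested = found == root / "cfa.yaml"
```

The nearest file wins, so a subproject can shadow a repository-wide config simply by placing its own `cfa.yaml` closer to where commands are run.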

Backends

Three governed code generation backends, all pluggable via BackendRegistry:

Backend	Language	Features
`pyspark`	PySpark + Delta Lake	Merge, partition overwrite, PII anonymization
`sql`	ANSI SQL	MERGE INTO, INSERT OVERWRITE, partition clauses
`dbt`	dbt models + schema.yml	Config blocks, refs, not_null/unique tests, PII annotations

Each backend declares its own forbidden tokens for static validation.
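
One way to picture per-backend forbidden tokens is a small registry in which each backend carries its own deny-list. Everything here is illustrative: `SketchBackend`, `SketchRegistry`, and the token lists are invented for the example and say nothing about what `BackendRegistry` or the real backends contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchBackend:
    name: str
    forbidden_tokens: tuple  # tokens that must never appear in generated code


class SketchRegistry:
    def __init__(self):
        self._backends = {}

    def register(self, backend: SketchBackend) -> None:
        self._backends[backend.name] = backend

    def validate(self, name: str, code: str) -> list:
        """Return the forbidden tokens found in `code` (case-insensitive substring match)."""
        backend = self._backends[name]
        lowered = code.lower()
        return [tok for tok in backend.forbidden_tokens if tok.lower() in lowered]


registry = SketchRegistry()
# Hypothetical deny-lists, chosen only to make the example readable.
registry.register(SketchBackend("sql", ("DROP TABLE", "TRUNCATE")))
registry.register(SketchBackend("pyspark", ("os.system", "eval(")))

clean = registry.validate("sql", "MERGE INTO silver.clientes USING staging ON ...")
flagged = registry.validate("sql", "drop table silver.clientes")
```

Substring matching is deliberately naive and would also flag tokens inside string literals or comments; a real static validator would more likely tokenize or parse the generated code.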

MCP Server

Expose CFA governance to any AI agent via Model Context Protocol:

{
  "mcpServers": {
    "cfa": {
      "command": "python",
      "args": ["-m", "cfa.mcp"]
    }
  }
}

Five tools are exposed: cfa_evaluate_signature, cfa_describe_rules, cfa_explain_fault, cfa_audit_check, cfa_list_backends.
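
On the wire, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request, sent over stdio as a single newline-delimited line. The sketch below builds such a message; the `fault_code` argument name is an assumption, since the tools' input schemas are not listed here, and a real client must complete the MCP `initialize` handshake before calling any tool.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as one line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# The argument name below is hypothetical; check the tool's schema via `tools/list`.
line = tools_call_request(1, "cfa_explain_fault", {"fault_code": "GOVERNANCE_RAW_PII"})
decoded = json.loads(line)
```

In practice an MCP client library handles this framing for you; the raw message is shown only to clarify what "JSON-RPC over stdio" means.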

Repository

src/cfa/
├── core/              Kernel, Planner, CodeGen, Conditions, Phases
├── policy/            PolicyEngine, PolicyBundle, Catalog validation
├── governance/        Standalone governance API (no LLM, no execution required)
├── validation/        Static, Runtime, Signature validation
├── resolution/        Intent → StateSignature resolver (LLM or rule-based backend)
├── normalizer/        Rule-based normalizer, LLM normalizer
├── behavior/          BehaviorSpec + Systematizer (human intent → policy rules)
├── audit/             AuditTrail, Context, Hashing
├── observability/     Metrics, OTel, Notify, Indices, Promotion
├── lifecycle/         IFo/IFs/IFg/IDI indices + Promotion/Demotion engine
├── execution/         Partial execution, State projection
├── adapters/          LangGraph, OpenAI, CrewAI, AutoGen, DSPy
├── backends/          PySpark, SQL, dbt (pluggable)
├── sandbox/           Pluggable sandbox backend + registry + executor
├── cli/               CLI commands by family (core/, governance/, reporting/, project/, infrastructure/)
├── storage/           SQLite + JSONL backends (stats, cleanup, vacuum)
├── mcp/               MCP server (JSON-RPC over stdio)
├── reporting/         HTML reports
├── runtime/           Production governance gate
├── testing/           pytest-native evaluate() + fixtures
├── config.py          CFA config (discovery, defaults)
├── types.py           StateSignature, Fault, KernelResult
└── _lazy.py           Reusable lazy loader for package __init__

Docs

All documentation is at marquesantero.github.io/cfa.

Demos

Two complete notebooks, tested on Databricks with CFA v1.0.0, 0 errors:

File	Format	Description
`demos/cfa_demo_complete`	`.dbc` / `.py`	Rule-based governance — APPROVE, REPLAN, BLOCK, codegen, audit, storage
`demos/cfa_llm_demo_complete`	`.dbc` / `.py`	LLM-powered — semantic normalizer, systematizer, strict mode, compare

Import the .dbc into Databricks or run the .py files anywhere.

Contributing

See CONTRIBUTING.md for development setup, test conventions, and the PR checklist. By participating, you agree to the Code of Conduct. Security issues: see SECURITY.md.

License

MIT · Antero Marques
