AI Action Firewall — seven-stage Decision Intelligence Core for safe agentic AI

AGI Pragma

AI Action Firewall — Safe execution layer for AI agents

AGI Pragma blocks dangerous actions before an AI agent can execute them.

Quick Start

1 — Install

pip install agi-pragma

2 — Python SDK

from agi_pragma import DICGovernor, FileAction, FileOp

gov = DICGovernor()

# WRITE — approved (RPN 504, below threshold 2400)
decision = gov.evaluate(FileAction(
    op=FileOp.WRITE, path="plan.md",
    content="project notes", reason="save draft"
))
print(decision.approved, decision.max_rpn)       # True  504

# DELETE — blocked (RPN 3150, exceeds threshold 2400)
decision = gov.evaluate(FileAction(
    op=FileOp.DELETE, path="users.csv", reason="clean up"
))
print(decision.approved, decision.block_reason)  # False  RPN 3150 ≥ threshold 2400

# Full audit trace — every stage logged
for stage in decision.stage_log:
    print(stage["stage"], stage)

3 — REST API

pip install "agi-pragma[api]"
uvicorn demos.dic_api.main:app --reload

curl -s -X POST http://localhost:8000/evaluate \
  -H "Content-Type: application/json" \
  -d '{"op": "delete", "path": "users.csv", "reason": "clean up"}' \
  | python3 -m json.tool

{
  "approved": false,
  "block_reason": "RPN 3150 ≥ threshold 2400",
  "max_rpn": 3150,
  "utility": -7.75
}
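The same request can be sent from Python using only the standard library. This is a minimal sketch that assumes the demo server started above is listening on localhost:8000. The helper names `build_evaluate_request` and `evaluate_remote` are illustrative and are not part of agi-pragma.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/evaluate"  # endpoint taken from the curl example


def build_evaluate_request(op: str, path: str, reason: str,
                           url: str = API_URL) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    body = json.dumps({"op": op, "path": path, "reason": reason}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate_remote(op: str, path: str, reason: str, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON decision returned by the server."""
    request = build_evaluate_request(op, path, reason)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server running:
#   decision = evaluate_remote("delete", "users.csv", "clean up")
#   print(decision["approved"], decision.get("block_reason"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a live server.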

4 — LangGraph

pip install "agi-pragma[langgraph]"

from langgraph.graph import StateGraph
from agi_pragma.integrations.langgraph import DICGuardNode, dic_conditional_edge

guard = DICGuardNode()   # one shared governor across the whole graph

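# AgentState, agent_node, and tool_node are your application's own state schema and node callables.
graph = StateGraph(AgentState)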
graph.add_node("agent",     agent_node)
graph.add_node("dic_guard", guard)
graph.add_node("tools",     tool_node)

graph.set_entry_point("agent")
graph.add_edge("agent", "dic_guard")

# approved → execute tools   blocked → agent re-plans
graph.add_conditional_edges(
    "dic_guard",
    dic_conditional_edge,
    {"approved": "tools", "blocked": "agent"},
)

See docs/integrations/langgraph.md.

5 — AutoGen

pip install "agi-pragma[autogen]"

from autogen_core.tools import FunctionTool
from autogen_agentchat.agents import AssistantAgent
from agi_pragma.integrations.autogen import dic_wrap_tools

# Wrap existing tools; DIC evaluates every call before execution.
# write_file, delete_file, read_file, and model_client are defined by your application.
safe_tools = dic_wrap_tools([
    FunctionTool(write_file,  description="Write a file"),
    FunctionTool(delete_file, description="Delete a file"),
    FunctionTool(read_file,   description="Read a file"),
])

agent = AssistantAgent(
    name="file_agent",
    model_client=model_client,
    tools=safe_tools,          # drop-in replacement
)

See docs/integrations/autogen.md.

6 — LlamaIndex

pip install "agi-pragma[llamaindex]"

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from agi_pragma.integrations.llamaindex import dic_wrap_tools

# write_file, delete_file, read_file, and llm are defined by your application.
safe_tools = dic_wrap_tools([
    FunctionTool.from_defaults(fn=write_file,  name="write_file",  description="Write a file"),
    FunctionTool.from_defaults(fn=delete_file, name="delete_file", description="Delete a file"),
    FunctionTool.from_defaults(fn=read_file,   name="read_file",   description="Read a file"),
])

agent = ReActAgent(tools=safe_tools, llm=llm)   # drop-in replacement

See docs/integrations/llamaindex.md.

Overview

AGI Pragma is an AI Action Firewall: a structured pre-execution governance layer that sits between an AI agent and the real world, evaluating every proposed action for risk before it is allowed to execute.

It does not attempt to replicate human cognition, consciousness, or emotions.
Instead, it enforces systematic risk evaluation at the point of action: filtering proposals, scoring failure modes, and blocking irreversible operations before they cause harm.

The result is an AI agent that cannot delete a database table it shouldn't delete, overwrite a file it shouldn't overwrite, or execute a command it shouldn't execute. This is not because it was prompted to behave, but because a hard enforcement layer blocked it.

What AGI Pragma Is / Is Not

AGI Pragma IS

an AI Action Firewall — hard pre-execution enforcement for agentic AI systems
a Decision Intelligence Core (DIC) built around explicit, auditable decision gates
a research artifact with reproducible benchmarks and full audit traces per decision
a foundation for safety-oriented autonomous systems and LLM agent governance

AGI Pragma IS NOT

a human-like AGI
a black-box learning system
a reward-maximization benchmark
a production-ready general intelligence

Core Architecture — Decision Intelligence Core (DIC)

Each decision follows a fixed and auditable pipeline:

1. Branching — enumerate feasible actions, eliminate invalid ones.

2. Critical Path Estimation — Monte Carlo rollouts estimate:

probability of catastrophic failure,
probability of entering irreversible traps,
expected steps until failure.

3. Risk Assessment (FMEA) — each action scored by:

Severity (S) × Occurrence (O) × Detection difficulty (D) = RPN

4. Decision Integrity Gate — actions exceeding risk threshold are blocked before execution.

5. Circuit Breaker — autonomy dynamically constrained:

OK → WARN → SLOW → STOP → ESCALATE

6. Decision Selection — utility balances survival probability, goal progress, residual risk.

7. Belief Update — Bayesian trackers update internal hazard estimates.
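Stages 3 and 4 can be illustrated with a short sketch. It assumes only what is stated in this README: RPN is the product of the three scores, and an action is blocked when its highest RPN reaches the threshold (2400 in the Quick Start output). The `FailureMode` class, the `gate` function, and the example scores are hypothetical; they are not agi-pragma's API or its actual scoring tables.

```python
from dataclasses import dataclass
from typing import Optional

RPN_THRESHOLD = 2400  # threshold value shown in the Quick Start output


@dataclass(frozen=True)
class FailureMode:
    """One way a proposed action can go wrong (hypothetical structure)."""
    name: str
    severity: int     # S: how bad the outcome would be
    occurrence: int   # O: how likely the failure is
    detection: int    # D: how hard the failure is to detect in time

    @property
    def rpn(self) -> int:
        """Risk Priority Number: S x O x D."""
        return self.severity * self.occurrence * self.detection


def gate(modes: list[FailureMode],
         threshold: int = RPN_THRESHOLD) -> tuple[bool, int, Optional[str]]:
    """Return (approved, max_rpn, block_reason) for one proposed action."""
    max_rpn = max((m.rpn for m in modes), default=0)
    if max_rpn >= threshold:
        return False, max_rpn, f"RPN {max_rpn} ≥ threshold {threshold}"
    return True, max_rpn, None


# Illustrative scores only; they are not taken from the library.
write_modes = [FailureMode("overwrite draft", severity=6, occurrence=8, detection=10)]
delete_modes = [FailureMode("lose user data", severity=15, occurrence=12, detection=15)]

print(gate(write_modes))    # (True, 480, None)
print(gate(delete_modes))   # (False, 2700, 'RPN 2700 ≥ threshold 2400')
```

Gating on the maximum RPN across failure modes means a single severe failure mode is enough to block the action, regardless of how benign the others are.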

Circuit Breaker States

The circuit breaker escalates session-wide across all tool calls, not per-action.

State	Trigger	Effect
OK	RPN < 1800	Action approved normally
WARN	RPN ≥ 1800	Approved with warning logged; 3 consecutive WARNs → SLOW
SLOW	RPN ≥ 2200	Approved with reduced autonomy; 2 consecutive SLOWs → STOP
STOP	RPN ≥ 2600	Action blocked; `approved=False`
ESCALATE	3 consecutive STOPs	All candidates unsafe — `approved=False`, `block_reason="ESCALATE: all actions exceed risk threshold, human confirmation required"`

ESCALATE is the signal that the agent is stuck: every action it proposes is unsafe. The system stops and waits for human confirmation rather than selecting the least-bad option. The counter resets after ESCALATE fires.

from agi_pragma import DICGovernor, FileAction, FileOp

gov = DICGovernor()

for path in ["users.csv", "backups.zip", "prod.db"]:   # three consecutive DELETEs
    d = gov.evaluate(FileAction(op=FileOp.DELETE, path=path, reason="cleanup"))
    print(d.circuit_breaker.state.value, d.block_reason)

# stop     STOP: RPN 4410 ≥ 2600
# stop     STOP: RPN 4410 ≥ 2600
# escalate ESCALATE: all actions exceed risk threshold, human confirmation required

print(gov.escalation_count)   # 1

Run the built-in ESCALATE demo:

python3 -m demos.dic_llm.run --mock --scenario escalate
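The escalation rules in the table can be read as a small state machine. The sketch below is one possible reading, not the library's implementation: it assumes a promotion applies to the call that completes the streak, and that promoted states do not themselves count toward later streaks. The `State` and `CircuitBreaker` names here are hypothetical stand-ins.

```python
from enum import Enum


class State(Enum):
    OK = "ok"
    WARN = "warn"
    SLOW = "slow"
    STOP = "stop"
    ESCALATE = "escalate"


class CircuitBreaker:
    """Session-wide breaker: one instance is shared by every tool call."""

    def __init__(self) -> None:
        self.streak_state = State.OK   # threshold state of the current run of calls
        self.streak = 0                # number of consecutive calls in that state
        self.escalation_count = 0

    @staticmethod
    def _base_state(rpn: int) -> State:
        """Map an RPN to a state using the thresholds from the table."""
        if rpn >= 2600:
            return State.STOP
        if rpn >= 2200:
            return State.SLOW
        if rpn >= 1800:
            return State.WARN
        return State.OK

    def record(self, rpn: int) -> State:
        """Register one evaluated action and return the effective state."""
        base = self._base_state(rpn)
        if base is self.streak_state:
            self.streak += 1
        else:
            self.streak_state, self.streak = base, 1

        if base is State.WARN and self.streak >= 3:
            return State.SLOW                      # 3 consecutive WARNs
        if base is State.SLOW and self.streak >= 2:
            return State.STOP                      # 2 consecutive SLOWs
        if base is State.STOP and self.streak >= 3:
            self.escalation_count += 1             # 3 consecutive STOPs
            self.streak_state, self.streak = State.OK, 0  # counter resets after ESCALATE
            return State.ESCALATE
        return base


breaker = CircuitBreaker()
print([breaker.record(4410).value for _ in range(3)])  # ['stop', 'stop', 'escalate']
print(breaker.escalation_count)                        # 1
```

Because the breaker object outlives individual calls, a run of risky proposals tightens autonomy even when each proposal targets a different file.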

Safety Model

Safety in AGI Pragma is preventive, not reactive.

self-harm equals failure,
no action bypasses risk evaluation,
all decisions are auditable,
autonomy is conditional, not absolute.

See: docs/safety.md

Benchmark Results — Snake

Agent: PragmaSnakeAgent
Environment: SnakeEnv 10×10

v1.0 — 50 episodes (2026-04-05)

Metric	Value
Average score	22.8
Min / Max score	9 / 33
Average reward	102.4
Average steps	201
Survived to step limit	2/50
Scores ≥ 25	21/50 (42%)
Scores < 15	4/50 (8%)

v0.1 — 10 episodes (2026-04-04) — initial run

Metric	Value
Average score	25.0
Min / Max score	18 / 33
Average reward	113.1
Average steps	214
Survived to step limit	0/10

Note: v0.1 used only 10 seeds, so its higher average likely reflects the small sample. The 50-seed v1.0 run gives a more reliable picture of agent behavior.

Key finding — passive vs active agent

Config	Avg score	Avg reward
dist weight = 0.2 (passive)	0.4	~0
dist weight = 1.5 (active)	22.8	102.4

One parameter change produced a 57× improvement in score.
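A toy example shows how a single weight can flip the selected action. It assumes the utility is a weighted sum of survival probability, goal progress, and residual risk, as the Decision Selection stage describes; the exact formula, the `Candidate` class, and all numbers below are hypothetical and are not the benchmark's values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    p_survive: float   # estimated survival probability
    progress: float    # goal progress, e.g. reduction in distance to food
    risk: float        # residual risk, normalised to [0, 1]


def utility(c: Candidate, dist_weight: float,
            survive_weight: float = 1.0, risk_weight: float = 1.0) -> float:
    """Assumed linear utility: reward survival and progress, penalise risk."""
    return survive_weight * c.p_survive + dist_weight * c.progress - risk_weight * c.risk


def select(candidates: list[Candidate], dist_weight: float) -> Candidate:
    """Pick the candidate with the highest utility."""
    return max(candidates, key=lambda c: utility(c, dist_weight))


candidates = [
    Candidate("approach_food", p_survive=0.90, progress=1.0, risk=0.30),
    Candidate("hold_position", p_survive=0.99, progress=0.0, risk=0.05),
]

print(select(candidates, dist_weight=0.2).name)  # hold_position
print(select(candidates, dist_weight=1.5).name)  # approach_food
```

With a low progress weight the safer but unproductive action always wins, which matches the passive behaviour reported in the table.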

Interpretation

42% of episodes scored 25 or above.
Only 8% of episodes scored below 15 — rare failures, not systemic.
The agent accepts risk to pursue goals and dies actively, not passively.

This confirms the core AGI Pragma trade-off:
safety ≠ passivity. Controlled risk is required for goal achievement.

To run the benchmark (50 episodes, results written to artifacts/snake/):

python3 -m benchmarks.snake.run

See: docs/benchmarks/snake.md

Benchmark Results — Maze

Agent: PragmaMazeAgent
Environment: MazeEnv 15×15 (recursive backtracker generation)

v2.0 — 50 episodes (2026-04-05)

Metric	Value
Solved	50 / 50 (100%)
Steps — avg / min / max	46.1 / 24 / 76
Score (steps remaining) — avg / min / max	253.9 / 224 / 276

Key finding — BFS distance vs manhattan distance

Utility signal	Solved	Avg steps
Manhattan distance (v1.1)	4/50 (8%)	277.9
BFS path distance (v2.0)	50/50 (100%)	46.1

One signal change raised the solve rate 12.5× (from 8% to 100%) and cut average steps about 6× (from 277.9 to 46.1).

Interpretation

Manhattan distance is unreliable in mazes where walls force long detours. Replacing it with exact BFS path distance — precomputed once per maze, O(1) per lookup — gave the utility function accurate topological information and immediately solved all episodes.
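The precomputation can be sketched as a single breadth-first search outward from the goal. This is a minimal illustration on a hand-written 3×3 grid, not the benchmark's MazeEnv; the grid encoding ('#' for walls) and the function names are assumptions.

```python
from collections import deque

Cell = tuple[int, int]


def bfs_distances(grid: list[str], goal: Cell) -> dict[Cell, int]:
    """Shortest path length from every reachable open cell to the goal.

    Rows are strings in which '#' is a wall and any other character is open.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


maze = [
    "S#G",
    ".#.",
    "...",
]
start, goal = (0, 0), (0, 2)
dist = bfs_distances(maze, goal)       # computed once per maze
print(manhattan(start, goal), dist[start])  # 2 6
```

The wall column makes the straight-line estimate (2) three times too optimistic, while the dictionary lookup returns the true path length (6) in constant time.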

The FMEA and circuit breaker operated correctly throughout; the failure in v1.x was a utility signal problem, not a safety pipeline problem.

To run the benchmark (50 episodes, results written to artifacts/maze/):

python3 -m benchmarks.maze.run

See: docs/benchmarks/maze.md

Benchmark Results — Dynamic Threat Gridworld

Agent: PragmaGridworldAgent
Environment: GridworldEnv 15×15, 5 wandering hazards

v1.0 — 50 episodes (2026-04-06)

Metric	Value
Solved	39 / 50 (78%)
Killed by hazard	11 / 50 (22%)
Timed out	0 / 50
Steps — avg / min / max	22.8 / 9 / 24
Score when solved (steps remaining)	276

Key finding — p_death signal is load-bearing

Unlike Snake and Maze where the Monte Carlo risk signal was saturated or secondary, the gridworld is the first benchmark where p_death varies meaningfully across candidate actions at each step. Moving toward a hazard cluster scores higher p_death than WAIT or evasion — the FMEA and Critical Path stages are actively driving decisions, not just gating them.
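A per-action p_death estimate of this kind can be sketched with simple Monte Carlo rollouts. The model below is an assumption made for illustration: hazards take a uniformly random move each step, and the agent waits after its first move. It is not GridworldEnv or the library's Critical Path stage, and every name in it is hypothetical.

```python
import random

Cell = tuple[int, int]
MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1), "WAIT": (0, 0)}
MOVE_NAMES = list(MOVES)


def step(cell: Cell, move: str, size: int) -> Cell:
    """Apply a move and keep the result inside the size x size grid."""
    dr, dc = MOVES[move]
    row = min(max(cell[0] + dr, 0), size - 1)
    col = min(max(cell[1] + dc, 0), size - 1)
    return (row, col)


def estimate_p_death(agent: Cell, hazards: list[Cell], action: str, size: int = 15,
                     horizon: int = 5, rollouts: int = 500, seed: int = 0) -> float:
    """Fraction of rollouts in which a hazard occupies the agent's cell within `horizon` steps."""
    rng = random.Random(seed)
    deaths = 0
    for _rollout in range(rollouts):
        pos = step(agent, action, size)        # the candidate action is taken first
        current = list(hazards)
        dead = pos in current                  # stepping onto a hazard is fatal
        for _tick in range(horizon):
            if dead:
                break
            # Each hazard wanders one random step; the agent then waits in place.
            current = [step(h, rng.choice(MOVE_NAMES), size) for h in current]
            dead = pos in current
        deaths += dead
    return deaths / rollouts


agent, hazards = (7, 7), [(7, 9)]
for action in MOVE_NAMES:
    print(action, estimate_p_death(agent, hazards, action))
```

Because the estimate is computed separately for each candidate action, it can rank moves against each other instead of only vetoing those above a threshold.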

The circuit breaker operates in WARN/SLOW range throughout (RPN 180–200), constraining autonomy proportionally without collapsing into full conservatism.

Interpretation

The 22% failure rate reflects genuine stochastic risk: some hazard configurations cross the direct path regardless of decision quality. Zero timeouts across the 50 episodes indicate that the agent consistently made forward progress.

Safety ≠ passivity holds across all three benchmarks: the agent accepts risk to pursue the goal and the safety pipeline constrains, not blocks, autonomous action.

To run the benchmark (50 episodes, results written to artifacts/gridworld/):

python3 -m benchmarks.gridworld.run

See: docs/benchmarks/gridworld.md

Methodology

See: docs/Methodology.md

Reproducibility

Each benchmark run produces:

decision-level logs (JSONL)
episode summaries (JSON)
reproducible configurations
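Decision-level JSONL logs can be inspected with a few lines of standard-library Python. The sketch below makes assumptions: the log file names and record fields are not documented here, so the `approved` key and the temporary file are placeholders used only to make the example self-contained.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path
from typing import Iterable, Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one decoded record per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by(records: Iterable[dict], key: str) -> Counter:
    """Count records by the value stored under `key` (missing keys count as None)."""
    return Counter(record.get(key) for record in records)


# Self-contained demo on a throwaway file; the "approved" field is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "decisions.jsonl"
    log_path.write_text(
        '{"approved": true}\n\n{"approved": false}\n{"approved": true}\n',
        encoding="utf-8",
    )
    counts = count_by(read_jsonl(log_path), "approved")

print(counts[True], counts[False])  # 2 1
```

To use it on a real run, point `read_jsonl` at a log file under the benchmark's artifacts directory and substitute a key that actually appears in the records.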

Related Projects

ChaosGym / Reverse Reality Sandbox — physics-breaking simulation environment designed to stress-test AGI Pragma's decision integrity under non-stationary rules.

AGI-Development — iterative development history and experimental branches of the AGI Pragma framework
developmental-agi-sandbox — Unity-based reverse-physics sandbox environment for testing AGI Pragma under non-stationary world rules

Licensing & Commercial Use

Author: Rafał Żabiński

Free use: academic research, education, non-commercial projects, open-source experimentation.

Commercial use: requires a separate written agreement with the author.

zabinskirafal@outlook.com
https://www.linkedin.com/in/zabinskirafal

Project Status

Current version: v3.0.0

AGI Pragma is an active research program, not a finished product.

Future work includes additional benchmarks, stronger baselines, and formal evaluation protocols.

Citation

If you use this work in research, please cite via: CITATION.cff

Rafał Żabiński — Founder and original author (January 2026)
