A runtime and definition-time security guardrail framework for AI agents and developers.

These details have not been verified by PyPI

License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Security

Project description

Agent-Safeguard

A lightweight, enterprise-grade architectural guardrail and sandbox library for Python applications that are developed in collaboration with AI agents.

The Problem

AI coding assistants (such as Antigravity, Cursor, Copilot, and Codex) excel at generating localized code but operate within limited context windows. Consequently, they lack a comprehensive understanding of global architectural boundaries. Autonomous modifications frequently bypass type contracts, introduce illegal imports, violate filesystem/network access policies, or create infinite resource lockups.

The Solution

Agent-Safeguard establishes programmatic guardrails that enforce structural and runtime invariants. By combining definition-time AST checks, dynamic socket/file monkeypatching, RAM/CPU constraints, database access filters, and prompt injection guards, it catches boundary violations as soon as they occur.

Crucially, instead of raising generic stack traces, Agent-Safeguard generates structured JSON reports designed to be ingested by LLM agents, enabling closed-loop, automated self-correction.

Installation

Install using pip:

```
pip install agent-safeguard
```

The import package is named agent_shield, which differs from the distribution name agent-safeguard:

```
from agent_shield import virtual_fs, guard_prompt
```

Core Features & Decorators

1. Advanced Sandboxes & Guardrails

@virtual_fs(in_memory_write=True, allow_real_read=None) Redirects filesystem writes to in-memory virtual storage. Reads check the virtual state first and fall back to the real disk if the path is whitelisted in allow_real_read. By default, all reads are permitted and all writes are redirected to RAM, which suits safe dry-runs.
@guard_prompt(scan_input=True, scan_output=False, custom_rules=None) Scans function inputs and outputs for prompt injection and developer mode override signatures (e.g. "ignore previous instructions", "system override", "bypass safety"). Raises PromptInjectionViolationError on match.
@restrict_db(read_only=True) Intercepts sqlite3 database connections and restricts execution to read-only queries. If a query contains write/alter keywords (e.g. INSERT, UPDATE, DELETE, DROP, ALTER, CREATE), blocks it and raises DatabaseViolationError.
@restrict_env(allow_mutation=False) Prevents modifying or deleting system environment variables (os.environ) during function execution, raising EnvironmentViolationError.
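
To make the input-scanning idea behind @guard_prompt concrete, here is a minimal standard-library sketch. It is not Agent-Safeguard's implementation: the names scan_text_inputs, INJECTION_PATTERNS, and PromptInjectionDetected are hypothetical, and the pattern list only mirrors the example phrases quoted above.

```python
import functools
import re

# Hypothetical signature list for illustration; the library's real rules
# are not shown in this README.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+previous\s+instructions", re.IGNORECASE),
    re.compile(r"system\s+override", re.IGNORECASE),
    re.compile(r"bypass\s+safety", re.IGNORECASE),
]


class PromptInjectionDetected(Exception):
    """Stand-in for the library's PromptInjectionViolationError."""


def scan_text_inputs(func):
    """Reject calls whose string arguments match a known injection phrase."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for value in list(args) + list(kwargs.values()):
            if not isinstance(value, str):
                continue  # only plain string arguments are scanned in this sketch
            for pattern in INJECTION_PATTERNS:
                if pattern.search(value):
                    raise PromptInjectionDetected(
                        f"{func.__name__}: input matched {pattern.pattern!r}"
                    )
        return func(*args, **kwargs)

    return wrapper


@scan_text_inputs
def summarize(text: str) -> str:
    # Placeholder body: return the first 20 characters.
    return text[:20]
```

Phrase matching of this kind is easy to evade with paraphrasing, so it is best read as a first-line filter rather than a complete defense.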

2. Architectural Integrity & AST Checks

@shield(allowed_imports=..., forbidden_imports=..., allow_unsafe=False, allow_globals=False, max_complexity=None) Enforces function boundary constraints at definition time via static AST analysis:
- Allowed Imports: Whitelist only specific modules for import inside the function.
- Forbidden Imports: Blacklist specific modules (e.g. blocking os or sys).
- Unsafe Execution: Blocks calls to eval() and exec().
- Globals Usage: Blocks the global keyword to prevent global state pollution.
- Hardcoded Secrets: Scans constants for API keys (e.g., AWS, OpenAI) or variables named api_key/secret.
- CPU Lockups: Detects infinite loops with empty bodies (while True: pass).
- Complexity limit: Restricts the maximum allowed cyclomatic complexity of the function's AST.
- Runtime Types: Validates function return values against declared type hints at call time (supports generics and union types).
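
Several of the definition-time checks listed above can be approximated with Python's standard ast module. The sketch below is an illustration under assumptions, not the library's code: find_violations and SAMPLE are hypothetical names, and only the forbidden-import, eval/exec, global, and empty-loop rules are covered.

```python
import ast


def find_violations(source: str, forbidden_imports: set[str]) -> list[str]:
    """Return human-readable findings for a few definition-time rules."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in forbidden_imports:
                    findings.append(f"forbidden import: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in forbidden_imports:
                findings.append(f"forbidden import: {node.module}")
        elif isinstance(node, ast.Call):
            # Only direct calls by name are detected, e.g. eval(...) or exec(...).
            if isinstance(node.func, ast.Name) and node.func.id in {"eval", "exec"}:
                findings.append(f"unsafe call: {node.func.id}()")
        elif isinstance(node, ast.Global):
            findings.append(f"global statement: {', '.join(node.names)}")
        elif isinstance(node, ast.While):
            # Constant-true test whose body contains nothing but `pass`.
            always_true = isinstance(node.test, ast.Constant) and node.test.value is True
            only_pass = all(isinstance(stmt, ast.Pass) for stmt in node.body)
            if always_true and only_pass:
                findings.append("busy loop: while True with an empty body")
    return findings


SAMPLE = '''
def handler(expr):
    import os
    global counter
    while True:
        pass
    return eval(expr)
'''

for finding in find_violations(SAMPLE, {"os"}):
    print(finding)
```

Because ast.walk does not guarantee source order, the findings are best treated as an unordered collection.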
@freeze Locks the function source code. Registers a SHA-256 hash of the function implementation in shield_reports/frozen_functions.json. Any unauthorized modification to the code body raises a ShieldViolationError on startup.
@lock_signature Locks the function's signature. Saves parameter names, ordering, defaults, and type hints in shield_reports/locked_signatures.json to prevent AI agents from altering the function interface.
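
The hash-and-compare idea behind @freeze can be sketched as follows. This is a simplified illustration, not the library's implementation: fingerprint and check_frozen are hypothetical helpers, the registry is an in-memory dict rather than shield_reports/frozen_functions.json, and the source text is passed in directly (in practice it could be obtained with inspect.getsource).

```python
import hashlib
import json


def fingerprint(source: str) -> str:
    """SHA-256 hex digest of a function's source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def check_frozen(name: str, source: str, registry: dict[str, str]) -> bool:
    """Register the hash on first sight; afterwards report whether it still matches."""
    digest = fingerprint(source)
    recorded = registry.setdefault(name, digest)
    return recorded == digest


ORIGINAL = "def tax(amount):\n    return amount * 0.2\n"
EDITED = "def tax(amount):\n    return amount * 0.25\n"

registry: dict[str, str] = {}
first_seen = check_frozen("billing.tax", ORIGINAL, registry)  # registers the hash
unchanged = check_frozen("billing.tax", ORIGINAL, registry)   # same body, still valid
tampered = check_frozen("billing.tax", EDITED, registry)      # body changed, mismatch

print(json.dumps(registry, indent=2))
print(first_seen, unchanged, tampered)
```

Hashing raw text means that even a whitespace or comment change counts as a modification; whether Agent-Safeguard normalizes the source before hashing is not stated in this README.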

3. Resource & Security Sandboxing

@timeout(seconds: float) Enforces a strict execution time limit. Handles execution in background threads, where signal-based limits are unavailable. Raises TimeoutViolationError if the limit is exceeded.
@limit_memory(max_mb: float) Monitors RSS memory growth of the process during execution. If the memory delta exceeds the specified limit, injects MemoryViolationError into the main thread.
@restrict_network(allowed_hosts: list[str]) Restricts socket-level connections. Monkeypatches socket.connect dynamically and thread-safely. Supports wildcards (e.g. *.stripe.com) and resolves domain IPs automatically.
@restrict_fs(allow_read: list[str] = None, allow_write: list[str] = None) Monkeypatches builtins.open and standard file manipulation operations. Prevents path traversal bypasses and whitelists Python interpreter/import folders so package loading remains unimpeded.
@no_side_effects(allow_args_mutation=False, allow_globals=False, allow_stdout=False) Enforces function purity. Verifies that the function does not mutate its arguments, modify module-level globals, or print output to the console. Raises SideEffectViolationError on violation.
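
The wildcard host matching described for @restrict_network can be illustrated with the standard fnmatch module. This is a sketch under assumptions: host_allowed is a hypothetical helper, and the library's actual matching and IP-resolution logic is not shown in this README.

```python
from fnmatch import fnmatchcase


def host_allowed(host: str, allowed_hosts: list[str]) -> bool:
    """Return True if the host matches any allowlist entry (shell-style wildcards)."""
    normalized = host.lower().rstrip(".")  # hostnames are case-insensitive
    return any(fnmatchcase(normalized, pattern.lower()) for pattern in allowed_hosts)


ALLOWED = ["api.example.org", "*.stripe.com"]

print(host_allowed("api.stripe.com", ALLOWED))
print(host_allowed("unauthorized-api.com", ALLOWED))
```

With this approach `*` also spans dots, so `*.stripe.com` matches `a.b.stripe.com` but not the bare `stripe.com`; a stricter policy would need label-by-label comparison.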

4. AI Directives & Semantic Assertions

@prompt_inject(instruction: str) Prepends a standardized, high-visibility block containing architectural instructions directly to the function's docstring:
```
=== AI ASSISTANT ARCHITECTURAL CONSTRAINT ===
{instruction}
=============================================
```
@prompt_assert(prompt: str) Sends the function source code to the Gemini API (gemini-1.5-flash) at definition time to semantically evaluate whether the implementation satisfies the natural language prompt constraint. Supports registry mocking for offline unit testing.
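
As a rough illustration of the docstring technique used by @prompt_inject, a decorator can rewrite `__doc__` at definition time. The name with_constraint is hypothetical and this sketch is not the library's code; only the banner text is taken from the block shown above.

```python
def with_constraint(instruction: str):
    """Build a decorator that prepends a constraint banner to a docstring."""
    banner = (
        "=== AI ASSISTANT ARCHITECTURAL CONSTRAINT ===\n"
        f"{instruction}\n"
        "=============================================\n"
    )

    def decorator(func):
        # Keep any existing docstring below the banner.
        func.__doc__ = banner + "\n" + (func.__doc__ or "")
        return func

    return decorator


@with_constraint("Do not add network calls to this function.")
def compute_total(prices):
    """Sum a list of prices."""
    return sum(prices)


print(compute_total.__doc__)
```

The function object itself is returned unchanged, so the banner affects only what tools and assistants see when they read the docstring.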

5. Centrally Configured Guardrails (`shield.yaml`)

To prevent AI agents from editing or deleting decorators in Python files, you can define your project rules centrally in a shield.yaml file in the project root:

```
rules:
  - pattern: "my_app.payments.*"
    timeout: 5.0
    restrict_network: ["api.stripe.com"]
    virtual_fs: true
    guard_prompt: true
    restrict_db: true
  - pattern: "my_app.utils.*"
    allowed_imports: ["math", "json"]
```

Agent-Safeguard hooks into Python's import system (builtins.__import__) and automatically decorates all matching module functions at import time.
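
One way to picture pattern-based rule application is the sketch below. It rests on several assumptions: that patterns are shell-style globs matched against `module.function` names, that the rules have already been parsed from YAML into dicts, and that wrapping happens when apply_rules is called explicitly rather than through an import hook. The names RULES, rules_for, tag_with_rules, and apply_rules are hypothetical, and the wrapper only records the matched settings instead of enforcing them.

```python
import fnmatch
import functools
import types

# Rules as they might look after parsing a shield.yaml file (assumed shape).
RULES = [
    {"pattern": "my_app.payments.*", "timeout": 5.0},
    {"pattern": "my_app.utils.*", "allowed_imports": ["math", "json"]},
]


def rules_for(qualified_name: str) -> dict:
    """Merge the settings of every rule whose pattern matches the name."""
    merged = {}
    for rule in RULES:
        if fnmatch.fnmatchcase(qualified_name, rule["pattern"]):
            merged.update({k: v for k, v in rule.items() if k != "pattern"})
    return merged


def tag_with_rules(func, settings: dict):
    """Wrap func and record which settings applied (no enforcement in this sketch)."""

    @functools.wraps(func)
    def guarded(*args, **kwargs):
        return func(*args, **kwargs)

    guarded.applied_rules = settings
    return guarded


def apply_rules(module: types.ModuleType) -> None:
    """Replace each plain function in the module that has matching rules."""
    for name, obj in list(vars(module).items()):
        if isinstance(obj, types.FunctionType):
            settings = rules_for(f"{module.__name__}.{name}")
            if settings:
                setattr(module, name, tag_with_rules(obj, settings))


# Demo with a synthetic module so the example runs without a real package.
payments = types.ModuleType("my_app.payments")


def charge(amount):
    return amount * 2


payments.charge = charge
apply_rules(payments)
print(payments.charge.applied_rules)
```

A fuller version would stack the corresponding guard decorators inside tag_with_rules and trigger apply_rules from the import machinery.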

6. Audit Mode (Passive Mode)

Set the environment variable AGENT_SHIELD_PASSIVE=true to enable passive auditing. In passive mode, rules write structured JSON reports and print console warnings on violations but do not raise exceptions (except for interruptive constraints such as timeout).
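
The warn-instead-of-raise behavior can be sketched with a single environment check. This is an assumption-laden illustration: report_violation and GuardViolation are hypothetical names, and whether the library compares the variable's value case-insensitively is not stated in this README.

```python
import os
import warnings


class GuardViolation(Exception):
    """Stand-in for the library's violation exceptions."""


def report_violation(message: str) -> None:
    """Warn in passive mode; raise otherwise."""
    # Assumption: any casing of "true" enables passive mode.
    passive = os.environ.get("AGENT_SHIELD_PASSIVE", "").lower() == "true"
    if passive:
        warnings.warn(message, stacklevel=2)
        return
    raise GuardViolation(message)
```

Reading the variable at call time, as here, lets a test suite toggle passive mode without reloading the module.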

JSON Diagnostic Reports

When a constraint is violated, Agent-Safeguard writes a diagnostic report to shield_reports/violation_report.json:

```
{
  "violation_type": "network_violation",
  "function_name": "charge_customer",
  "file_path": "/Users/safik/PycharmProjects/agent-shield/my_app/payments.py",
  "details": {
    "attempted_host": "unauthorized-api.com",
    "allowed_hosts": ["api.stripe.com"]
  },
  "instruction": "AI Assistant Instruction: The function 'charge_customer' in file '/Users/safik/PycharmProjects/agent-shield/my_app/payments.py' attempted to establish an unauthorized network connection to 'unauthorized-api.com'. Connections are restricted to: api.stripe.com. Please remove this network call or connect to an allowed host."
}
```

AI agents can read this file in a self-correction loop to rewrite their code automatically.
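
A self-correction loop might consume the report along these lines. The sketch assumes only the field names shown in the JSON report above; build_fix_prompt is a hypothetical helper, the sample report is hand-written with a shortened path and instruction, and the step that sends the prompt to an agent is left out.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def build_fix_prompt(report_path: Path) -> Optional[str]:
    """Turn a violation report into a short prompt, or None if no report exists."""
    if not report_path.exists():
        return None
    report = json.loads(report_path.read_text(encoding="utf-8"))
    return (
        f"[{report['violation_type']}] in {report['function_name']} "
        f"({report['file_path']})\n{report['instruction']}"
    )


# Hand-written report for the demo; values are shortened from the example above.
SAMPLE_REPORT = {
    "violation_type": "network_violation",
    "function_name": "charge_customer",
    "file_path": "my_app/payments.py",
    "details": {
        "attempted_host": "unauthorized-api.com",
        "allowed_hosts": ["api.stripe.com"],
    },
    "instruction": "Please remove this network call or connect to an allowed host.",
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "violation_report.json"
    path.write_text(json.dumps(SAMPLE_REPORT), encoding="utf-8")
    prompt = build_fix_prompt(path)
    missing = build_fix_prompt(Path(tmp) / "absent.json")

print(prompt)
```

Returning None when no report exists gives the loop a simple stopping condition: keep regenerating code until a run produces no report.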

Quick Start

Create a shield.yaml in your project root:

```
rules:
  - pattern: "sandbox_code.*"
    timeout: 0.1
    allow_read: ["/tmp"]
```

Define your functions, and Agent-Safeguard handles the rest:

```
# sandbox_code.py
def process_data():
    # Reading a file outside the allowed paths triggers FileSystemViolationError
    with open("/etc/passwd", "r") as f:
        return f.read()
```

License

This project is licensed under the Apache License 2.0.

Release history

- 1.0.6 (Jun 13, 2026)
- 1.0.5 (Jun 13, 2026, this version)

Download files

Source Distribution

agent_safeguard-1.0.5.tar.gz (44.7 kB view details)

Uploaded Jun 13, 2026 Source

Built Distribution

agent_safeguard-1.0.5-py3-none-any.whl (47.2 kB view details)

Uploaded Jun 13, 2026 Python 3

File details

Details for the file agent_safeguard-1.0.5.tar.gz.

File metadata

Download URL: agent_safeguard-1.0.5.tar.gz
Upload date: Jun 13, 2026
Size: 44.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for agent_safeguard-1.0.5.tar.gz
Algorithm	Hash digest
SHA256	`41bde4946055156f688f1a22204e41e62483f0c35d6fbc7a6aef85b2b76d238e`
MD5	`819b14f3654000a273d93b4607856015`
BLAKE2b-256	`9a97f5ef1e4e18f54d4243493b439a834be919f6b7af53d2e8bc04bd421e4626`

File details

Details for the file agent_safeguard-1.0.5-py3-none-any.whl.

File metadata

Download URL: agent_safeguard-1.0.5-py3-none-any.whl
Upload date: Jun 13, 2026
Size: 47.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for agent_safeguard-1.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7ba26c5e466a047e9c8d1db482ecf10adaef65c7fbf9adcfcc7aaeba056b7768`
MD5	`1873102122fa606714f0ba02cccbb78d`
BLAKE2b-256	`53f9f0dcc3a2b6397818062dd844e77512b2296ea22fb828f9f87f5237c033b4`
