Skip to main content

The security primitive the agent ecosystem is missing.

Project description

PyPI Python License Docs Stars

AgentShield

The security primitive the agent ecosystem is missing.

from agentshield import shield

# Wrap any LangChain agent in one line
protected = shield(your_langchain_agent, policy="no_exfiltration")

# Fully protected — prompt injection, goal drift, tool chain escalation
result = protected.run("Summarize and email this document")

What Is AgentShield

AgentShield is a runtime security SDK for AI agents. It is not an agent and not a chatbot. It is a defensive execution layer that wraps existing agent runtimes, intercepts execution events, scores threats, and enforces policy before unsafe behavior reaches tools, memory, or external channels.

This category was missing because agent frameworks optimized for capability and orchestration, not adversarial resilience. As a result, teams had no single primitive for prompt-level abuse detection, multi-step tool-chain control, and forensic-grade runtime telemetry in one place.

Threat Coverage

Threat Detection Method Default Action Detector Class
Prompt Injection 3-layer (pattern + semantic + canary) BLOCK PromptInjectionDetector
Goal Drift Cosine distance + rolling average ALERT GoalDriftDetector
Tool Chain Escalation Forbidden sequence detection BLOCK ToolChainDetector
Memory Poisoning Z-score anomaly ALERT MemoryPoisonDetector
Behavioral Anomalies Agent DNA fingerprinting FLAG DNAAnomalyScorer
Inter-Agent Injection Provenance + trust graph BLOCK InterAgentMonitor

Original Research (not found elsewhere)

  • Agent DNA Fingerprinting: Learns a per-agent behavioral baseline from clean sessions and flags statistically meaningful deviations.
  • Canary Injection: Uses cryptographic canary tokens to confirm active instruction hijack with near-zero ambiguity.
  • Prompt Provenance Tracking: Tags context by trust origin (TRUSTED, INTERNAL, EXTERNAL, UNTRUSTED) before model execution.
  • Cryptographic Audit Trail: Hash-chains security events into tamper-evident JSONL for verification and incident reconstruction.
  • Red Team CLI: Runs curated adversarial scenarios against live agents and emits structured security reports.
  • AgentShield Certify: Converts red-team outcomes into reproducible certification artifacts and badge output.

Installation

pip install agentshield-x
Extra Installs When to use
[redis] redis, hiredis pub/sub + dashboard
[otel] opentelemetry-* observability export
[all] everything full feature set

Quick Example - 3 Policies

# 1. Monitor only (zero risk, great for first run)
protected = shield(agent, policy="monitor_only")
# 2. Block exfiltration attempts
protected = shield(agent, policy="no_exfiltration")
# 3. Maximum security
protected = shield(agent, policy="strict")
# 4. Custom YAML policy
protected = shield(agent, tools=tools, policy="./my_policy.yaml")

Catching Violations

from agentshield import shield
from agentshield.exceptions import PolicyViolationError, PromptInjectionError

protected = shield(agent, policy="strict")

try:
  result = protected.run(user_input)
except PromptInjectionError as e:
  print(f"Injection blocked: {e}")
except PolicyViolationError as e:
  print(f"Policy violation: {e}")

Built-in Policies

Policy Blocks Best For
monitor_only Nothing (observe only) Testing, onboarding
no_exfiltration read→send chains, high injection Production agents
strict Medium drift, execute tools High-security envs

Integrations

LangChain | LlamaIndex | AutoGen | OpenAI | Anthropic

# LangChain
from agentshield import shield; protected = shield(langchain_agent, policy="monitor_only")
# LlamaIndex
from agentshield import shield; protected = shield(llamaindex_agent, policy="monitor_only")
# AutoGen
from agentshield import shield; protected = shield(autogen_agent, policy="monitor_only")
# OpenAI
from agentshield import shield; protected = shield(openai_client, policy="monitor_only")
# Anthropic
from agentshield import shield; protected = shield(anthropic_client, policy="monitor_only")

Red Team CLI

# List available attack scenarios
agentshield attack list

# Run a specific attack
agentshield attack run --scenario prompt_injection --target my_agent.py

# Generate certification report
agentshield certify --agent my_agent.py --policy no_exfiltration

Architecture Diagram

User Input
  |
  v
┌─────────────────────────────────────────────────────┐
│                  AgentShield Runtime                │
│                                                     │
│  LLM Hook -> Tool Hook -> Memory Hook              │
│       |           |           |                     │
│       └───────────┴───────────┘                     │
│                   |                                  │
│            DetectionEngine                           │
│     ┌──────────────────────────────┐                 │
│     │  Canary -> DNA -> Provenance │                 │
│     │  PromptInjection Detector    │                 │
│     │  GoalDrift Detector          │                 │
│     │  ToolChain Detector          │                 │
│     │  MemoryPoison Detector       │                 │
│     │  InterAgent Monitor          │                 │
│     └──────────────────────────────┘                 │
│                   |                                  │
│           Cross-Correlation                          │
│                   |                                  │
│           PolicyEvaluator                            │
│                   |                                  │
│          BLOCK / ALERT / LOG                         │
└─────────────────────────────────────────────────────┘
  |
  v
Agent continues (or PolicyViolationError raised)

Documentation

Exception Hierarchy

AgentShieldError
├── ConfigurationError
├── AdapterError
├── InterceptorError
├── DetectionError
├── EventEmissionError
├── RedisConnectionError
├── ProvenanceError
├── CanaryError
├── DNAError
├── AuditChainError
└── PolicyViolationError
  ├── ToolCallBlockedError
  │   └── PrivilegeEscalationError
  ├── GoalDriftError
  ├── PromptInjectionError
  ├── MemoryPoisonError
  ├── BehavioralAnomalyError
  └── InterAgentInjectionError

License + Author

MIT License. Built by Aditya Belhekar. AgentShield is and will always remain free and open-source.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agentshield_x-0.1.0.tar.gz (138.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agentshield_x-0.1.0-py3-none-any.whl (186.7 kB view details)

Uploaded Python 3

File details

Details for the file agentshield_x-0.1.0.tar.gz.

File metadata

  • Download URL: agentshield_x-0.1.0.tar.gz
  • Upload date:
  • Size: 138.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentshield_x-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3610881ddb1e19f590940d7e76765cd9c5fc932afbdf098727e2ca0be68e0fb6
MD5 8c2ccce83c7833cde6b6143e18404983
BLAKE2b-256 abefced288ab39728846bde50cbf3d2d524a30445d3778e216527baa70f7e6be

See more details on using hashes here.

Provenance

The following attestation bundles were made for agentshield_x-0.1.0.tar.gz:

Publisher: publish.yml on AdityaBelhekar/AgentShield

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file agentshield_x-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: agentshield_x-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 186.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentshield_x-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 912afe14cb64395fe5a51e02d6b8f63e82e2e1e1aad120d49922db4c5bb5cf9f
MD5 bfd69c7f7ad1683e10c5f65ca94760ce
BLAKE2b-256 a8bfb0dea15401b38e172016779c43546d6859562f813603e2d574764da21df7

See more details on using hashes here.

Provenance

The following attestation bundles were made for agentshield_x-0.1.0-py3-none-any.whl:

Publisher: publish.yml on AdityaBelhekar/AgentShield

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page