AgentShield
The security primitive the agent ecosystem is missing.
```python
from agentshield import shield

# Wrap any LangChain agent in one line
protected = shield(your_langchain_agent, policy="no_exfiltration")

# Guarded against prompt injection, goal drift, and tool chain escalation
result = protected.run("Summarize and email this document")
```
What Is AgentShield
AgentShield is a runtime security SDK for AI agents. It is not an agent and not a chatbot. It is a defensive execution layer that wraps existing agent runtimes, intercepts execution events, scores threats, and enforces policy before unsafe behavior reaches tools, memory, or external channels.
This category was missing because agent frameworks optimized for capability and orchestration, not adversarial resilience. As a result, teams had no single primitive for prompt-level abuse detection, multi-step tool-chain control, and forensic-grade runtime telemetry in one place.
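To make the "defensive execution layer" idea concrete, the following is a minimal sketch of the wrapping pattern: a proxy scores the input and either raises or delegates to the wrapped agent. Every name here (`ShieldedAgent`, `naive_injection_score`, `BlockedInput`, `EchoAgent`, the 0.5 threshold) is hypothetical and chosen for illustration; this is not AgentShield's API or its detection logic.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class Agent(Protocol):
    def run(self, prompt: str) -> str: ...


class BlockedInput(Exception):
    """Raised by this sketch when a prompt scores at or above the threshold."""


def naive_injection_score(prompt: str) -> float:
    """Toy scorer (assumption): fraction of known-bad phrases found in the prompt."""
    phrases = ("ignore previous instructions", "reveal your system prompt")
    text = prompt.lower()
    return sum(p in text for p in phrases) / len(phrases)


@dataclass
class ShieldedAgent:
    """Proxy that checks the prompt before the wrapped agent ever sees it."""

    inner: Agent
    scorer: Callable[[str], float] = naive_injection_score
    threshold: float = 0.5

    def run(self, prompt: str) -> str:
        score = self.scorer(prompt)
        if score >= self.threshold:
            raise BlockedInput(f"score {score:.2f} >= {self.threshold}")
        return self.inner.run(prompt)


class EchoAgent:
    """Stand-in agent used only to exercise the wrapper."""

    def run(self, prompt: str) -> str:
        return f"echo: {prompt}"
```

Because the wrapper exposes the same `run` method as the agent it wraps, calling code does not need to change, which is the property the one-line `shield(...)` call relies on.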
Threat Coverage
| Threat | Detection Method | Default Action | Detector Class |
|---|---|---|---|
| Prompt Injection | 3-layer (pattern + semantic + canary) | BLOCK | PromptInjectionDetector |
| Goal Drift | Cosine distance + rolling average | ALERT | GoalDriftDetector |
| Tool Chain Escalation | Forbidden sequence detection | BLOCK | ToolChainDetector |
| Memory Poisoning | Z-score anomaly | ALERT | MemoryPoisonDetector |
| Behavioral Anomalies | Agent DNA fingerprinting | FLAG | DNAAnomalyScorer |
| Inter-Agent Injection | Provenance + trust graph | BLOCK | InterAgentMonitor |
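As an illustration of the "cosine distance + rolling average" method listed for goal drift, here is a small sketch under stated assumptions: the goal and each agent step are already available as embedding vectors, and drift is flagged when the mean distance over a short window exceeds a threshold. `DriftTracker`, the window size, and the threshold are hypothetical and are not taken from `GoalDriftDetector`.

```python
import math
from collections import deque


def cosine_distance(a, b):
    """Return 1 - cosine similarity; treat a zero vector as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0
    return 1.0 - dot / norm


class DriftTracker:
    """Hypothetical tracker: rolling mean of distances from the original goal."""

    def __init__(self, goal_vec, window=3, threshold=0.5):
        self.goal_vec = goal_vec
        self.distances = deque(maxlen=window)  # keeps only the last `window` steps
        self.threshold = threshold

    def observe(self, step_vec):
        """Record one step; return True if the rolling mean exceeds the threshold."""
        self.distances.append(cosine_distance(self.goal_vec, step_vec))
        rolling_mean = sum(self.distances) / len(self.distances)
        return rolling_mean > self.threshold
```

Averaging over a window means a single off-topic step does not trigger an alert on its own, while sustained deviation does.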
Original Research (not found elsewhere)
- Agent DNA Fingerprinting: Learns a per-agent behavioral baseline from clean sessions and flags statistically meaningful deviations.
- Canary Injection: Uses cryptographic canary tokens to confirm active instruction hijack with near-zero ambiguity.
- Prompt Provenance Tracking: Tags context by trust origin (TRUSTED, INTERNAL, EXTERNAL, UNTRUSTED) before model execution.
- Cryptographic Audit Trail: Hash-chains security events into tamper-evident JSONL for verification and incident reconstruction.
- Red Team CLI: Runs curated adversarial scenarios against live agents and emits structured security reports.
- AgentShield Certify: Converts red-team outcomes into reproducible certification artifacts and badge output.
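The hash-chained audit trail mentioned in the list above can be sketched roughly as follows. This is a generic hash chain over JSON lines, assuming SHA-256 and an all-zero genesis value; the record fields (`event`, `prev`, `hash`) and the function names are hypothetical and do not describe AgentShield's actual log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first record's "prev" field


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous record's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(lines: list[str], event: dict) -> None:
    """Append one JSONL record that commits to the record before it."""
    prev_hash = json.loads(lines[-1])["hash"] if lines else GENESIS
    record = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    lines.append(json.dumps(record, sort_keys=True))


def verify_chain(lines: list[str]) -> bool:
    """Recompute every hash in order; any edited or reordered record fails."""
    prev_hash = GENESIS
    for line in lines:
        record = json.loads(line)
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Since each record's hash covers the previous hash, changing any earlier event invalidates that record and every record after it, which is what makes the log tamper-evident rather than tamper-proof.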
Installation
```shell
pip install agentshield-x
```
| Extra | Installs | When to use |
|---|---|---|
| [redis] | redis, hiredis | pub/sub + dashboard |
| [otel] | opentelemetry-* | observability export |
| [all] | everything | full feature set |
Quick Example - Policies
```python
# 1. Monitor only (observes without blocking; a good first run)
protected = shield(agent, policy="monitor_only")

# 2. Block exfiltration attempts
protected = shield(agent, policy="no_exfiltration")

# 3. Maximum security
protected = shield(agent, policy="strict")

# 4. Custom YAML policy
protected = shield(agent, tools=tools, policy="./my_policy.yaml")
```
Catching Violations
```python
from agentshield import shield
from agentshield.exceptions import PolicyViolationError, PromptInjectionError

protected = shield(agent, policy="strict")

try:
    result = protected.run(user_input)
except PromptInjectionError as e:
    # More specific subclass first, so it is not swallowed by the parent handler
    print(f"Injection blocked: {e}")
except PolicyViolationError as e:
    print(f"Policy violation: {e}")
```
Built-in Policies
| Policy | Blocks | Best For |
|---|---|---|
| monitor_only | Nothing (observe only) | Testing, onboarding |
| no_exfiltration | read→send chains, high injection | Production agents |
| strict | Medium drift, execute tools | High-security envs |
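The "read→send chains" entry refers to forbidden tool sequences. A minimal sketch of that check, assuming a session-level history of tool names and a list of forbidden (earlier, later) pairs, might look like this. The tool names and `find_forbidden_chain` are hypothetical and are not part of the `ToolChainDetector` interface.

```python
# Hypothetical tool names; a real policy would define its own pairs.
FORBIDDEN_CHAINS = [("read_file", "send_email")]


def find_forbidden_chain(history, next_tool, forbidden=FORBIDDEN_CHAINS):
    """Return the first (earlier, later) pair that calling next_tool would complete.

    The earlier tool may appear anywhere in the history, not only as the
    immediately preceding call. Returns None when no pair is completed.
    """
    for earlier, later in forbidden:
        if next_tool == later and earlier in history:
            return (earlier, later)
    return None
```

Checking before the call runs, rather than after, is what allows a BLOCK action to stop the second step of the chain.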
Integrations
LangChain | LlamaIndex | AutoGen | OpenAI | Anthropic
```python
from agentshield import shield

# LangChain
protected = shield(langchain_agent, policy="monitor_only")
# LlamaIndex
protected = shield(llamaindex_agent, policy="monitor_only")
# AutoGen
protected = shield(autogen_agent, policy="monitor_only")
# OpenAI
protected = shield(openai_client, policy="monitor_only")
# Anthropic
protected = shield(anthropic_client, policy="monitor_only")
```
Red Team CLI
```shell
# List available attack scenarios
agentshield attack list

# Run a specific attack
agentshield attack run --scenario prompt_injection --target my_agent.py

# Generate certification report
agentshield certify --agent my_agent.py --policy no_exfiltration
```
Architecture Diagram
```text
                      User Input
                           |
                           v
┌─────────────────────────────────────────────────────┐
│                 AgentShield Runtime                 │
│                                                     │
│          LLM Hook -> Tool Hook -> Memory Hook       │
│              |           |           |              │
│              └───────────┴───────────┘              │
│                          |                          │
│                   DetectionEngine                   │
│          ┌──────────────────────────────┐           │
│          │ Canary -> DNA -> Provenance  │           │
│          │ PromptInjection Detector     │           │
│          │ GoalDrift Detector           │           │
│          │ ToolChain Detector           │           │
│          │ MemoryPoison Detector        │           │
│          │ InterAgent Monitor           │           │
│          └──────────────────────────────┘           │
│                          |                          │
│                  Cross-Correlation                  │
│                          |                          │
│                   PolicyEvaluator                   │
│                          |                          │
│                 BLOCK / ALERT / LOG                 │
└─────────────────────────────────────────────────────┘
                           |
                           v
   Agent continues (or PolicyViolationError raised)
```
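The last stage of the diagram, where detector output is turned into BLOCK, ALERT, or LOG, can be sketched as a threshold lookup. This is only an illustration: the policy structure, the threshold values, and the `evaluate` function are assumptions, the cross-correlation step is omitted, and none of this reflects how `PolicyEvaluator` is actually implemented.

```python
from enum import Enum


class Action(Enum):
    # Values are ordered by severity so they can be compared.
    LOG = 1
    ALERT = 2
    BLOCK = 3


# Hypothetical policy: per-threat (minimum score, action) rules.
POLICY = {
    "prompt_injection": [(0.8, Action.BLOCK), (0.4, Action.ALERT)],
    "goal_drift": [(0.6, Action.ALERT)],
}


def evaluate(scores, policy=POLICY):
    """Return the most severe action triggered by any detector score."""
    decision = Action.LOG
    for threat, score in scores.items():
        for threshold, action in policy.get(threat, []):
            if score >= threshold and action.value > decision.value:
                decision = action
    return decision
```

Taking the most severe action across all detectors means one high-confidence detection is enough to block, even if every other score is low.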
Documentation
- Full Docs: https://AdityaBelhekar.github.io/AgentShield
- Quickstart: ./QUICKSTART.md
- SDK Reference: ./docs/sdk-reference.md
- Contributing: ./CONTRIBUTING.md
Exception Hierarchy
```text
AgentShieldError
├── ConfigurationError
├── AdapterError
├── InterceptorError
├── DetectionError
├── EventEmissionError
├── RedisConnectionError
├── ProvenanceError
├── CanaryError
├── DNAError
├── AuditChainError
└── PolicyViolationError
    ├── ToolCallBlockedError
    │   └── PrivilegeEscalationError
    ├── GoalDriftError
    ├── PromptInjectionError
    ├── MemoryPoisonError
    ├── BehavioralAnomalyError
    └── InterAgentInjectionError
```
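One practical consequence of this tree is handler ordering: Python runs the first `except` clause that matches, so subclasses need to be listed before their parents. The sketch below uses local stand-in classes that mirror one branch of the tree so it runs without the package installed; the `classify` helper and its return labels are hypothetical.

```python
# Local stand-ins mirroring one branch of the hierarchy above. They are
# defined here only so the example is self-contained; they are not the
# package's own classes.
class AgentShieldError(Exception):
    pass


class PolicyViolationError(AgentShieldError):
    pass


class ToolCallBlockedError(PolicyViolationError):
    pass


class PrivilegeEscalationError(ToolCallBlockedError):
    pass


def classify(exc):
    """Return a label for the first matching handler, most specific first."""
    try:
        raise exc
    except PrivilegeEscalationError:
        return "privilege_escalation"
    except ToolCallBlockedError:
        return "tool_blocked"
    except PolicyViolationError:
        return "policy_violation"
    except AgentShieldError:
        return "sdk_error"
```

If the `PolicyViolationError` handler were listed first, it would also catch `ToolCallBlockedError` and `PrivilegeEscalationError`, and the more specific handlers would never run.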
License + Author
MIT License. Built by Aditya Belhekar. AgentShield is and will always remain free and open-source.
File details
Details for the file agentshield_x-0.1.0.tar.gz.
File metadata
- Download URL: agentshield_x-0.1.0.tar.gz
- Upload date:
- Size: 138.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3610881ddb1e19f590940d7e76765cd9c5fc932afbdf098727e2ca0be68e0fb6 |
| MD5 | 8c2ccce83c7833cde6b6143e18404983 |
| BLAKE2b-256 | abefced288ab39728846bde50cbf3d2d524a30445d3778e216527baa70f7e6be |
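A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below is one way to do it; the helper names `sha256_of` and `matches_published_digest` are hypothetical, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Stream the file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare against a published hex digest, ignoring case and surrounding whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("agentshield_x-0.1.0.tar.gz", "3610881ddb1e19f590940d7e76765cd9c5fc932afbdf098727e2ca0be68e0fb6")` should return `True` for an unmodified copy of the source distribution.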
Provenance
The following attestation bundles were made for agentshield_x-0.1.0.tar.gz:
Publisher: publish.yml on AdityaBelhekar/AgentShield
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: agentshield_x-0.1.0.tar.gz
  - Subject digest: 3610881ddb1e19f590940d7e76765cd9c5fc932afbdf098727e2ca0be68e0fb6
  - Sigstore transparency entry: 1383306832
  - Sigstore integration time:
- Permalink: AdityaBelhekar/AgentShield@4e42045faf5e6ac340c589f14368d1bb2c3372eb
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/AdityaBelhekar
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4e42045faf5e6ac340c589f14368d1bb2c3372eb
- Trigger Event: push
File details
Details for the file agentshield_x-0.1.0-py3-none-any.whl.
File metadata
- Download URL: agentshield_x-0.1.0-py3-none-any.whl
- Upload date:
- Size: 186.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 912afe14cb64395fe5a51e02d6b8f63e82e2e1e1aad120d49922db4c5bb5cf9f |
| MD5 | bfd69c7f7ad1683e10c5f65ca94760ce |
| BLAKE2b-256 | a8bfb0dea15401b38e172016779c43546d6859562f813603e2d574764da21df7 |
Provenance
The following attestation bundles were made for agentshield_x-0.1.0-py3-none-any.whl:
Publisher: publish.yml on AdityaBelhekar/AgentShield
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: agentshield_x-0.1.0-py3-none-any.whl
  - Subject digest: 912afe14cb64395fe5a51e02d6b8f63e82e2e1e1aad120d49922db4c5bb5cf9f
  - Sigstore transparency entry: 1383306849
  - Sigstore integration time:
- Permalink: AdityaBelhekar/AgentShield@4e42045faf5e6ac340c589f14368d1bb2c3372eb
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/AdityaBelhekar
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4e42045faf5e6ac340c589f14368d1bb2c3372eb
- Trigger Event: push