Security layer for agentic AI systems
Theron
Security proxy for agentic AI systems. Detects prompt injection, sandboxes dangerous actions, and learns agent behavior - all without modifying your AI agent's code.
The Problem
AI agents (Claude Code, AutoGPT, Moltbot) can execute shell commands, send emails, and access files. When they process untrusted content (emails, web pages, documents), prompt injection attacks can hijack them into executing malicious commands.
Theron's principle: Content the AI reads should never have the same privilege level as commands the user issues.
Features
Core Security
- Prompt Injection Detection - 50+ patterns detecting instruction overrides, role injection, delimiter attacks, exfiltration attempts
- 4-Tier Risk Classification - Tools classified from Safe (read_file) to Critical (sudo, transfer_funds)
- Source-Based Gating - Actions allowed/blocked based on trust level of input content
- Docker Sandboxing - Dangerous commands run in isolated containers with no network, read-only FS, memory limits
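As a rough illustration of pattern-based injection detection, the sketch below matches text against one regular expression per attack category named above. The four patterns and the `scan` function are hypothetical and written for this example; Theron's actual rule set (50+ patterns) is not shown in this README.

```python
import re

# Hypothetical patterns for illustration only; not Theron's real rule set.
PATTERNS = {
    "instruction_override": re.compile(
        r"ignore (all |any )?(previous|prior|above) instructions", re.I
    ),
    "role_injection": re.compile(r"^\s*(system|assistant)\s*:", re.I | re.M),
    "delimiter_attack": re.compile(r"</?(system|instructions?)>", re.I),
    "exfiltration": re.compile(
        r"(send|post|upload|forward) .{0,40}(password|api key|secret|token)", re.I
    ),
}


def scan(text: str) -> list[str]:
    """Return the name of every pattern category that matches the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]
```

A real detector would likely weight matches into a score rather than return raw category names, which is presumably what the `injection_threshold` setting compares against.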
Intelligence Layer
- Causal Chain Tracking - Trace how untrusted content leads to dangerous actions
- Exfiltration Detection - Detect sensitive data (credentials, keys, PII) flowing to outbound tools
- Hijack Detection - Detect when agent tasks drift from original user intent
- Honeypot Injection - Seed fake credentials, detect if agent uses them (indicates compromise)
- Taint Tracking - Track which knowledge came from untrusted sources
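The honeypot idea can be sketched in a few lines: plant a fake credential where the agent can read it, then check every outgoing tool call for that value. The function names and the `sk-honey-` prefix below are assumptions made for this example, not Theron's API.

```python
import secrets


def make_honeypot(prefix: str = "sk-honey-") -> str:
    """Generate a fake credential that no legitimate workflow should ever use."""
    return prefix + secrets.token_hex(16)


def find_honeypot_use(tool_args: dict, honeypots: set[str]) -> list[str]:
    """Return the planted tokens that appear in any argument of a tool call."""
    blob = " ".join(str(value) for value in tool_args.values())
    return sorted(token for token in honeypots if token in blob)
```

Because the token is random and never given to the user, any appearance in a tool call means the agent picked it up from seeded content, which is the compromise signal described above.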
Learning & Autonomy
- Behavioral Baseline - Learn normal patterns per agent, flag anomalies with zero config
- Task-Scoped Permissions - Dynamically restrict tools based on inferred task (coding vs email)
- Shadow Execution - Run actions in isolation, auto-commit/discard based on behavior analysis
- Graceful Degradation - Automatically reduce agent autonomy when threats detected
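One simple way to picture a behavioral baseline is a learning window followed by a novelty check. The `Baseline` class below is a hypothetical sketch, not Theron's implementation; it only borrows the `baseline_requests` default of 25 from the configuration section.

```python
from collections import Counter


class Baseline:
    """Learn which tools an agent uses, then flag tools never seen while learning."""

    def __init__(self, baseline_requests: int = 25):
        self.baseline_requests = baseline_requests
        self.seen = Counter()
        self.total = 0

    def observe(self, tool: str) -> bool:
        """Record a tool call; return True if the call is anomalous."""
        if self.total < self.baseline_requests:
            # Still learning: record the tool and never flag it.
            self.seen[tool] += 1
            self.total += 1
            return False
        # Baseline established: anything outside the learned set is an anomaly.
        return tool not in self.seen
```

In this sketch anomalies are never absorbed into the baseline, so a new tool keeps being flagged until the baseline is rebuilt. A production system would more likely use frequencies and decay rather than a fixed set.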
Agent Management
- Agent Registry - Modular agent definitions with risk levels and capabilities
- Guided Installation - Safe onboarding with warnings and automatic Theron configuration
- Protected Runner - Launch any agent with Theron protection via theron run
Installation
pip install theron
theron setup
Restart your terminal. Done. AI agents that use the Anthropic or OpenAI APIs are now routed through Theron.
What Happens
After theron setup:
- Theron starts automatically when you log in
- All AI agents that use Anthropic or OpenAI APIs are automatically routed through Theron
- Dangerous actions from untrusted content are blocked
- No configuration needed
Just use your AI agents normally:
claude # Protected
moltbot # Protected
your-agent # Protected
Manual Mode (Advanced)
If you prefer not to use automatic setup:
theron # Start proxy + dashboard manually
Then set environment variables:
export ANTHROPIC_API_URL=http://localhost:8081
export OPENAI_API_BASE=http://localhost:8081/v1
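If you launch an agent from Python rather than a shell, the same two variables can be set on the child process only. The helper name `proxied_env` is made up for this sketch; the variable names and the default port come from the exports above.

```python
import os
from typing import Optional


def proxied_env(port: int = 8081, base: Optional[dict] = None) -> dict:
    """Return a copy of the environment that points both SDKs at the local proxy."""
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_API_URL"] = f"http://localhost:{port}"
    env["OPENAI_API_BASE"] = f"http://localhost:{port}/v1"
    return env
```

For example, `subprocess.run(["your-agent"], env=proxied_env())` would start one agent behind the proxy without changing the environment of the parent process.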
How It Works
┌──────────────────────────────────────────────────────────────────┐
│                          YOUR COMPUTER                           │
│                                                                  │
│  ┌─────────┐      ┌──────────────┐      ┌─────────────────────┐  │
│  │   AI    │ ───▶ │    THERON    │ ───▶ │  api.anthropic.com  │  │
│  │  Agent  │ ◀─── │    Proxy     │ ◀─── │  api.openai.com     │  │
│  └─────────┘      └──────────────┘      └─────────────────────┘  │
│                          │                                       │
│         ┌────────────────┼────────────────┐                      │
│         │                │                │                      │
│    ┌────▼────┐    ┌──────▼──────┐   ┌─────▼─────┐                │
│    │Dashboard│    │Intelligence │   │  Sandbox  │                │
│    │  :8080  │    │    Layer    │   │ (Docker)  │                │
│    └─────────┘    └─────────────┘   └───────────┘                │
└──────────────────────────────────────────────────────────────────┘
- Agent sends request to localhost:8081
- Theron tags messages with trust levels, detects injection attempts
- Request forwards to real LLM API
- Response analyzed - tool calls classified by risk
- Intelligence layer evaluates: causal chains, exfiltration, hijack, honeypots, taints
- Dangerous actions from untrusted content get sandboxed or blocked
- Behavioral baseline updated, anomalies flagged
- Dashboard shows real-time events and alerts
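The sandboxing step can be pictured as wrapping the command in a `docker run` invocation with the restrictions listed under Core Security: no network, read-only filesystem, and a memory limit. This is a sketch under assumptions; the function name, the `alpine:3.20` image, and the 256 MB limit are illustrative choices, not Theron's actual settings.

```python
def sandbox_argv(command: str, image: str = "alpine:3.20", memory: str = "256m") -> list[str]:
    """Build a docker command line that runs `command` in a locked-down container."""
    return [
        "docker", "run",
        "--rm",               # remove the container when the command exits
        "--network", "none",  # no network access
        "--read-only",        # root filesystem mounted read-only
        "--memory", memory,   # hard memory limit
        image,
        "sh", "-c", command,
    ]
```

Passing the result to `subprocess.run(argv, capture_output=True)` would execute the command in isolation and let the caller inspect its output before deciding what to do with it.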
Policy Matrix
Each action is allowed, logged, or sandboxed based on source trust × action risk:
| Source ↓ / Risk → | Tier 1 (Safe) | Tier 2 (Moderate) | Tier 3 (Sensitive) | Tier 4 (Critical) |
|---|---|---|---|---|
| USER_DIRECT | Allow | Allow | Allow | Log |
| USER_INDIRECT | Allow | Allow | Log | Sandbox |
| CONTENT_READ | Allow | Log | Sandbox | Sandbox |
| TOOL_RESULT | Allow | Log | Sandbox | Sandbox |
Enhanced gating adds composite risk scoring from injection detection, honeypot triggers, exfiltration attempts, intent drift, and behavioral anomalies.
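The base matrix is a plain two-key lookup. The sketch below transcribes the table into a dictionary; the `decide` function and the lowercase decision strings are naming choices for this example, and the composite risk scoring of enhanced gating is not modeled.

```python
# Rows are source trust levels; columns are risk tiers 1 through 4.
POLICY = {
    "USER_DIRECT":   ("allow", "allow", "allow",   "log"),
    "USER_INDIRECT": ("allow", "allow", "log",     "sandbox"),
    "CONTENT_READ":  ("allow", "log",   "sandbox", "sandbox"),
    "TOOL_RESULT":   ("allow", "log",   "sandbox", "sandbox"),
}


def decide(source: str, tier: int) -> str:
    """Look up the base decision for a source trust level and a risk tier."""
    if tier not in (1, 2, 3, 4):
        raise ValueError(f"tier must be 1-4, got {tier}")
    return POLICY[source][tier - 1]
```

For instance, `decide("CONTENT_READ", 3)` returns `"sandbox"`, while the same Tier 3 action issued directly by the user is allowed.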
CLI Commands
# Setup (one-time)
theron setup # Configure automatic protection
theron setup --status # Check setup status
theron setup --uninstall # Remove setup
# Manual mode
theron # Start proxy + dashboard
theron proxy # Just proxy
theron dashboard # Just dashboard
Testing
pytest tests/ -v # 151 tests
Configuration
Config stored at ~/.theron/config.yaml:
proxy:
  listen_port: 8081
detection:
  sensitivity: 5 # 1-10
  injection_threshold: 70 # 0-100
classification:
  unknown_tool_tier: 3 # Default tier for unknown tools
gating:
  whitelist: [get_weather]
  blacklist: [format_disk]
learning:
  enabled: true
  baseline_requests: 25 # Requests before baseline is established
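Before editing the file by hand, it can help to check values against the ranges noted in the comments. The sketch below takes the config as an already-parsed dictionary (for example, from a YAML parser) and is not part of Theron; the 1-4 range for `unknown_tool_tier` is inferred from the four-tier classification, and the defaults mirror the sample above.

```python
def validate(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means all values are in range."""
    # (section, key, default from the sample config, lowest, highest)
    checks = [
        ("detection", "sensitivity", 5, 1, 10),
        ("detection", "injection_threshold", 70, 0, 100),
        ("classification", "unknown_tool_tier", 3, 1, 4),
    ]
    errors = []
    for section, key, default, low, high in checks:
        value = (config.get(section) or {}).get(key, default)
        if not isinstance(value, int) or not low <= value <= high:
            errors.append(
                f"{section}.{key} must be an integer in {low}..{high}, got {value!r}"
            )
    return errors
```

Missing sections fall back to the sample defaults, so an empty config validates cleanly.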
Dashboard
The web dashboard at http://localhost:8080 provides:
- Events - Real-time feed of all security events via WebSocket
- Blocked Actions - Log of dangerous actions that were automatically blocked
- Intelligence - Causal chains, alerts, honeypot stats, taint reports
- Profiles - Per-agent behavioral baselines and anomalies
- Statistics - Charts and summaries
Note: Theron is fully automatic. Dangerous actions are blocked without requiring user approval - the dashboard is for visibility, not decision-making.
API Reference
Proxy (port 8081)
- POST /v1/messages - Anthropic API
- POST /v1/chat/completions - OpenAI API
- GET /health - Health check
Dashboard (port 8080)
- GET /api/events - List events
- GET /api/sandbox/blocked - Recently blocked actions
- GET /api/intelligence/summary - Intelligence overview
- GET /api/agents/{id}/profile - Agent behavioral profile
- WS /api/events/stream - Real-time event stream
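A script can confirm the proxy is running before starting an agent by calling the health endpoint. The sketch below uses only the standard library; it assumes that `GET /health` answers with HTTP 200 when the proxy is up, which this README does not state, and `is_healthy` is a name chosen for the example.

```python
import urllib.request


def is_healthy(base_url: str = "http://localhost:8081", timeout: float = 2.0) -> bool:
    """Return True if GET /health on the proxy answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and non-2xx HTTP errors.
        return False
```

The same pattern should work for the dashboard endpoints on port 8080, though their response formats are not documented here.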
See CLAUDE.md for full API reference.
License
MIT