Security layer for agentic AI systems
Theron
Security proxy for agentic AI systems. Detects prompt injection, sandboxes dangerous actions, and learns agent behavior - all without modifying your AI agent's code.
The Problem
AI agents (Claude Code, AutoGPT, Moltbot) can execute shell commands, send emails, and access files. When they process untrusted content (emails, web pages, documents), prompt injection attacks can hijack them into executing malicious commands.
Theron's principle: Content the AI reads should never have the same privilege level as commands the user issues.
Features
Core Security
- Prompt Injection Detection - 50+ patterns detecting instruction overrides, role injection, delimiter attacks, exfiltration attempts
- 4-Tier Risk Classification - Tools classified from Safe (read_file) to Critical (sudo, transfer_funds)
- Source-Based Gating - Actions allowed/blocked based on trust level of input content
- Docker Sandboxing - Dangerous commands run in isolated containers with no network, read-only FS, memory limits
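As a rough illustration of pattern-based injection detection, the sketch below scores text against a handful of regular expressions and compares the total to a threshold. The patterns, the weights, and the function names `injection_score` and `is_injection` are assumptions made for this example; they are not Theron's actual rule set. The default threshold of 70 mirrors the `injection_threshold` value shown in the configuration section.

```python
import re

# Hypothetical patterns and weights, chosen for illustration only;
# Theron's real rules are not listed in this README.
PATTERNS = [
    (re.compile(r"ignore (all )?(previous|prior) instructions", re.I), 60),  # instruction override
    (re.compile(r"you are now\b", re.I), 40),                                # role injection
    (re.compile(r"</?(system|assistant)>", re.I), 40),                       # delimiter attack
    (re.compile(r"(send|post|upload) .* to https?://", re.I), 50),           # exfiltration attempt
]


def injection_score(text: str) -> int:
    """Sum the weights of all matching patterns, capped at 100."""
    score = sum(weight for pattern, weight in PATTERNS if pattern.search(text))
    return min(score, 100)


def is_injection(text: str, threshold: int = 70) -> bool:
    """Flag text whose score reaches the threshold (0-100 scale)."""
    return injection_score(text) >= threshold
```

A single weak signal stays below the threshold in this sketch, while two signals in the same text push the score over it.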
Intelligence Layer
- Causal Chain Tracking - Trace how untrusted content leads to dangerous actions
- Exfiltration Detection - Detect sensitive data (credentials, keys, PII) flowing to outbound tools
- Hijack Detection - Detect when agent tasks drift from original user intent
- Honeypot Injection - Seed fake credentials, detect if agent uses them (indicates compromise)
- Taint Tracking - Track which knowledge came from untrusted sources
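The honeypot idea can be sketched in a few lines: generate a fake credential, plant it where the agent might read it, and check every outgoing tool call for that value. The helper names `make_honeypot` and `find_honeypot_use`, and the `sk-honey-` prefix, are hypothetical and only illustrate the approach.

```python
import secrets


def make_honeypot(prefix: str = "sk-honey-") -> str:
    """Generate a fake credential that no legitimate workflow should ever use."""
    return prefix + secrets.token_hex(16)


def find_honeypot_use(tool_args: dict, honeypots: set[str]) -> list[str]:
    """Return the honeypot values that appear anywhere in a tool call's arguments."""
    # repr() flattens nested dicts and lists into one string, which is enough
    # for a substring check on hex tokens.
    flat = repr(tool_args)
    return sorted(h for h in honeypots if h in flat)
```

Any non-empty result means the agent acted on a planted secret, which is a strong sign that untrusted content is steering it.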
Autonomous Agent Support
- Behavioral Baseline - Learn normal patterns per agent, flag anomalies with zero config
- Task-Scoped Permissions - Dynamically restrict tools based on inferred task (coding vs email)
- Shadow Execution - Run actions in isolation, auto-commit/discard based on behavior analysis
- Graceful Degradation - Automatically reduce agent autonomy when threats are detected
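One simple way to think about a behavioral baseline is a per-agent frequency count of tool calls, where a tool that is rare relative to everything observed so far counts as an anomaly. The class below is a minimal sketch under that assumption; `ToolBaseline`, `min_observations`, and `rare_fraction` are illustrative names and values, not part of Theron's API.

```python
from collections import Counter


class ToolBaseline:
    """Frequency baseline of tool calls for one agent (illustrative only)."""

    def __init__(self, min_observations: int = 20, rare_fraction: float = 0.01):
        self.counts = Counter()
        self.min_observations = min_observations
        self.rare_fraction = rare_fraction

    def observe(self, tool: str) -> None:
        """Record one call to the given tool."""
        self.counts[tool] += 1

    def is_anomalous(self, tool: str) -> bool:
        """Flag tools that make up less than rare_fraction of observed calls."""
        total = sum(self.counts.values())
        if total < self.min_observations:
            return False  # still learning; do not flag yet
        return self.counts[tool] / total < self.rare_fraction
```

The learning period avoids flagging everything during the first few calls, which is one way to get useful results without manual configuration.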
Installation
Theron runs locally on your machine as a proxy between your AI agent and the LLM API.
git clone https://github.com/Mukund2/theron.git
cd theron
pip install -e .
Usage
# Start Theron (proxy + dashboard)
theron
# Opens:
# Proxy: http://localhost:8081
# Dashboard: http://localhost:8080
Point your AI agent at the proxy:
# For Anthropic-based agents
export ANTHROPIC_API_URL=http://localhost:8081
# For OpenAI-based agents
export OPENAI_API_BASE=http://localhost:8081/v1
# Run your agent normally - it's now protected
your-agent start
How It Works
┌──────────────────────────────────────────────────────────┐
│                      YOUR COMPUTER                       │
│                                                          │
│  ┌─────────┐      ┌─────────┐      ┌─────────────────┐   │
│  │   AI    │ ───▶ │ THERON  │ ───▶ │ api.anthropic.  │   │
│  │  Agent  │ ◀─── │  Proxy  │ ◀─── │ com / openai    │   │
│  └─────────┘      └─────────┘      └─────────────────┘   │
│                        │                                 │
│                   ┌────▼────┐                            │
│                   │Dashboard│                            │
│                   │  :8080  │                            │
│                   └─────────┘                            │
└──────────────────────────────────────────────────────────┘
- Agent sends request to localhost:8081
- Theron analyzes messages, tags trust levels, detects injection
- Request forwards to real LLM API
- Response analyzed - tool calls classified by risk
- Dangerous actions from untrusted content get blocked or sandboxed
- Dashboard shows real-time events and alerts
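The trust-tagging step in the flow above could look roughly like the sketch below: each input message gets a source label, and a request is treated as no more trusted than its least trusted input. The source names come from the policy matrix; the numeric ordering, the role-to-source mapping, and the function name `least_trusted_source` are assumptions for illustration.

```python
from enum import IntEnum


class Source(IntEnum):
    """Source labels from the policy matrix; the numeric order is an assumption."""
    TOOL_RESULT = 1
    CONTENT_READ = 2
    USER_INDIRECT = 3
    USER_DIRECT = 4


# Hypothetical mapping from chat roles to sources.
ROLE_TO_SOURCE = {"user": Source.USER_DIRECT, "tool": Source.TOOL_RESULT}


def least_trusted_source(messages: list[dict]) -> Source:
    """Return the lowest-trust source among the input messages.

    Roles without a mapping (for example "assistant") are ignored.
    With no input messages at all, USER_DIRECT is returned.
    """
    sources = [
        ROLE_TO_SOURCE[m["role"]]
        for m in messages
        if m.get("role") in ROLE_TO_SOURCE
    ]
    return min(sources, default=Source.USER_DIRECT)
```

Taking the minimum means a single tool result or fetched document lowers the trust of the whole request, which matches the principle that content the AI reads should not carry the user's privilege level.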
Policy Matrix
Actions are allowed/blocked based on source trust × action risk:
| Source ↓ / Risk → | Tier 1 (Safe) | Tier 2 (Moderate) | Tier 3 (Sensitive) | Tier 4 (Critical) |
|---|---|---|---|---|
| USER_DIRECT | Allow | Allow | Allow | Log |
| USER_INDIRECT | Allow | Allow | Log | Block |
| CONTENT_READ | Allow | Log | Sandbox | Block |
| TOOL_RESULT | Allow | Log | Sandbox | Block |
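The matrix above maps directly onto a lookup table. The following sketch encodes it as a dictionary keyed by source, with one action per tier; the `POLICY` constant and the `decide` function are illustrative names, not Theron's API.

```python
# One row per source, one column per tier (1 to 4), copied from the policy matrix.
POLICY = {
    "USER_DIRECT":   ("allow", "allow", "allow",   "log"),
    "USER_INDIRECT": ("allow", "allow", "log",     "block"),
    "CONTENT_READ":  ("allow", "log",   "sandbox", "block"),
    "TOOL_RESULT":   ("allow", "log",   "sandbox", "block"),
}


def decide(source: str, tier: int) -> str:
    """Look up the action for a given source trust level and risk tier."""
    if tier not in (1, 2, 3, 4):
        raise ValueError(f"tier must be 1-4, got {tier}")
    return POLICY[source][tier - 1]
```

For example, `decide("CONTENT_READ", 3)` yields `"sandbox"`, while the same Tier 3 action requested directly by the user is allowed.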
Testing
pytest tests/ -v # 127 tests
Configuration
Configuration is stored at ~/.theron/config.yaml:
proxy:
  listen_port: 8081
detection:
  sensitivity: 5            # 1-10
  injection_threshold: 70   # 0-100
classification:
  unknown_tool_tier: 3      # Default tier for unknown tools
gating:
  whitelist: [get_weather]
  blacklist: [format_disk]
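To make the `gating` and `classification` settings concrete, here is a sketch of how they might interact. It assumes that blacklisted tools are always blocked, whitelisted tools are always allowed, and everything else falls back to a tier lookup with `unknown_tool_tier` as the default; the README does not spell out these semantics, so treat them as a guess. `CONFIG`, `KNOWN_TIERS`, and `pre_gate` are hypothetical names.

```python
# The relevant part of the configuration above, as a plain dict.
CONFIG = {
    "classification": {"unknown_tool_tier": 3},
    "gating": {"whitelist": ["get_weather"], "blacklist": ["format_disk"]},
}

# Tier examples taken from the feature list; a real table would be larger.
KNOWN_TIERS = {"read_file": 1, "sudo": 4, "transfer_funds": 4}


def pre_gate(tool: str, config: dict = CONFIG) -> tuple:
    """Return ("block", None) or ("allow", None) for listed tools,
    otherwise ("classify", tier) so the policy matrix can decide."""
    gating = config["gating"]
    if tool in gating["blacklist"]:
        return ("block", None)
    if tool in gating["whitelist"]:
        return ("allow", None)
    tier = KNOWN_TIERS.get(tool, config["classification"]["unknown_tool_tier"])
    return ("classify", tier)
```

Checking the blacklist first means a tool that appears on both lists is blocked, which is the more conservative choice.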
License
MIT
File details
Details for the file theron-0.1.0.tar.gz.
File metadata
- Download URL: theron-0.1.0.tar.gz
- Upload date:
- Size: 129.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 240eaa60696ca03e1eb3622624ba61ce37fb7bfff858cd08a3a8e9c5d67d556d |
| MD5 | 8bf2d6382fa125a35ca38ed65b85a997 |
| BLAKE2b-256 | 59b0d6a104e870a7d98e990af10ce636062a5a6d286d5012525d721a18e0c3cc |
Provenance
The following attestation bundles were made for theron-0.1.0.tar.gz:
Publisher: publish.yml on Mukund2/theron
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: theron-0.1.0.tar.gz
  - Subject digest: 240eaa60696ca03e1eb3622624ba61ce37fb7bfff858cd08a3a8e9c5d67d556d
  - Sigstore transparency entry: 869429302
  - Sigstore integration time:
- Permalink: Mukund2/theron@c2bc851d800276fa2bf3cf0e27ce36f996f70be1
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Mukund2
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c2bc851d800276fa2bf3cf0e27ce36f996f70be1
- Trigger Event: release
File details
Details for the file theron-0.1.0-py3-none-any.whl.
File metadata
- Download URL: theron-0.1.0-py3-none-any.whl
- Upload date:
- Size: 112.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa35808109cfcac6268239b90b1e6023dfeaebcc578f79b372e5fea894ae20be |
| MD5 | 2bcf91d407533f395f980fb2b0aebfc2 |
| BLAKE2b-256 | 489643736a2ddc259dd1038a51e86d5c72277cf37098bd6299fce57fd2949ae2 |
Provenance
The following attestation bundles were made for theron-0.1.0-py3-none-any.whl:
Publisher: publish.yml on Mukund2/theron
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: theron-0.1.0-py3-none-any.whl
  - Subject digest: aa35808109cfcac6268239b90b1e6023dfeaebcc578f79b372e5fea894ae20be
  - Sigstore transparency entry: 869429316
  - Sigstore integration time:
- Permalink: Mukund2/theron@c2bc851d800276fa2bf3cf0e27ce36f996f70be1
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Mukund2
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c2bc851d800276fa2bf3cf0e27ce36f996f70be1
- Trigger Event: release