Local-first runtime governance layer for AI systems
Guardian Runtime
Cut LLM token costs. Block data leaks. Ship safer AI.
Guardian Runtime is a local-first Python SDK that sits between your AI application and any LLM. It automatically compresses prompts to reduce token costs, blocks PII and API key leaks before they reach the model, and catches jailbreak attempts — all on your machine. Your prompts never leave your infrastructure.
User Input
↓
[Input Optimizer] → saves 30–70% tokens
↓
[Input Guard] → blocks PII, secrets, jailbreaks
↓
LLM → your model, your API key
↓
[Output Guard] → scans response before it reaches user
↓
Safe Response + Analysis Report
Why Guardian?
Observability tools (Langfuse, LangSmith, Helicone) log what went wrong — after it happened. Guardian stops it before it happens, on your machine.
| | Observability Tools | Guardian Runtime |
|---|---|---|
| When it acts | After the LLM call | Before and after |
| Your data path | Sent to their cloud | Stays on your machine |
| PII in prompt | Logged | Blocked |
| Exposed API keys | Not detected | Blocked |
| Token costs | Tracked | Actively reduced |
| Jailbreak attempts | Logged | Blocked |
Install
pip install guardian-runtime
Optional extras:
pip install guardian-runtime[optimizer] # PDF/DOCX → markdown conversion
pip install guardian-runtime[dev] # testing tools
Requires Python 3.9+
Supported Providers
| Provider | Environment Variable | Policy File | Default Model |
|---|---|---|---|
| Google Gemini | GEMINI_API_KEY | policies/gemini.yaml | gemini-2.0-flash |
| Anthropic Claude | ANTHROPIC_API_KEY | policies/anthropic.yaml | claude-3-5-haiku-latest |
| OpenAI | OPENAI_API_KEY | policies/minimal.yaml | gpt-4o-mini |
Override provider at runtime:
response = guardian.complete(
    provider="anthropic",
    model="claude-3-5-haiku-latest",
    messages=[...]
)
Features
Security & Privacy
- PII Detection — Aadhaar (xxxx xxxx xxxx), PAN (XXXXX0000X), UPI (name@bank), SSN (xxx-xx-xxxx), credit cards, email, phone, passport numbers
- Secret Detection — OpenAI keys (sk-...), AWS access keys (AKIA...), GitHub tokens (ghp_...), Stripe keys (sk_live_...), Razorpay keys (rzp_live_...), Groq keys (gsk_...), generic .env patterns
- Jailbreak Detection — 40+ patterns covering DAN variants, instruction overrides, role-play injections, encoding tricks, and system prompt extraction attempts
- Output Guard — scans LLM responses for PII and secrets before they reach users
- Action modes — block, redact, or flag per entity type
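As a rough illustration of per-entity action modes, the sketch below scans text with a few simplified regexes and applies block, redact, or flag. The patterns, the `scan` function, and the `ACTIONS` table are assumptions made for this example; they are not Guardian's actual detectors or API.

```python
import re

# Illustrative patterns only; production detectors need stricter validation
# (checksums, surrounding context) than these regexes provide.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aadhaar": re.compile(r"\b\d{4} \d{4} \d{4}\b"),
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}

# Hypothetical per-entity action table: "block", "redact", or "flag".
ACTIONS = {"ssn": "block", "email": "redact", "aws_access_key": "block"}


def scan(text, actions=ACTIONS, default_action="flag"):
    """Return (possibly redacted text, blocked flag, list of violations)."""
    violations = []
    blocked = False
    for entity, pattern in PATTERNS.items():
        if not pattern.search(text):
            continue
        action = actions.get(entity, default_action)
        violations.append({"entity": entity, "action": action})
        if action == "block":
            blocked = True
        elif action == "redact":
            # Replace every match with a labelled placeholder.
            text = pattern.sub(f"[{entity.upper()}_REDACTED]", text)
    return text, blocked, violations
```

Entities without an entry in the action table fall back to `flag`, so a new pattern is reported without changing what reaches the model until an explicit action is chosen for it.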
Cost Optimization
- Prompt compression — strips redundant whitespace, deduplicates system prompts, removes empty messages
- History trimming — keeps last N turns, always preserves system prompt
- Document conversion — PDF, DOCX, XLSX → clean markdown (30–70% token savings)
- Token budget enforcement — warn or block when input exceeds your defined limit
- Cost estimation — every response includes token count and USD cost estimate
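The compression and trimming steps listed above can be pictured with a short standalone sketch. The function name `compress_messages` and the message shape (dicts with `role` and `content`) are assumptions for illustration, not Guardian's internal implementation.

```python
import re


def compress_messages(messages, max_history_messages=10):
    """Normalize whitespace, drop empty and duplicate system messages, trim history."""
    cleaned = []
    seen_system = set()
    for msg in messages:
        # Collapse runs of spaces/tabs and runs of blank lines.
        content = re.sub(r"[ \t]+", " ", msg["content"])
        content = re.sub(r"\n\s*\n+", "\n\n", content).strip()
        if not content:
            continue  # remove empty messages
        if msg["role"] == "system":
            if content in seen_system:
                continue  # deduplicate identical system prompts
            seen_system.add(content)
        cleaned.append({"role": msg["role"], "content": content})

    system = [m for m in cleaned if m["role"] == "system"]
    history = [m for m in cleaned if m["role"] != "system"]
    # Keep only the most recent turns; system prompts are always preserved
    # and, in this sketch, moved to the front of the list.
    recent = history[-max_history_messages:] if max_history_messages > 0 else []
    return system + recent
```

Trimming happens after cleaning so that empty or duplicate messages do not count against the history limit.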
Governance
- YAML policy engine — define rules per agent, no code changes needed to update policies
- Multi-agent support — different rules for different bots
- Block / redact / flag modes — choose the right action per violation type
- Local audit logs — full JSONL log at ~/.guardian/logs/ — never uploaded anywhere
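For readers unfamiliar with JSONL audit logs, here is a minimal sketch of appending one record per line to a date-named file and reading back the most recent entries. The function names and record fields are hypothetical; only the one-file-per-day `YYYY-MM-DD.jsonl` naming comes from this README.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_violation(record, log_dir):
    """Append one violation record as a JSON line to <log_dir>/YYYY-MM-DD.jsonl."""
    now = datetime.now(timezone.utc)
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{now:%Y-%m-%d}.jsonl"
    entry = {"timestamp": now.isoformat(), **record}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return path


def tail(path, n=20):
    """Return the last n records from a JSONL file."""
    with Path(path).open(encoding="utf-8") as fh:
        lines = [line for line in fh if line.strip()]
    return [json.loads(line) for line in lines[-n:]]
```

Because each line is an independent JSON document, the log can be appended to safely and inspected with ordinary line-oriented tools.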
CLI
guardian init --key gdn_free_xxxxx # optional license setup
guardian validate policies/gemini.yaml # check policy syntax
guardian status # view usage this month
guardian logs --tail 20 # view recent violations
Policy Example
version: "1.0"
agents:
  default:
    llm:
      provider: gemini
      default_model: gemini-2.0-flash
    input_guard:
      pii_detection: true
      jailbreak_detection: true
      pii_action: block
    output_guard:
      pii_detection: true
    optimizer:
      enabled: true
      whitespace_normalization: true
      max_history_messages: 10
      deduplicate_system_prompts: true
    cost:
      max_input_tokens: 8000
Validate before use:
guardian validate policies/gemini.yaml
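The `cost.max_input_tokens` setting in the policy above can be pictured as a simple pre-flight check. The sketch below uses a plain dict in place of the parsed YAML and a crude four-characters-per-token estimate; both, along with the `check_budget` function and its `mode` argument, are assumptions for illustration rather than Guardian's actual token counter.

```python
# Stand-in for the parsed "cost" section of a policy file.
POLICY = {"cost": {"max_input_tokens": 8000}}


def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4)


def check_budget(messages, policy, mode="block"):
    """Compare estimated input tokens with the policy limit.

    mode="warn" only reports the overrun; mode="block" also marks the call blocked.
    """
    limit = policy["cost"]["max_input_tokens"]
    used = sum(estimate_tokens(m["content"]) for m in messages)
    over = used > limit
    return {
        "input_tokens": used,
        "limit": limit,
        "over_budget": over,
        "blocked": over and mode == "block",
    }
```

A real deployment would swap `estimate_tokens` for the tokenizer of the target model, since character-based estimates can be off by a wide margin.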
Analysis Report
Every guardian.complete() call returns a full analysis:
response = guardian.complete(messages=[...])
print(response.content) # safe, validated response
print(response.blocked) # True if blocked
print(response.violations) # list of what was caught
print(response.input_tokens) # tokens used
print(response.estimated_cost_usd) # cost in USD
print(response.optimization) # tokens saved, savings %
Example output:
{
    "blocked": False,
    "input_tokens": 620,
    "estimated_cost_usd": 0.0004,
    "optimization": {
        "original_tokens": 1840,
        "optimized_tokens": 620,
        "savings_pct": 0.66,
        "actions_taken": ["whitespace_normalization", "history_trimming"]
    },
    "violations": []
}
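In the example output, `savings_pct` is expressed as a fraction: (1840 - 620) / 1840 is roughly 0.66. The sketch below shows that arithmetic together with a generic per-token cost estimate. Both helper functions and the price parameter are hypothetical, and the price used in any call is a placeholder rather than a real provider rate.

```python
def optimization_summary(original_tokens, optimized_tokens):
    """Compute tokens saved and the savings fraction, rounded to two decimals."""
    saved = original_tokens - optimized_tokens
    savings_pct = round(saved / original_tokens, 2) if original_tokens else 0.0
    return {
        "original_tokens": original_tokens,
        "optimized_tokens": optimized_tokens,
        "tokens_saved": saved,
        "savings_pct": savings_pct,
    }


def estimate_cost_usd(input_tokens, usd_per_million_tokens):
    """Estimate input cost from a per-million-token price supplied by the caller."""
    return input_tokens * usd_per_million_tokens / 1_000_000
```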
Compliance
Guardian's detection covers major data protection regulations:
| Regulation | Coverage |
|---|---|
| India DPDP Act 2023 | Aadhaar, PAN, UPI — native patterns |
| GDPR (EU) | Email, phone, passport, general PII |
| HIPAA (US Health) | Sensitive personal data blocking |
| CCPA (California) | Consumer data protection |
Guardian is an assistive compliance tool, not legal advice. Always consult qualified counsel for regulatory requirements.
Local-First Architecture
Your Machine
├── guardian SDK          ← all processing happens here
└── ~/.guardian/
    ├── config.json       ← license key (if using paid plan)
    ├── usage.json        ← monthly check count
    └── logs/             ← violation logs (never uploaded)
        └── YYYY-MM-DD.jsonl
What Guardian's servers never receive:
- Your prompts
- Your LLM responses
- Your violation details
- Your API keys (OpenAI, Anthropic, etc.)
What the optional daily sync sends (once per day, HTTPS only):
- Your hashed license key
- A single number: how many checks you ran
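To make the shape of such a sync payload concrete, here is a minimal sketch that hashes a license key and pairs it with a check count. The choice of SHA-256, the field names, and the `build_sync_payload` function are all assumptions; the README does not specify the hash algorithm or wire format.

```python
import hashlib
import json


def build_sync_payload(license_key, check_count):
    """Build a JSON payload containing only a hashed key and a usage count."""
    # SHA-256 is an assumption; the point is that the raw key is never sent.
    key_hash = hashlib.sha256(license_key.encode("utf-8")).hexdigest()
    return json.dumps({"license_hash": key_hash, "checks": check_count})
```

The payload carries no prompt text, response text, or violation details, which matches the two items listed above.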
Project Structure
guardian/
├── core/               engine, policy, models, storage
├── guards/             input guard, output guard, validators
│   └── validators/     pii, secrets, jailbreak, hallucination
├── optimizer/          prompt compression, document converter
├── providers/          openai, gemini, anthropic
├── finops/             token counter, cost calculator
├── logging/            local JSONL logger
└── cli/                init, validate, status, logs
Development
git clone https://github.com/ashp15205/guardian-runtime.git
cd guardian-runtime
pip install -e ".[dev,optimizer]"
pytest tests/ -q # 111 tests
See ARCHITECTURE.md for the full technical specification.
Privacy
- All scanning runs on your infrastructure
- Logs stored locally at ~/.guardian/logs/
- Optional license sync sends only a hashed key + check count — never prompts or responses
File details
Details for the file guardian_runtime-0.2.2.tar.gz.
File metadata
- Download URL: guardian_runtime-0.2.2.tar.gz
- Upload date:
- Size: 88.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 929484487d3ccb68f3cc5bf20a583850b3ef0ef4f5a1d14eba5afec60544d083 |
| MD5 | 3266f80c74dbc55fcaa1f28b4c362217 |
| BLAKE2b-256 | ad6b747a35e3baa01a73f2e278f42cbdf419256d72e9aa11661756b5fbac0e0c |
File details
Details for the file guardian_runtime-0.2.2-py3-none-any.whl.
File metadata
- Download URL: guardian_runtime-0.2.2-py3-none-any.whl
- Upload date:
- Size: 49.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c3c413b8b3310ab834d48d6db476223af409344abe9f48ffc75d368176017253 |
| MD5 | 35cec4fac7eefb20e1b910a0f0f0c0ec |
| BLAKE2b-256 | 558ace89d1da9525deda60653e04a70e021ae8de64e253f2ae26dfd9b5b397ef |
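A downloaded file can be checked against the SHA256 digests listed above using only the standard library. This is a generic sketch; the helper names are illustrative, and the file path passed in is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA-256 digest matches the expected value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify("guardian_runtime-0.2.2.tar.gz", "<SHA256 from the table>")` should return True for an intact download of the source distribution.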