Local-first runtime governance layer for AI systems
Guardian Runtime
Open-source, local-first AI governance & cost optimization.
Guardian Runtime is a Python SDK that sits between your AI application and any LLM — intercepting every prompt and response to block data leaks, prevent jailbreaks, and automatically reduce your token costs by up to 40%. Everything runs locally. Your data never leaves your infrastructure.
User Input → [Input Optimizer] → [Input Guard] → LLM → [Output Guard] → User
                     ↓                 ↓                     ↓
               Saves tokens   Blocks PII/secrets     Blocks output PII
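The flow above can be read as a chain of stages, each of which may rewrite the text or stop the request. The sketch below only illustrates that idea: `StageResult`, `run_pipeline`, and the toy stages are hypothetical names rather than Guardian's API, the regexes are deliberately simplistic, and the LLM is replaced by a stub.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StageResult:
    text: str                         # text handed to the next stage
    blocked_by: Optional[str] = None  # name of the stage that stopped the request


Stage = Callable[[str], StageResult]


def input_optimizer(text: str) -> StageResult:
    # Toy optimizer: collapse runs of whitespace to save tokens.
    return StageResult(" ".join(text.split()))


def input_guard(text: str) -> StageResult:
    # Toy guard: block anything shaped like a US SSN (123-45-6789).
    if re.search(r"\b\d{3}-\d{2}-\d{4}\b", text):
        return StageResult(text, blocked_by="input_guard")
    return StageResult(text)


def fake_llm(text: str) -> StageResult:
    # Stand-in for a real provider call.
    return StageResult(f"echo: {text}")


def output_guard(text: str) -> StageResult:
    # Toy guard: block responses that contain an email address.
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text):
        return StageResult(text, blocked_by="output_guard")
    return StageResult(text)


def run_pipeline(user_input: str, stages: List[Stage]) -> StageResult:
    result = StageResult(user_input)
    for stage in stages:
        result = stage(result.text)
        if result.blocked_by:
            return result  # stop at the first stage that blocks
    return result


PIPELINE = [input_optimizer, input_guard, fake_llm, output_guard]
ok = run_pipeline("hello    world", PIPELINE)
stopped = run_pipeline("my SSN is 123-45-6789", PIPELINE)
```

Because a blocked request returns before later stages run, a prompt stopped by the input guard never reaches the provider stub.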
Why Guardian?
| Problem | How Guardian Solves It |
|---|---|
| PII leaks to LLM providers | Local NER scanning blocks SSNs, Aadhaar, API keys before the prompt leaves your server |
| Exploding AI costs | Input Optimizer compresses prompts, converts PDFs to markdown, trims chat history — saving 30-70% tokens |
| No runtime controls | YAML policy engine enforces per-agent rules without code changes |
| Jailbreak attacks | 40+ pattern detection blocks prompt injection attempts |
| Compliance burden | Built for GDPR, HIPAA, CCPA, and India DPDP out of the box |
Existing tools (Langfuse, Helicone, LangSmith) only observe traffic. Guardian actively prevents bad behavior at the moment it happens.
Install
pip install guardian-runtime
Requires Python 3.9+
Quickstart
3 Lines to Governed AI
from guardian import Guardian

user_input = "Summarize our Q3 report"  # example prompt

guardian = Guardian.from_policy("policies/minimal.yaml")
response = guardian.complete(
    model="gpt-4o",
    messages=[{"role": "user", "content": user_input}],
)
if response.blocked:
    print(f"Blocked: {response.violations[0].type}")
else:
    print(response.content)
if response.optimization:
    print(f"Tokens saved: {response.optimization['savings_pct']:.0%}")
Scan Without an LLM Key
from guardian import scan_pii, scan_secrets
result = scan_pii("My Aadhaar is 1234 5678 9012")
print(result.blocked) # True
result = scan_secrets("My key is sk-proj-xxxxxxxxxxxxxxxxxxxx")
print(result.blocked) # True
Optimize Prompts Standalone
from guardian import optimize_input, convert_document
# Compress a messy conversation
result = optimize_input(messages, model="gpt-4o")
print(f"Saved {result.savings_pct:.0%} of tokens")
# Convert a heavy PDF to token-efficient markdown
doc = convert_document("contract.pdf") # requires: pip install guardian-runtime[optimizer]
print(f"{doc.markdown_tokens} tokens (was {doc.original_size_bytes} bytes)")
Features
🛡️ Security & Privacy
- PII Detection — SSN, credit cards, email, phone, passport, Aadhaar, PAN, UPI
- Secret Detection — OpenAI, Anthropic, AWS, GitHub, Stripe, Razorpay, Groq keys
- Jailbreak Detection — 40+ patterns (DAN, ignore instructions, role-play attacks)
- Output Guard — scans LLM responses for leaked PII before reaching the user
- Action modes: `block`, `redact`, or `flag` per entity type
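To make the per-entity action modes concrete, here is a minimal sketch of how `block`, `redact`, and `flag` could differ in effect. It is not Guardian's implementation: `PATTERNS`, `ScanOutcome`, and `apply_actions` are hypothetical names, the detection uses simplified regexes rather than the local NER scanning mentioned above, and treating unlisted entities as `block` is an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Deliberately simplified patterns, for illustration only.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aadhaar": re.compile(r"\b\d{4} \d{4} \d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
}


@dataclass
class ScanOutcome:
    text: str
    blocked: bool = False
    flags: List[str] = field(default_factory=list)


def apply_actions(text: str, actions: Dict[str, str]) -> ScanOutcome:
    """Apply a block/redact/flag action for each entity type found in text."""
    outcome = ScanOutcome(text)
    for entity, pattern in PATTERNS.items():
        if not pattern.search(outcome.text):
            continue
        action = actions.get(entity, "block")  # assumed default: block
        if action == "block":
            outcome.blocked = True
        elif action == "redact":
            outcome.text = pattern.sub(f"[{entity.upper()}]", outcome.text)
        elif action == "flag":
            outcome.flags.append(entity)
        else:
            raise ValueError(f"unknown action {action!r} for {entity}")
    return outcome


actions = {"ssn": "block", "aadhaar": "block", "email": "redact"}
redacted = apply_actions("Reach me at jane@example.com", actions)
blocked = apply_actions("My Aadhaar is 1234 5678 9012", actions)
```

In this sketch `redact` rewrites the text and lets the request continue, `flag` only records the finding, and `block` marks the whole request as rejected.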
⚡ Cost Optimization (Input Optimizer)
- Prompt compression — strips whitespace, deduplicates system prompts, removes empty messages
- History trimming — keeps last N turns, always preserves system prompt
- Document conversion — PDF/DOCX/XLSX → clean markdown via Microsoft MarkItDown (40-70% token savings)
- Token budget enforcement — warn or block when input exceeds limits
- Proactive guidance — logs suggestions when bloated prompts are detected
- Savings tracking: every `GuardianResponse` includes optimization metadata
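The compression and trimming steps listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the Input Optimizer itself: `compress_messages` and `savings_pct` are hypothetical helpers, "last N turns" is interpreted as the last N non-system messages, system prompts are moved to the front, and savings are estimated from character counts instead of real token counts.

```python
from typing import Dict, List

Message = Dict[str, str]


def compress_messages(messages: List[Message], max_history: int = 20) -> List[Message]:
    """Normalize whitespace, drop empty and duplicate system messages, trim history."""
    system: List[Message] = []
    history: List[Message] = []
    seen_system = set()
    for msg in messages:
        content = " ".join(msg["content"].split())  # collapse whitespace
        if not content:
            continue  # remove empty messages
        if msg["role"] == "system":
            if content in seen_system:
                continue  # deduplicate identical system prompts
            seen_system.add(content)
            system.append({"role": "system", "content": content})
        else:
            history.append({"role": msg["role"], "content": content})
    # Keep every system prompt plus the last `max_history` other messages.
    trimmed = history[-max_history:] if max_history > 0 else []
    return system + trimmed


def savings_pct(before: List[Message], after: List[Message]) -> float:
    """Rough savings estimate based on character counts, not real tokens."""
    original = sum(len(m["content"]) for m in before)
    compact_size = sum(len(m["content"]) for m in after)
    return 0.0 if original == 0 else 1 - compact_size / original


messages = [
    {"role": "system", "content": "You are   helpful."},
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "  hi  "},
    {"role": "assistant", "content": ""},
    {"role": "user", "content": "second"},
    {"role": "user", "content": "third"},
]
compact = compress_messages(messages, max_history=2)
print(f"Saved about {savings_pct(messages, compact):.0%} of characters")
```

A real optimizer would measure savings with the target model's tokenizer; character counts are used here only to keep the example dependency-free.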
🔧 Governance Engine
- YAML policies — define rules per agent, no code changes needed
- Multi-agent — different rules for different bots (HR-Bot vs Support-Bot)
- Multi-provider — OpenAI and Google Gemini supported (Anthropic coming soon)
- Local JSONL logs: full audit trail at `~/.guardian/logs/`
- CLI: `guardian init`, `validate`, `status`, `logs`
- FinOps: token counting, cost estimation, per-session spend tracking
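As a rough picture of the FinOps item, the sketch below estimates a request's cost and checks it against a per-session limit. Everything in it is assumed for illustration: the model name, the prices, the four-characters-per-token heuristic, and the `SessionBudget` class are hypothetical and do not come from Guardian.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens; real provider prices differ.
PRICE_PER_1K = {"example-model": {"input": 0.50, "output": 1.50}}


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); real tokenizers differ.
    return max(1, len(text) // 4)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    price = PRICE_PER_1K[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1000


@dataclass
class SessionBudget:
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, cost: float) -> bool:
        """Record the cost and return True if it fits the limit, else return False."""
        if self.spent_usd + cost > self.limit_usd:
            return False  # spend is left unchanged when the limit would be exceeded
        self.spent_usd += cost
        return True


budget = SessionBudget(limit_usd=1.00)
cost = estimate_cost("example-model", input_tokens=800, output_tokens=200)
first_ok = budget.charge(cost)   # fits within the limit
second_ok = budget.charge(cost)  # would exceed the limit, so it is refused
```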
🔒 100% Local-First
- All governance runs on your infrastructure
- No prompts sent to Guardian servers — ever
- One daily sync sends only: license key + check count (number only)
- Built for regulated industries: finance, healthcare, government
Policy Example
version: "1.0"
name: "production"
agents:
  default:
    llm:
      provider: openai
      default_model: gpt-4o-mini
    input_guard:
      pii_detection: true
      jailbreak_detection: true
    output_guard:
      pii_detection: true
    optimizer:
      enabled: true
      whitespace_normalization: true
      max_history_messages: 20
      deduplicate_system_prompts: true
    cost:
      max_input_tokens: 8000
      per_session_limit: 1.00
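One way to picture how such a policy drives runtime decisions is to look up an agent's section and compare a request against its limits. The sketch below is hypothetical: it hard-codes the policy as the dict a YAML loader would typically produce, assumes each guard section is nested under its agent, assumes unknown agents fall back to `default`, and uses made-up helper names (`agent_policy`, `check_input_budget`).

```python
from typing import Any, Dict

# Dict form of the policy example, with the nesting assumed as described above.
POLICY: Dict[str, Any] = {
    "version": "1.0",
    "name": "production",
    "agents": {
        "default": {
            "llm": {"provider": "openai", "default_model": "gpt-4o-mini"},
            "input_guard": {"pii_detection": True, "jailbreak_detection": True},
            "output_guard": {"pii_detection": True},
            "optimizer": {
                "enabled": True,
                "whitespace_normalization": True,
                "max_history_messages": 20,
                "deduplicate_system_prompts": True,
            },
            "cost": {"max_input_tokens": 8000, "per_session_limit": 1.00},
        }
    },
}


def agent_policy(policy: Dict[str, Any], agent: str) -> Dict[str, Any]:
    """Return the agent's section, falling back to 'default' (assumed behavior)."""
    agents = policy["agents"]
    return agents.get(agent, agents["default"])


def check_input_budget(agent_cfg: Dict[str, Any], input_tokens: int) -> str:
    """Return 'block' when the input exceeds max_input_tokens, else 'allow'."""
    limit = agent_cfg.get("cost", {}).get("max_input_tokens")
    if limit is not None and input_tokens > limit:
        return "block"
    return "allow"


cfg = agent_policy(POLICY, "hr-bot")  # no 'hr-bot' section, so 'default' applies
decision = check_input_budget(cfg, input_tokens=9000)
```

Keeping the limits in data rather than code is what lets a policy change take effect without touching the application.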
Compliance
Guardian's PII detection covers real regulatory requirements:
- GDPR (EU) — email, phone, passport, general PII
- HIPAA (US health) — sensitive personal data blocking
- CCPA (California) — consumer data protection
- DPDP Act 2023 (India) — Aadhaar, PAN, UPI + general PII
- SOC2 / Enterprise — local-only processing, no prompt upload to vendor cloud
⚠️ Guardian is an assistive compliance tool, not legal advice. Always consult qualified counsel.
CLI
guardian init --key gdn_free_xxxxx # Setup (optional)
guardian validate policies/minimal.yaml # Check policy syntax
guardian status # View usage stats
guardian logs --tail 10 # View recent violation logs
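Since the audit trail is plain JSONL, recent violations can also be read with a few lines of Python. The sketch below is an assumption-heavy illustration: the field names (`ts`, `agent`, `violation`), the file name, and both helper functions are hypothetical, and a temporary directory stands in for `~/.guardian/logs/`.

```python
import json
import tempfile
from collections import deque
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


def append_event(log_file: Path, agent: str, violation: str) -> None:
    """Append one audit event as a single JSON line."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "violation": violation,
    }
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def tail_events(log_file: Path, n: int = 10) -> List[Dict[str, str]]:
    """Return the last n events, oldest first."""
    with log_file.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in deque(fh, maxlen=n)]


# A temporary directory stands in for the real log location in this sketch.
log_path = Path(tempfile.mkdtemp()) / "logs" / "violations.jsonl"
for kind in ("pii", "secret", "jailbreak"):
    append_event(log_path, agent="support-bot", violation=kind)
recent = tail_events(log_path, n=2)
```

One JSON object per line means the file can be appended to safely and tailed without parsing the whole log.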
Architecture
guardian/
├── core/ # Engine, policy, models, storage, license
├── guards/ # Input & Output guards
│ └── validators/ # PII, secrets, jailbreak detectors
├── optimizer/ # Input Optimizer, Document Converter (MarkItDown)
├── finops/ # Token counter, cost calculator
├── providers/ # OpenAI, Gemini (Anthropic coming)
├── logging/ # Local JSONL logger
└── cli/ # CLI commands
See ARCHITECTURE.md for the full technical specification.
Development
pip install guardian-runtime[dev]
pytest tests/ # 106 tests (integration tests mock LLM providers)
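For readers unfamiliar with mocking a provider, here is a small self-contained sketch of the pattern using `unittest.mock`. It does not reproduce Guardian's test suite: `governed_complete` is a hypothetical wrapper, the provider's `complete` method is an assumed interface, and the secret pattern is simplified.

```python
import re
from unittest.mock import Mock

SECRET_PATTERN = re.compile(r"\bsk-[A-Za-z0-9-]{16,}\b")  # simplified, illustrative


def governed_complete(provider, prompt: str) -> dict:
    """Hypothetical wrapper: refuse prompts with secrets, else call the provider."""
    if SECRET_PATTERN.search(prompt):
        return {"blocked": True, "content": None}
    return {"blocked": False, "content": provider.complete(prompt)}


def test_secret_never_reaches_provider():
    provider = Mock()
    result = governed_complete(provider, "My key is sk-proj-xxxxxxxxxxxxxxxxxxxx")
    assert result["blocked"]
    provider.complete.assert_not_called()


def test_clean_prompt_is_forwarded():
    provider = Mock()
    provider.complete.return_value = "hi there"
    result = governed_complete(provider, "Say hi")
    assert result == {"blocked": False, "content": "hi there"}
    provider.complete.assert_called_once_with("Say hi")
```

Mocking the provider keeps such tests offline and free of API keys while still verifying that blocked prompts are never forwarded.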
Roadmap
| Version | Target | Highlights |
|---|---|---|
| 0.1.1 | ✅ Jun 1 | PyPI, PII/secrets, policy schema |
| 0.2.0 | ✅ Jun 2 | Full engine, guards, CLI, logs, Input Optimizer |
| 1.0.0 | Jun 30 | Anthropic provider, launch polish, v1.0 PyPI |
| 1.1.0 | Jul 2026 | LangChain callback, hallucination judge, budget hard-stop |
| 2.0.0 | Sep 2026 | Developer portal, license server, paid tiers |
Project Docs
- V1_LAUNCH_PLAN.md — June sprint to 1.0.0
- PLAN.md — 16-week product & monetization plan
- ARCHITECTURE.md — technical design
- CHANGELOG.md — release history
License
Apache-2.0