guardian-runtime

Local-first runtime governance layer for AI systems

These details have not been verified by PyPI

Project links

Project description

Guardian Runtime

Local-first runtime governance for AI systems.
Your prompts never leave your machine. Your agents never go rogue.

 ┌─────────────────────────────────────────────────────────────────────┐
 │                    YOUR LOCAL INFRASTRUCTURE                         │
 │                                                                      │
 │  ┌────────────────────────────────────────┐                          │
 │  │         YOUR AI APPLICATION            │                          │
 │  │   (LangChain / CrewAI / Raw OpenAI)    │                          │
 │  └─────────────────┬──────────────────────┘                          │
 │                    │ prompt / tool call                              │
 │                    ▼                                                 │
 │  ┌─────────────────────────────────────────────────────────────┐     │
 │  │                  ⛨  GUARDIAN RUNTIME                        │     │
 │  │                                                             │     │
 │  │  ┌─────────────┐  ┌──────────────┐  ┌──────────────────┐   │     │
 │  │  │ INPUT GUARD │  │POLICY ENGINE │  │  TOOL GOVERNOR   │   │     │
 │  │  │ • PII scan  │  │ • YAML rules │  │ • Allowlists     │   │     │
 │  │  │ • Secrets   │  │ • Per-agent  │  │ • Rate limits    │   │     │
 │  │  │ • Jailbreak │  │ • Env-aware  │  │ • Arg validation │   │     │
 │  │  │ • Scope     │  │              │  │                  │   │     │
 │  │  └──────┬──────┘  └──────────────┘  └──────────────────┘   │     │
 │  │         │                                                   │     │
 │  │  ┌──────▼──────┐  ┌──────────────┐  ┌──────────────────┐   │     │
 │  │  │ COST ENGINE │  │ OUTPUT GUARD │  │  LOCAL LOGGER    │   │     │
 │  │  │ • Budgets   │  │ • Hallucin.  │  │ • JSONL to disk  │   │     │
 │  │  │ • Routing   │  │ • PII check  │  │ • Never uploaded │   │     │
 │  │  │ • Loop det. │  │ • Profanity  │  │ ~/.guardian/logs │   │     │
 │  │  └──────┬──────┘  └──────┬───────┘  └──────────────────┘   │     │
 │  └─────────┼────────────────┼─────────────────────────────────┘     │
 │            │                │                                        │
 └────────────┼────────────────┼────────────────────────────────────────┘
              │ response / block
              ▼
 ┌─────────────────────────────────┐        ┌────────────────────────────┐
 │      LLM (YOUR API KEY)         │        │  ⛨ GUARDIAN LICENSE SERVER │
 │  OpenAI / Anthropic / Gemini    │        │  Daily ping: key + count   │
 └─────────────────────────────────┘        │  Response:  valid / limit  │
                                            │  That's it. Nothing else.  │
                                            └────────────────────────────┘

The Problem

Every team shipping AI in production faces the same silent failures:

What Goes Wrong	How It Happens	Real Consequence
PII Leakage	User types their Aadhaar/SSN into a chat	DPDP Act fine up to ₹250 crore
Secret Exposure	Agent reads .env file or dev pastes API keys	OpenAI/AWS key compromised, must rotate immediately
Jailbreak	"Ignore all previous instructions…" bypasses your system prompt	Data exfiltration, brand damage
Hallucination	Agent cites facts that don't exist in your knowledge base	Wrong medical, legal, financial advice
Budget Blowout	Recursive agent loop runs GPT-4 for 3 minutes straight	$500 cloud bill, no warning
Tool Misuse	Agent calls `delete_records()` without authorization	Data loss, security incident

Existing tools only tell you after the damage is done. Langfuse, LangSmith, and Helicone are excellent observability platforms — but they are passive. They log what happened. They do not stop what is happening.

Worse, they require your prompts and responses to travel to their cloud. For a bank, a hospital, or a government agency, that is a non-starter.

Guardian Runtime is the missing enforcement layer. It intercepts every call before the model sees it and every response before the user sees it — entirely on your machine.

Local-First by Design

Guardian uses a Local-First, License Key model — the same model used by Cursor.

What stays on YOUR machine (forever):
  ✅  Every prompt you send
  ✅  Every LLM response you receive
  ✅  Every violation log entry
  ✅  Your OpenAI / Anthropic API key
  ✅  Your policy YAML
  ✅  The full Guardian Analysis Sheet

What our server receives (once per day, nothing else):
  → Your license key (to verify it's valid)
  → One integer: how many checks you ran this month
  ← Our response: valid/invalid + your plan limit

That is the complete list. No exceptions. Ever.

Why this matters for your users:

Developers and enterprise teams building AI products are worried about two things above all others:

Their users' sensitive data leaving their infrastructure
Their LLM API keys being exposed

Guardian solves both structurally, not by policy.

How the License Key Works

# Step 1: Sign up at guardian-ai.dev — get a free license key instantly
# Step 2: Install
pip install guardian-runtime

# Step 3: Initialize (stores key locally at ~/.guardian/config.json)
guardian init --key gdn_live_YOUR_KEY_HERE

# Step 4: Check your status
guardian status

License:  gdn_live_****...****a1b2
Plan:     Starter
Checks:   342 / 10,000 this month  [████████░░░░░░░░░░░░]  34%
Expiry:   2027-01-15
Status:   ACTIVE ✅

How enforcement works locally:

~/.guardian/
├── config.json     ← license key, plan, limit, expiry
├── usage.json      ← { "month": "2026-06", "checks": 342, "last_sync": "..." }
└── logs/
    └── 2026-06-01.jsonl   ← every check, every violation, on your disk

Checks are counted locally in usage.json — no cloud required
Once per day, the SDK pings guardian-ai.dev/api/validate with { key, count }
If our server is down: SDK continues working (fail-open)
Enterprise: offline license keys available — zero network requests, ever

Quickstart

1. Install & Initialize

pip install guardian-runtime
guardian init --key gdn_live_YOUR_KEY

2. Create a Policy File

# guardian_policy.yaml
version: "1.0"

agents:
  default:
    input_guard:
      pii_detection: true
      pii_entities: [aadhaar, pan, upi_id, credit_card, ssn, email, phone, secret]
      jailbreak_detection: true

    output_guard:
      hallucination_check: true
      pii_detection: true

    cost:
      daily_budget: 5.00
      per_session_limit: 0.50

3. Three Lines to Govern Any LLM Call

from guardian import Guardian

guardian = Guardian.from_policy("guardian_policy.yaml")

response = guardian.complete(
    model="gpt-4",
    messages=[{"role": "user", "content": user_input}],
    agent_id="support-bot",
)

print(response.content)
print(response.guardian_analysis)

What you get back:

┌──────────────────────────────────────────────────┐
│           GUARDIAN ANALYSIS SHEET                 │
├──────────────────────────────────────────────────┤
│ Input:                                           │
│   Tokens:    312   Est. Cost: $0.0187            │
│   PII:       None                                │
│   Secrets:   None                                │
│   Jailbreak: None                                │
│   Status:    ALLOWED ✅                          │
│                                                  │
│ Output:                                          │
│   Tokens:    89    Actual Cost: $0.0053          │
│   Hallucination: Low (grounded in context)       │
│   PII in response: None                          │
│   Status:    CLEAN ✅                            │
│                                                  │
│ Session:  $0.024 spent / $0.50 limit             │
└──────────────────────────────────────────────────┘

Core Features

🛡️ Input Guard — Block before the model sees it

# guardian_policy.yaml — input guard section
input_guard:
  pii_detection: true
  pii_entities:
    - aadhaar      # India DPDP Act — native detection
    - pan          # India DPDP Act
    - upi_id       # India DPDP Act (suffix-gated: user@ybl detected, user@gmail.com ignored)
    - credit_card  # Luhn-validated
    - ssn          # US Social Security Number
    - email
    - phone        # Indian (+91) and US formats
    - passport
    - secret       # API keys, tokens, credentials (OpenAI, AWS, GitHub, Stripe, Razorpay, etc.)
  pii_action: block         # "block" | "redact" | "flag"
  jailbreak_detection: true
  scope:
    allowed_topics: ["billing", "product", "support"]
    block_message: "I can only help with billing, product, and support questions."

Detects 50+ jailbreak patterns across 5 categories: DAN variants, instruction override, role-play injection, base64 encoding tricks, and system prompt extraction.

Detects 12+ API key formats at HIGH confidence (OpenAI sk-, AWS AKIA, GitHub ghp_, Stripe sk_live_, Razorpay rzp_live_, Groq gsk_, etc.) and generic KEY=value patterns at MEDIUM confidence.

🔍 Output Guard — Catch bad responses before users see them

# guardian_policy.yaml — output guard section
output_guard:
  hallucination_check: true
  hallucination_provider: openai  # "openai", "anthropic", "ollama", "gemini"
  hallucination_model: gpt-4o-mini  # Bring Your Own Model (BYOM)
  pii_detection: true
  profanity_filter: true
  competitor_block: ["CompetitorA", "CompetitorB"]

Hallucination detection uses the LLM-as-judge pattern. Instead of locking you into our infrastructure, Guardian uses a "Bring Your Own Model" (BYOM) architecture via LiteLLM. You can use your existing OpenAI key, Anthropic key, or point it to a 100% free, local Ollama instance (e.g., llama3). It compares the response against the provided context and returns grounded, partially_grounded, or hallucinated.

💰 AI FinOps — Control what you spend

cost:
  daily_budget: 10.00         # Hard daily ceiling per agent
  per_session_limit: 0.50     # Per-user session limit
  auto_downgrade:
    enabled: true
    threshold: 0.80           # At 80% daily budget, switch model
    target_model: gpt-3.5-turbo
  loop_detection:
    max_retries: 3
    similarity_threshold: 0.90  # Block repeated near-identical prompts
    action: block_and_alert

# Get a live cost report for any agent
report = guardian.get_cost_report(agent_id="support-bot")
# {
#   "today": "$3.42", "budget": "$10.00", "utilization": "34.2%",
#   "auto_downgrades": 5, "blocked_loops": 1,
#   "model_breakdown": {"gpt-4": "$2.99", "gpt-3.5-turbo": "$0.43"}
# }

🔧 Tool Governance — Control what agents can do

tools:
  allowed: [search_kb, get_order_status, create_ticket]
  denied:  [delete_user, execute_sql, send_bulk_email]
  rate_limits:
    create_ticket:
      max_calls: 3
      per: session
  argument_validation:
    create_ticket:
      priority:
        type: enum
        values: [low, medium, high]   # agent can't self-assign "critical"
      customer_id:
        type: string
        pattern: "^CUST-[0-9]{6}$"   # must match your ID format

📜 Policy Engine — Declarative YAML, no code changes needed

One file controls everything. Change the policy file. No deployment needed. Supports per-agent scoping, environment-aware rules (dev/staging/prod), and hot reload.

LangChain Integration

from langchain_openai import ChatOpenAI
from guardian.integrations.langchain import GuardianCallbackHandler

llm = ChatOpenAI(model="gpt-4")

# Drop Guardian in as a callback — zero changes to your chain
handler = GuardianCallbackHandler.from_policy("guardian_policy.yaml")

response = llm.invoke(
    "Summarize the refund policy",
    config={"callbacks": [handler]}
)
# Every LLM call in your chain is now governed.
# Raises GuardianBlockedError if PII or jailbreak detected.

Guardian Analysis Sheet — Full Example

Every guardian.complete() call returns this:

>>> response.guardian_analysis
GuardianAnalysisSheet(
    input=InputAnalysis(
        tokens=312,
        estimated_cost=0.0187,
        pii_found=[],                   # empty = clean
        secrets_found=[],               # empty = clean
        jailbreak_detected=False,
        scope_check="in_scope",
        optimization_hint=None,
        status="ALLOWED"
    ),
    output=OutputAnalysis(
        tokens=89,
        actual_cost=0.0053,
        hallucination_risk="grounded",
        pii_found=[],
        profanity_found=False,
        status="CLEAN"
    ),
    session_budget={
        "spent": 0.024,
        "limit": 0.50,
        "remaining": 0.476,
        "utilization_pct": 4.8
    },
    agent_id="support-bot",
    timestamp=datetime(2026, 6, 1, 10, 30, 0)
)

CLI Reference

guardian init --key gdn_live_YOUR_KEY   # Store license key locally
guardian status                          # Show plan, usage, expiry
guardian validate policy.yaml           # Check YAML for errors
guardian logs                           # View local violation logs
guardian logs --tail 20 --severity high # Filter logs

Pricing

Plan	Price	Checks/month	License	Offline
Free	$0	500	Key required	❌
Starter	$10/mo	10,000	Key unlocks limit	❌
Pro	$30/mo	100,000	Key unlocks limit	❌
Enterprise	Custom	Unlimited	Dedicated key	✅

How limits work:

Counted locally in ~/.guardian/usage.json — no cloud required to enforce
Soft warning at 80% (printed in analysis sheet)
Soft block at 100% (returns upgrade prompt, never crashes your app)
Grace period: 7 days after key expiry before hard block — you're never left stranded mid-deployment
Enterprise: fully offline keys, no daily ping, air-gapped deployments supported

Annual plans: 2 months free. Overage: $0.005 per 1,000 extra checks.

Comparison

Feature	Guardian	Langfuse	LangSmith	Helicone	Guardrails AI
Prompts stay local	✅	❌	❌	❌	✅
Runtime blocking	✅	❌	❌	❌	✅
PII detection — input	✅	❌	❌	❌	✅
PII detection — output	✅	❌	❌	❌	✅
Secret/API key detection	✅	❌	❌	❌	❌
India DPDP Act (Aadhaar/PAN/UPI)	✅	❌	❌	❌	❌
Jailbreak detection	✅	❌	❌	❌	✅
Hallucination detection	✅	❌	❌	❌	✅
Token budgets	✅	❌	❌	✅	❌
Auto model routing	✅	❌	❌	❌	❌
Loop detection	✅	❌	❌	❌	❌
Tool governance	✅	❌	❌	❌	❌
YAML policy engine	✅	❌	❌	❌	✅
Per-agent scoping	✅	❌	❌	❌	❌
Open source	✅	✅	❌	❌	✅
Self-hosted	✅	✅	❌	❌	✅
Traces & observability	Basic	✅	✅	✅	❌

Global Compliance

Guardian enforces compliance rules without any data leaving your machine:

Regulation	Jurisdiction	Guardian Support
India DPDP Act 2023	India	Native Aadhaar, PAN, UPI detection; consent tracking
GDPR	EU	PII blocking, data minimization, audit-ready local logs
HIPAA	US Healthcare	PHI detection, access logging, zero cloud upload
CCPA	California	PII detection, opt-out support
EU AI Act	EU	Risk logging, transparency records

Indian teams: Guardian was built in Pune, India. DPDP Act support is not bolted on — it's a first-class feature.

SDK Folder Structure

guardian/
├── core/
│   ├── engine.py          # Main orchestration (the guardian.complete() flow)
│   ├── policy.py          # YAML policy loader + Pydantic schema validation
│   ├── analysis.py        # GuardianAnalysisSheet builder
│   ├── license.py         # License key validator + daily sync (fail-open)
│   └── storage.py         # ~/.guardian/ local file manager
│
├── guards/
│   ├── input_guard.py     # Orchestrates all input checks
│   ├── output_guard.py    # Orchestrates all output checks
│   └── validators/
│       ├── pii.py         # PII regex detection + secret/credential scanning
│       ├── jailbreak.py   # 50+ pattern jailbreak detector
│       ├── hallucination.py  # LLM-as-judge checker
│       └── profanity.py   # Keyword-based content filter
│
├── finops/
│   ├── token_counter.py   # tiktoken wrapper
│   ├── cost_calculator.py # Model pricing tables (all major providers)
│   ├── budget_manager.py  # Per-agent budget enforcement
│   ├── loop_detector.py   # Recursive loop detection (difflib)
│   └── router.py          # Auto model downgrade logic
│
├── tools/
│   ├── tool_governor.py   # Allowlist/denylist + rate limiting
│   └── arg_validator.py   # Tool argument validation (regex/enum/type)
│
├── logging/
│   ├── logger.py          # Dispatches to configured sinks
│   └── sinks/
│       ├── jsonl.py       # ~/.guardian/logs/YYYY-MM-DD.jsonl
│       └── console.py     # Terminal pretty-print (ANSI colors)
│
├── integrations/
│   ├── openai_wrapper.py  # Wraps openai.chat.completions.create()
│   └── langchain.py       # LangChain BaseCallbackHandler
│
└── cli/
    ├── main.py            # CLI entry point (click group)
    ├── init.py            # `guardian init --key`
    ├── validate.py        # `guardian validate policy.yaml`
    ├── status.py          # `guardian status`
    └── logs.py            # `guardian logs`

Roadmap

Version	Date	What ships
v0.1.0	June 2026	PII detection (India DPDP + Global), Secret/API key detection, Jailbreak detection, Hallucination check, Cost tracking, YAML policies, OpenAI + LangChain integration, local JSONL logging
v0.2.0	August 2026	Tool governance, Auto model routing, Loop detection, CrewAI integration, Slack alerts
v0.3.0	October 2026	Developer portal (Next.js), License key server, Razorpay billing
v0.4.0	December 2026	Gemini + Autogen, Kubernetes operator, Enterprise offline keys

Why Open Source

Guardian Runtime is Apache-2.0 licensed because:

A governance tool must be auditable. If Guardian controls what your AI can do, you need to see the code. Black-box governance is a contradiction.
Compliance requires proof. Regulators need to verify that controls exist and work. Open source is the only acceptable proof.
PII patterns and jailbreak techniques evolve fast. A community catches new attacks faster than any single team.
No vendor lock-in. Works with OpenAI, Anthropic, Gemini, LangChain, CrewAI. Switch providers without touching your governance layer.

Contributing

git clone https://github.com/guardian-ai/guardian-runtime.git
cd guardian-runtime
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v       # run tests
ruff check guardian/   # lint

Good first contributions:

Area	Examples
PII patterns	Add new country-specific ID formats
Jailbreak corpus	Add newly discovered jailbreak patterns
Cost tables	Update model pricing as providers change
Integrations	Add Autogen, Haystack, LlamaIndex support
Documentation	Improve examples, fix typos

Contribution rules: Every validator must be testable in isolation (no LLM calls for unit tests). Every policy feature must be YAML-configurable. Every violation must be logged with full context.

Security

Found a vulnerability? Email security@guardian-ai.dev. Do not open a public GitHub issue for security disclosures.

See SECURITY.md for the full responsible disclosure policy.

Built with conviction in Pune, India 🇮🇳
guardian-ai.dev · GitHub · Discord · Twitter

_{Apache-2.0 · If Guardian prevents one PII leak, one jailbreak, or one $500 API bill — it was worth building.}

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

1.1.3

Jun 6, 2026

1.1.2

Jun 6, 2026

1.1.1

Jun 6, 2026

1.1.0

Jun 6, 2026

1.0.11

Jun 5, 2026

1.0.10

Jun 5, 2026

1.0.9

Jun 5, 2026

1.0.8

Jun 5, 2026

1.0.7

Jun 5, 2026

1.0.6

Jun 5, 2026

1.0.5

Jun 5, 2026

1.0.3

Jun 4, 2026

1.0.2

Jun 4, 2026

1.0.1

Jun 4, 2026

1.0.0

Jun 4, 2026

0.2.4

Jun 4, 2026

0.2.3

Jun 4, 2026

0.2.2

Jun 4, 2026

0.2.1

Jun 2, 2026

0.2.0

Jun 2, 2026

0.1.1

Jun 1, 2026

This version

0.1.0

Jun 1, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

guardian_runtime-0.1.0.tar.gz (63.6 kB view details)

Uploaded Jun 1, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

guardian_runtime-0.1.0-py3-none-any.whl (24.4 kB view details)

Uploaded Jun 1, 2026 Python 3

File details

Details for the file guardian_runtime-0.1.0.tar.gz.

File metadata

Download URL: guardian_runtime-0.1.0.tar.gz
Upload date: Jun 1, 2026
Size: 63.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for guardian_runtime-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`571d236fe8ff3cc42faecb5795ce6ac79ec2374e6e43385cc7045e775d3e7c46`
MD5	`60d4f49a88359f9c611b4035e67b4d61`
BLAKE2b-256	`4cf1ec1e7f4a91a9e64059735e3019e8cd3cecd513fbd30294cf3969a935fca9`

See more details on using hashes here.

File details

Details for the file guardian_runtime-0.1.0-py3-none-any.whl.

File metadata

Download URL: guardian_runtime-0.1.0-py3-none-any.whl
Upload date: Jun 1, 2026
Size: 24.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for guardian_runtime-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e8a353a07411e317573b4d90af60f129c6cfbe1ac6f0b12857a81d70c644e28a`
MD5	`a49325b117eba9be36d4c6d62a077861`
BLAKE2b-256	`994fec674b5926e65603e240fbc9d1c945e5c1546a96d29df585b82a9bd59b32`

See more details on using hashes here.

guardian-runtime 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Guardian Runtime

The Problem

Local-First by Design

How the License Key Works

Quickstart

1. Install & Initialize

2. Create a Policy File

3. Three Lines to Govern Any LLM Call

Core Features

🛡️ Input Guard — Block before the model sees it

🔍 Output Guard — Catch bad responses before users see them

💰 AI FinOps — Control what you spend

🔧 Tool Governance — Control what agents can do

📜 Policy Engine — Declarative YAML, no code changes needed

LangChain Integration

Guardian Analysis Sheet — Full Example

CLI Reference

Pricing

Comparison

Global Compliance

SDK Folder Structure

Roadmap

Why Open Source

Contributing

Security

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes