Local-first runtime governance layer for AI systems

Guardian Runtime

Local-first runtime governance for AI systems.
Your prompts never leave your machine. Your agents never go rogue.


 ┌─────────────────────────────────────────────────────────────────────┐
 │                    YOUR LOCAL INFRASTRUCTURE                         │
 │                                                                      │
 │  ┌────────────────────────────────────────┐                          │
 │  │         YOUR AI APPLICATION            │                          │
 │  │   (LangChain / CrewAI / Raw OpenAI)    │                          │
 │  └─────────────────┬──────────────────────┘                          │
 │                    │ prompt / tool call                              │
 │                    ▼                                                 │
 │  ┌─────────────────────────────────────────────────────────────┐     │
 │  │                  ⛨  GUARDIAN RUNTIME                        │     │
 │  │                                                             │     │
 │  │  ┌─────────────┐  ┌──────────────┐  ┌──────────────────┐   │     │
 │  │  │ INPUT GUARD │  │POLICY ENGINE │  │  TOOL GOVERNOR   │   │     │
 │  │  │ • PII scan  │  │ • YAML rules │  │ • Allowlists     │   │     │
 │  │  │ • Secrets   │  │ • Per-agent  │  │ • Rate limits    │   │     │
 │  │  │ • Jailbreak │  │ • Env-aware  │  │ • Arg validation │   │     │
 │  │  │ • Scope     │  │              │  │                  │   │     │
 │  │  └──────┬──────┘  └──────────────┘  └──────────────────┘   │     │
 │  │         │                                                   │     │
 │  │  ┌──────▼──────┐  ┌──────────────┐  ┌──────────────────┐   │     │
 │  │  │ COST ENGINE │  │ OUTPUT GUARD │  │  LOCAL LOGGER    │   │     │
 │  │  │ • Budgets   │  │ • Hallucin.  │  │ • JSONL to disk  │   │     │
 │  │  │ • Routing   │  │ • PII check  │  │ • Never uploaded │   │     │
 │  │  │ • Loop det. │  │ • Profanity  │  │ ~/.guardian/logs │   │     │
 │  │  └──────┬──────┘  └──────┬───────┘  └──────────────────┘   │     │
 │  └─────────┼────────────────┼─────────────────────────────────┘     │
 │            │                │                                        │
 └────────────┼────────────────┼────────────────────────────────────────┘
              │ response / block
              ▼
 ┌─────────────────────────────────┐        ┌────────────────────────────┐
 │      LLM (YOUR API KEY)         │        │  ⛨ GUARDIAN LICENSE SERVER │
 │  OpenAI / Anthropic / Gemini    │        │  Daily ping: key + count   │
 └─────────────────────────────────┘        │  Response:  valid / limit  │
                                            │  That's it. Nothing else.  │
                                            └────────────────────────────┘

The Problem

Every team shipping AI in production faces the same silent failures:

| What Goes Wrong | How It Happens | Real Consequence |
|---|---|---|
| PII Leakage | User types their Aadhaar/SSN into a chat | DPDP Act fine up to ₹250 crore |
| Secret Exposure | Agent reads .env file or dev pastes API keys | OpenAI/AWS key compromised, must rotate immediately |
| Jailbreak | "Ignore all previous instructions…" bypasses your system prompt | Data exfiltration, brand damage |
| Hallucination | Agent cites facts that don't exist in your knowledge base | Wrong medical, legal, financial advice |
| Budget Blowout | Recursive agent loop runs GPT-4 for 3 minutes straight | $500 cloud bill, no warning |
| Tool Misuse | Agent calls delete_records() without authorization | Data loss, security incident |

Existing tools only tell you after the damage is done. Langfuse, LangSmith, and Helicone are excellent observability platforms — but they are passive. They log what happened. They do not stop what is happening.

Worse, their hosted offerings require your prompts and responses to travel to a vendor's cloud. For a bank, a hospital, or a government agency, that is a non-starter.

Guardian Runtime is the missing enforcement layer. It intercepts every call before the model sees it and every response before the user sees it — entirely on your machine.


Local-First by Design

Guardian uses a local-first, license-key model.

What stays on YOUR machine (forever):
  ✅  Every prompt you send
  ✅  Every LLM response you receive
  ✅  Every violation log entry
  ✅  Your OpenAI / Anthropic API key
  ✅  Your policy YAML
  ✅  The full Guardian Analysis Sheet

What our server receives (once per day, nothing else):
  → Your license key (to verify it's valid)
  → One integer: how many checks you ran this month
  ← Our response: valid/invalid + your plan limit

That is the complete list. No exceptions. Ever.

Why this matters for your users:

Developers and enterprise teams building AI products are worried about two things above all others:

  1. Their users' sensitive data leaving their infrastructure
  2. Their LLM API keys being exposed

Guardian solves both by architecture rather than by promise: neither your prompts nor your LLM API keys are ever sent to Guardian's servers.


How the License Key Works

# Step 1: Sign up at guardian-ai.dev — get a free license key instantly
# Step 2: Install
pip install guardian-runtime

# Step 3: Initialize (stores key locally at ~/.guardian/config.json)
guardian init --key gdn_live_YOUR_KEY_HERE

# Step 4: Check your status
guardian status
License:  gdn_live_****...****a1b2
Plan:     Starter
Checks:   342 / 10,000 this month  [█░░░░░░░░░░░░░░░░░░░]  3%
Expiry:   2027-01-15
Status:   ACTIVE ✅

How enforcement works locally:

~/.guardian/
├── config.json     ← license key, plan, limit, expiry
├── usage.json      ← { "month": "2026-06", "checks": 342, "last_sync": "..." }
└── logs/
    └── 2026-06-01.jsonl   ← every check, every violation, on your disk

  • Checks are counted locally in usage.json, no cloud required
  • Once per day, the SDK pings guardian-ai.dev/api/validate with { key, count }
  • If our server is down: SDK continues working (fail-open)
  • Enterprise: offline license keys available — zero network requests, ever
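
The local counting and fail-open sync described above could be sketched roughly as follows. This is not Guardian's actual implementation: `record_check`, `sync_license`, and the injected `send` callable are hypothetical names, and the file layout simply mirrors the `usage.json` example shown earlier.

```python
import json
from datetime import date, datetime, timezone
from pathlib import Path
from typing import Callable, Optional


def record_check(usage_path: Path, today: Optional[date] = None) -> dict:
    """Increment the local monthly counter, starting fresh when the month changes."""
    today = today or date.today()
    month = today.strftime("%Y-%m")
    usage = {"month": month, "checks": 0, "last_sync": None}
    if usage_path.exists():
        stored = json.loads(usage_path.read_text())
        if stored.get("month") == month:
            usage.update(stored)  # same month: continue the existing count
    usage["checks"] += 1
    usage_path.parent.mkdir(parents=True, exist_ok=True)
    usage_path.write_text(json.dumps(usage))
    return usage


def sync_license(usage: dict, key: str, send: Callable[[dict], dict]) -> dict:
    """Send {key, count}; if the server cannot be reached, keep working (fail-open)."""
    try:
        reply = send({"key": key, "count": usage["checks"]})
    except Exception:
        # Assumption: any transport error is treated as "server down".
        return {"valid": True, "synced": False}
    usage["last_sync"] = datetime.now(timezone.utc).isoformat()
    return {
        "valid": bool(reply.get("valid", True)),
        "synced": True,
        "limit": reply.get("limit"),
    }
```

Injecting `send` keeps the sketch free of network code; a real client would wrap an HTTPS request to the validation endpoint in that callable.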

Quickstart

1. Install & Initialize

pip install guardian-runtime
guardian init --key gdn_live_YOUR_KEY

2. Create a Policy File

# guardian_policy.yaml
version: "1.0"

agents:
  default:
    input_guard:
      pii_detection: true
      pii_entities: [aadhaar, pan, upi_id, credit_card, ssn, email, phone, secret]
      jailbreak_detection: true

    output_guard:
      hallucination_check: true
      pii_detection: true

    cost:
      daily_budget: 5.00
      per_session_limit: 0.50

3. Three Lines to Govern Any LLM Call

from guardian import Guardian

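guardian = Guardian.from_policy("guardian_policy.yaml")

user_input = "How do I update my billing address?"  # example prompt; pass your application's input here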

response = guardian.complete(
    model="gpt-4",
    messages=[{"role": "user", "content": user_input}],
    agent_id="support-bot",
)

print(response.content)
print(response.guardian_analysis)

What you get back:

┌──────────────────────────────────────────────────┐
│           GUARDIAN ANALYSIS SHEET                 │
├──────────────────────────────────────────────────┤
│ Input:                                           │
│   Tokens:    312   Est. Cost: $0.0187            │
│   PII:       None                                │
│   Secrets:   None                                │
│   Jailbreak: None                                │
│   Status:    ALLOWED ✅                          │
│                                                  │
│ Output:                                          │
│   Tokens:    89    Actual Cost: $0.0053          │
│   Hallucination: Low (grounded in context)       │
│   PII in response: None                          │
│   Status:    CLEAN ✅                            │
│                                                  │
│ Session:  $0.024 spent / $0.50 limit             │
└──────────────────────────────────────────────────┘

Core Features

🛡️ Input Guard — Block before the model sees it

# guardian_policy.yaml — input guard section
input_guard:
  pii_detection: true
  pii_entities:
    - aadhaar      # India DPDP Act — native detection
    - pan          # India DPDP Act
    - upi_id       # India DPDP Act (suffix-gated: user@ybl detected, user@gmail.com ignored)
    - credit_card  # Luhn-validated
    - ssn          # US Social Security Number
    - email
    - phone        # Indian (+91) and US formats
    - passport
    - secret       # API keys, tokens, credentials (OpenAI, AWS, GitHub, Stripe, Razorpay, etc.)
  pii_action: block         # "block" | "redact" | "flag"
  jailbreak_detection: true
  scope:
    allowed_topics: ["billing", "product", "support"]
    block_message: "I can only help with billing, product, and support questions."

Detects 50+ jailbreak patterns across 5 categories: DAN variants, instruction override, role-play injection, base64 encoding tricks, and system prompt extraction.
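
As an illustration of pattern-based jailbreak detection, here is a minimal sketch covering three of the five categories plus the base64 trick. The patterns and the `detect_jailbreak` function are assumptions made for this example, not Guardian's actual corpus, which is described as much larger.

```python
import base64
import binascii
import re

# Illustrative patterns only; a real corpus would hold many more per category.
PATTERNS = {
    "instruction_override": re.compile(
        r"ignore\s+(?:all\s+)?(?:previous|prior|above)\s+instructions", re.I
    ),
    "dan_variant": re.compile(r"\b(?:you are|act as)\s+(?:now\s+)?DAN\b", re.I),
    "system_prompt_extraction": re.compile(
        r"(?:reveal|print|show|repeat)\s+(?:me\s+)?(?:your\s+)?(?:system|initial)\s+prompt",
        re.I,
    ),
}
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def _decoded_chunks(text):
    """Yield UTF-8 text hidden inside long base64-looking runs."""
    for match in BASE64_RUN.finditer(text):
        try:
            yield base64.b64decode(match.group(), validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 text, ignore


def detect_jailbreak(text):
    """Return the categories whose pattern matches the text or a decoded chunk."""
    candidates = [text, *_decoded_chunks(text)]
    return [
        category
        for category, pattern in PATTERNS.items()
        if any(pattern.search(candidate) for candidate in candidates)
    ]
```

Decoding before matching is what lets the same patterns catch an instruction-override attempt that was base64-encoded to slip past a plain-text scan.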

Detects 12+ API key formats at HIGH confidence (OpenAI sk-, AWS AKIA, GitHub ghp_, Stripe sk_live_, Razorpay rzp_live_, Groq gsk_, etc.) and generic KEY=value patterns at MEDIUM confidence.
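
The two confidence tiers and the Luhn validation mentioned in the policy comments could look something like the sketch below. The regexes, the token lengths, and the names `scan_secrets` and `luhn_valid` are assumptions for illustration; Guardian's own patterns may differ.

```python
import re

# Prefix-anchored formats are reported at HIGH confidence (lengths are assumptions).
HIGH_CONFIDENCE = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "stripe_live_key": re.compile(r"\bsk_live_[A-Za-z0-9]{16,}\b"),
}
# Generic NAME=value assignments are reported at MEDIUM confidence.
GENERIC_ASSIGNMENT = re.compile(
    r"\b[A-Z0-9_]*(?:KEY|TOKEN|SECRET|PASSWORD)\s*=\s*\S{8,}"
)


def scan_secrets(text):
    """Return (pattern_name, confidence) pairs for every pattern that matches."""
    findings = [
        (name, "HIGH")
        for name, pattern in HIGH_CONFIDENCE.items()
        if pattern.search(text)
    ]
    if GENERIC_ASSIGNMENT.search(text):
        findings.append(("generic_assignment", "MEDIUM"))
    return findings


def luhn_valid(number):
    """Check a card-like number with the Luhn checksum, ignoring spaces and dashes."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

Running the checksum after a digit-pattern match is what keeps order numbers and other long digit strings from being flagged as credit cards.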


🔍 Output Guard — Catch bad responses before users see them

# guardian_policy.yaml — output guard section
output_guard:
  hallucination_check: true
  hallucination_provider: openai  # "openai", "anthropic", "ollama", "gemini"
  hallucination_model: gpt-4o-mini  # Bring Your Own Model (BYOM)
  pii_detection: true
  profanity_filter: true
  competitor_block: ["CompetitorA", "CompetitorB"]

Hallucination detection uses the LLM-as-judge pattern. Instead of locking you into our infrastructure, Guardian uses a "Bring Your Own Model" (BYOM) architecture via LiteLLM: you can use your existing OpenAI or Anthropic key, or point it at a free, local Ollama instance (e.g., llama3). The judge model compares the response against the provided context and returns one of three verdicts: grounded, partially_grounded, or hallucinated.
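
A minimal sketch of the LLM-as-judge flow, under stated assumptions: the prompt wording, the function names, and the choice to treat an unparseable reply as `hallucinated` are all illustrative, not Guardian's actual behavior. The judge is passed in as a plain callable so the example runs without any model or network access.

```python
from typing import Callable

VERDICTS = ("partially_grounded", "hallucinated", "grounded")  # longest match first

JUDGE_TEMPLATE = (
    "You are a strict fact checker.\n"
    "Context:\n{context}\n\n"
    "Response to check:\n{response}\n\n"
    "Answer with exactly one word: grounded, partially_grounded, or hallucinated."
)


def build_judge_prompt(context: str, response: str) -> str:
    """Fill the judge template with the retrieved context and the model response."""
    return JUDGE_TEMPLATE.format(context=context, response=response)


def parse_verdict(raw: str) -> str:
    """Map the judge's free-text reply onto one of the three labels."""
    cleaned = raw.strip().lower()
    for verdict in VERDICTS:
        if cleaned.startswith(verdict):
            return verdict
    # Assumption: an unparseable reply is treated conservatively.
    return "hallucinated"


def check_grounding(context: str, response: str, judge: Callable[[str], str]) -> str:
    """Ask the judge model (any callable taking a prompt) for a grounding verdict."""
    return parse_verdict(judge(build_judge_prompt(context, response)))
```

In practice the `judge` callable would wrap a call to whichever provider and model the policy names; swapping providers then means swapping only that one function.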


💰 AI FinOps — Control what you spend

cost:
  daily_budget: 10.00         # Hard daily ceiling per agent
  per_session_limit: 0.50     # Per-user session limit
  auto_downgrade:
    enabled: true
    threshold: 0.80           # At 80% daily budget, switch model
    target_model: gpt-3.5-turbo
  loop_detection:
    max_retries: 3
    similarity_threshold: 0.90  # Block repeated near-identical prompts
    action: block_and_alert

# Get a live cost report for any agent
report = guardian.get_cost_report(agent_id="support-bot")
# {
#   "today": "$3.42", "budget": "$10.00", "utilization": "34.2%",
#   "auto_downgrades": 5, "blocked_loops": 1,
#   "model_breakdown": {"gpt-4": "$2.99", "gpt-3.5-turbo": "$0.43"}
# }
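
The loop detection and auto-downgrade settings above could be implemented along these lines. This is a sketch under assumptions: `LoopDetector` and `choose_model` are hypothetical names, and the reading of `max_retries` (the original prompt plus three near-identical retries are allowed, the next one is blocked) is one plausible interpretation. The similarity measure is `difflib.SequenceMatcher`, matching the difflib note in the folder structure.

```python
from difflib import SequenceMatcher
from typing import Optional


class LoopDetector:
    """Blocks a run of near-identical consecutive prompts."""

    def __init__(self, max_retries: int = 3, similarity_threshold: float = 0.90):
        self.max_retries = max_retries
        self.similarity_threshold = similarity_threshold
        self._last_prompt: Optional[str] = None
        self._repeats = 0

    def should_block(self, prompt: str) -> bool:
        similar = (
            self._last_prompt is not None
            and SequenceMatcher(None, self._last_prompt, prompt).ratio()
            >= self.similarity_threshold
        )
        # Count consecutive near-duplicates; any different prompt resets the run.
        self._repeats = self._repeats + 1 if similar else 0
        self._last_prompt = prompt
        return self._repeats > self.max_retries


def choose_model(
    spent: float,
    daily_budget: float,
    requested: str,
    threshold: float = 0.80,
    target_model: str = "gpt-3.5-turbo",
) -> Optional[str]:
    """Return the model to use, or None once the daily budget is exhausted."""
    if spent >= daily_budget:
        return None
    if spent >= threshold * daily_budget:
        return target_model  # downgrade once the threshold is crossed
    return requested
```

Comparing each prompt only with its predecessor keeps the detector cheap; a sliding window over recent prompts would catch loops that alternate between two prompts, at the cost of more comparisons.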

🔧 Tool Governance — Control what agents can do

tools:
  allowed: [search_kb, get_order_status, create_ticket]
  denied:  [delete_user, execute_sql, send_bulk_email]
  rate_limits:
    create_ticket:
      max_calls: 3
      per: session
  argument_validation:
    create_ticket:
      priority:
        type: enum
        values: [low, medium, high]   # agent can't self-assign "critical"
      customer_id:
        type: string
        pattern: "^CUST-[0-9]{6}$"   # must match your ID format
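
To make the evaluation order concrete, here is a rough sketch of how a tool call might be checked against that policy: denylist first, then allowlist, then the per-session rate limit, then argument rules. `check_tool_call` and the dictionary layout are assumptions for illustration, not Guardian's API.

```python
import re
from collections import Counter
from typing import Tuple

# The YAML above, expressed as a plain dict for the purposes of this sketch.
POLICY = {
    "allowed": {"search_kb", "get_order_status", "create_ticket"},
    "denied": {"delete_user", "execute_sql", "send_bulk_email"},
    "rate_limits": {"create_ticket": 3},  # max calls per session
    "argument_validation": {
        "create_ticket": {
            "priority": {"type": "enum", "values": ["low", "medium", "high"]},
            "customer_id": {"type": "string", "pattern": r"^CUST-[0-9]{6}$"},
        }
    },
}


def check_tool_call(policy: dict, session_calls: Counter, tool: str, args: dict) -> Tuple[bool, str]:
    """Return (allowed, reason); counts the call only when it is allowed."""
    if tool in policy["denied"]:
        return False, "tool is denied"
    if tool not in policy["allowed"]:
        return False, "tool is not allowlisted"
    limit = policy["rate_limits"].get(tool)
    if limit is not None and session_calls[tool] >= limit:
        return False, "rate limit reached"
    rules = policy["argument_validation"].get(tool, {})
    for name, rule in rules.items():
        if name not in args:
            continue  # assumption: only arguments that are present get validated
        value = args[name]
        if rule["type"] == "enum" and value not in rule["values"]:
            return False, f"invalid value for {name}"
        if rule["type"] == "string" and not (
            isinstance(value, str) and re.fullmatch(rule["pattern"], value)
        ):
            return False, f"invalid format for {name}"
    session_calls[tool] += 1
    return True, "ok"
```

Checking the denylist before the allowlist means a tool listed in both by mistake is still refused.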

📜 Policy Engine — Declarative YAML, no code changes needed

One file controls everything. Edit the policy file and the change is picked up by hot reload, with no code change or redeployment. Policies support per-agent scoping and environment-aware rules (dev/staging/prod).
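
One way per-agent scoping and environment-aware rules could be resolved is a layered merge: start from the `default` agent, overlay the named agent, then overlay the active environment. The sketch below assumes that precedence, and the top-level `environments` key is hypothetical; the real schema may differ.

```python
import copy
from typing import Optional


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where nested keys in override replace those in base."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_policy(policy: dict, agent_id: str, environment: Optional[str] = None) -> dict:
    """Compute the effective settings for one agent in one environment."""
    agents = policy.get("agents", {})
    resolved = deep_merge(agents.get("default", {}), agents.get(agent_id, {}))
    if environment:
        # Hypothetical "environments" section holding per-environment overrides.
        overrides = policy.get("environments", {}).get(environment, {})
        resolved = deep_merge(resolved, overrides)
    return resolved
```

An agent with no section of its own simply inherits the `default` settings, which is consistent with the quickstart calling `agent_id="support-bot"` against a policy that defines only `default`.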


LangChain Integration

from langchain_openai import ChatOpenAI
from guardian.integrations.langchain import GuardianCallbackHandler

llm = ChatOpenAI(model="gpt-4")

# Drop Guardian in as a callback — zero changes to your chain
handler = GuardianCallbackHandler.from_policy("guardian_policy.yaml")

response = llm.invoke(
    "Summarize the refund policy",
    config={"callbacks": [handler]}
)
# Every LLM call in your chain is now governed.
# Raises GuardianBlockedError if PII or jailbreak detected.

Guardian Analysis Sheet — Full Example

Every guardian.complete() call returns a response whose guardian_analysis attribute holds a sheet like this:

>>> response.guardian_analysis
GuardianAnalysisSheet(
    input=InputAnalysis(
        tokens=312,
        estimated_cost=0.0187,
        pii_found=[],                   # empty = clean
        secrets_found=[],               # empty = clean
        jailbreak_detected=False,
        scope_check="in_scope",
        optimization_hint=None,
        status="ALLOWED"
    ),
    output=OutputAnalysis(
        tokens=89,
        actual_cost=0.0053,
        hallucination_risk="grounded",
        pii_found=[],
        profanity_found=False,
        status="CLEAN"
    ),
    session_budget={
        "spent": 0.024,
        "limit": 0.50,
        "remaining": 0.476,
        "utilization_pct": 4.8
    },
    agent_id="support-bot",
    timestamp=datetime(2026, 6, 1, 10, 30, 0)
)

CLI Reference

guardian init --key gdn_live_YOUR_KEY     # Store license key locally
guardian status                           # Show plan, usage, expiry
guardian validate policy.yaml             # Check YAML for errors
guardian logs                             # View local violation logs
guardian logs --tail 20 --severity high   # Filter logs
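
Because the logs are plain JSONL files on disk, they can also be read without the CLI. The sketch below approximates `guardian logs --tail 20 --severity high` in Python; the `severity` field name and the `tail_logs` helper are assumptions, since the log schema is not documented here.

```python
import json
from collections import deque
from pathlib import Path


def tail_logs(log_dir, tail=20, severity=None):
    """Return the last `tail` entries across all daily files, oldest first."""
    recent = deque(maxlen=tail)
    # File names like 2026-06-01.jsonl sort chronologically as plain strings.
    for log_file in sorted(Path(log_dir).glob("*.jsonl")):
        with log_file.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip partially written or corrupt lines
                # Assumption: each entry carries a "severity" field.
                if severity is None or entry.get("severity") == severity:
                    recent.append(entry)
    return list(recent)
```

A call such as `tail_logs(Path.home() / ".guardian" / "logs", severity="high")` would then return the most recent high-severity entries as dictionaries.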

Pricing

| Plan | Price | Checks/month | License | Offline |
|---|---|---|---|---|
| Free | $0 | 500 | Key required | |
| Starter | $10/mo | 10,000 | Key unlocks limit | |
| Pro | $30/mo | 100,000 | Key unlocks limit | |
| Enterprise | Custom | Unlimited | Dedicated key | |

How limits work:

  • Counted locally in ~/.guardian/usage.json — no cloud required to enforce
  • Soft warning at 80% (printed in analysis sheet)
  • Soft block at 100% (returns upgrade prompt, never crashes your app)
  • Grace period: 7 days after key expiry before hard block — you're never left stranded mid-deployment
  • Enterprise: fully offline keys, no daily ping, air-gapped deployments supported
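
The limit rules above can be read as a small state function. The sketch below is one interpretation, not Guardian's code: the state names, the `license_state` function, and the precedence (expiry beyond the grace period wins over usage limits) are assumptions.

```python
from datetime import date, timedelta
from typing import Optional


def license_state(
    checks: int,
    limit: Optional[int],
    expiry: date,
    today: date,
    grace_days: int = 7,
    warn_at: float = 0.80,
) -> str:
    """Classify the license as ok, warning, soft_block, or hard_block."""
    if today > expiry + timedelta(days=grace_days):
        return "hard_block"  # grace period after expiry is over
    if limit is None:
        return "ok"  # unlimited plan
    if checks >= limit:
        return "soft_block"  # return an upgrade prompt instead of raising
    if checks >= warn_at * limit:
        return "warning"  # surfaced in the analysis sheet
    return "ok"
```

With the Starter numbers from the status example, 342 checks against a 10,000 limit evaluates to `ok`, and the warning state starts at 8,000 checks.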

Annual plans: 2 months free. Overage: $0.005 per 1,000 extra checks.


Comparison

Feature Guardian Langfuse LangSmith Helicone Guardrails AI
Prompts stay local
Runtime blocking
PII detection — input
PII detection — output
Secret/API key detection
India DPDP Act (Aadhaar/PAN/UPI)
Jailbreak detection
Hallucination detection
Token budgets
Auto model routing
Loop detection
Tool governance
YAML policy engine
Per-agent scoping
Open source
Self-hosted
Traces & observability Basic

Global Compliance

Guardian enforces compliance rules without any data leaving your machine:

| Regulation | Jurisdiction | Guardian Support |
|---|---|---|
| India DPDP Act 2023 | India | Native Aadhaar, PAN, UPI detection; consent tracking |
| GDPR | EU | PII blocking, data minimization, audit-ready local logs |
| HIPAA | US Healthcare | PHI detection, access logging, zero cloud upload |
| CCPA | California | PII detection, opt-out support |
| EU AI Act | EU | Risk logging, transparency records |

Indian teams: Guardian was built in Pune, India. DPDP Act support is not bolted on — it's a first-class feature.


SDK Folder Structure

guardian/
├── core/
│   ├── engine.py          # Main orchestration (the guardian.complete() flow)
│   ├── policy.py          # YAML policy loader + Pydantic schema validation
│   ├── analysis.py        # GuardianAnalysisSheet builder
│   ├── license.py         # License key validator + daily sync (fail-open)
│   └── storage.py         # ~/.guardian/ local file manager
│
├── guards/
│   ├── input_guard.py     # Orchestrates all input checks
│   ├── output_guard.py    # Orchestrates all output checks
│   └── validators/
│       ├── pii.py         # PII regex detection + secret/credential scanning
│       ├── jailbreak.py   # 50+ pattern jailbreak detector
│       ├── hallucination.py  # LLM-as-judge checker
│       └── profanity.py   # Keyword-based content filter
│
├── finops/
│   ├── token_counter.py   # tiktoken wrapper
│   ├── cost_calculator.py # Model pricing tables (all major providers)
│   ├── budget_manager.py  # Per-agent budget enforcement
│   ├── loop_detector.py   # Recursive loop detection (difflib)
│   └── router.py          # Auto model downgrade logic
│
├── tools/
│   ├── tool_governor.py   # Allowlist/denylist + rate limiting
│   └── arg_validator.py   # Tool argument validation (regex/enum/type)
│
├── logging/
│   ├── logger.py          # Dispatches to configured sinks
│   └── sinks/
│       ├── jsonl.py       # ~/.guardian/logs/YYYY-MM-DD.jsonl
│       └── console.py     # Terminal pretty-print (ANSI colors)
│
├── integrations/
│   ├── openai_wrapper.py  # Wraps openai.chat.completions.create()
│   └── langchain.py       # LangChain BaseCallbackHandler
│
└── cli/
    ├── main.py            # CLI entry point (click group)
    ├── init.py            # `guardian init --key`
    ├── validate.py        # `guardian validate policy.yaml`
    ├── status.py          # `guardian status`
    └── logs.py            # `guardian logs`

Roadmap

| Version | Date | What ships |
|---|---|---|
| v0.1.0 | June 2026 | PII detection (India DPDP + Global), Secret/API key detection, Jailbreak detection, Hallucination check, Cost tracking, YAML policies, OpenAI + LangChain integration, local JSONL logging |
| v0.2.0 | August 2026 | Tool governance, Auto model routing, Loop detection, CrewAI integration, Slack alerts |
| v0.3.0 | October 2026 | Developer portal (Next.js), License key server, Razorpay billing |
| v0.4.0 | December 2026 | Gemini + Autogen, Kubernetes operator, Enterprise offline keys |

Why Open Source

Guardian Runtime is Apache-2.0 licensed because:

  1. A governance tool must be auditable. If Guardian controls what your AI can do, you need to see the code. Black-box governance is a contradiction.
  2. Compliance requires proof. Regulators need to verify that controls exist and work. Open source is the only acceptable proof.
  3. PII patterns and jailbreak techniques evolve fast. A community catches new attacks faster than any single team.
  4. No vendor lock-in. Works with OpenAI, Anthropic, Gemini, LangChain, CrewAI. Switch providers without touching your governance layer.

Contributing

git clone https://github.com/guardian-ai/guardian-runtime.git
cd guardian-runtime
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v       # run tests
ruff check guardian/   # lint

Good first contributions:

| Area | Examples |
|---|---|
| PII patterns | Add new country-specific ID formats |
| Jailbreak corpus | Add newly discovered jailbreak patterns |
| Cost tables | Update model pricing as providers change |
| Integrations | Add Autogen, Haystack, LlamaIndex support |
| Documentation | Improve examples, fix typos |

Contribution rules: Every validator must be testable in isolation (no LLM calls for unit tests). Every policy feature must be YAML-configurable. Every violation must be logged with full context.
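
As an example of a validator that is testable in isolation, the sketch below pairs two pure-regex detectors with pytest-style tests that need no LLM and no network. The function names, the test names, and the short list of UPI handles are illustrative assumptions; the suffix gating mirrors the `user@ybl` versus `user@gmail.com` behavior described in the Input Guard section.

```python
import re

# PAN format: five letters, four digits, one letter.
PAN_PATTERN = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")

# Partial, illustrative list of UPI handles; a real validator would carry more.
UPI_HANDLES = {"ybl", "okaxis", "oksbi", "okhdfcbank", "okicici", "paytm", "upi"}
UPI_PATTERN = re.compile(r"\b([A-Za-z0-9._-]{2,})@([A-Za-z]{2,})\b")


def find_pan(text):
    """Return every PAN-shaped token in the text."""
    return PAN_PATTERN.findall(text)


def find_upi_ids(text):
    """Return name@handle tokens whose handle is a known UPI suffix."""
    return [
        match.group(0)
        for match in UPI_PATTERN.finditer(text)
        if match.group(2).lower() in UPI_HANDLES
    ]


# Tests in the style pytest would collect from tests/.
def test_pan_detected():
    assert find_pan("My PAN is ABCDE1234F.") == ["ABCDE1234F"]


def test_pan_lookalike_ignored():
    assert find_pan("order ABCDE12345") == []


def test_upi_is_suffix_gated():
    assert find_upi_ids("pay user@ybl or mail user@gmail.com") == ["user@ybl"]
```

Because the validators are plain functions over strings, each new country-specific ID format can ship with its own positive and negative cases in the same style.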


Security

Found a vulnerability? Email security@guardian-ai.dev. Do not open a public GitHub issue for security disclosures.

See SECURITY.md for the full responsible disclosure policy.


Built with conviction in Pune, India 🇮🇳
guardian-ai.dev · GitHub · Discord · Twitter

Apache-2.0 · If Guardian prevents one PII leak, one jailbreak, or one $500 API bill — it was worth building.

