
Local-first runtime governance layer for AI systems

Guardian Runtime

A Zero-Latency FinOps & Security Firewall for AI Applications.
Intercept every prompt and response locally. Stop data leaks and runaway token costs.



🛑 The Problem: Data Privacy Black Boxes & Runaway Costs

Developers are building incredible AI applications, but they are often blindly passing raw user data to external APIs. If a user pastes a credit card number into a chat, or a developer accidentally leaves an AWS key in an agent prompt, that data is transmitted immediately and may be logged by the cloud provider. Malicious users can also inject jailbreak prompts to hijack your AI.

Worse yet, unrestricted employee access to AI tools is causing massive budgeting crises. Companies are blowing through their annual LLM token budgets in mere months because developers send massive, unoptimized context windows to expensive models, or let requests run in infinite loops, without oversight.

🟢 The Solution: A Zero-Latency FinOps Firewall

Guardian Runtime acts as an invisible shield sitting directly on your own infrastructure. Before a single byte of data leaves your network to reach OpenAI or Anthropic, Guardian scans, cleans, and optimizes it locally.

  • Data Security: It uses lightning-fast pattern matching to block PII, secrets, and jailbreaks in milliseconds.
  • Cost Control: The built-in Token Optimizer actively strips redundant whitespace and bloat from prompts.
  • FinOps Limits: Strict FinOps rules instantly block requests that exceed your maximum token budgets, stopping runaway spend in its tracks.

Everything happens locally on your CPU. Scanning incurs no API fees and takes less than 5 milliseconds.
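
To make the pattern-matching idea concrete, here is a minimal sketch of a local regex scan. It is not Guardian's actual rule set: the `PATTERNS` table and the `scan_prompt` helper are hypothetical names chosen for illustration, and the three patterns are deliberately simple.

```python
import re

# Illustrative patterns only; a real rule set would be broader and stricter.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_prompt(text: str) -> list[tuple[str, str]]:
    """Return (rule_name, matched_text) pairs for every pattern hit."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits


if __name__ == "__main__":
    # A prompt containing the documentation-style AWS key used later in this README.
    print(scan_prompt("My AWS Key is AKIAIOSFODNN7EXAMPLE"))
```

Because the patterns are compiled once and matched in-process, a scan like this needs no network call, which is what keeps the check local and fast.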


⚡ Key Features

  1. 🔒 PII & Data Leak Prevention: Detects and blocks Aadhaar, PAN, SSNs, credit cards, emails, and phone numbers using local regex and NLP.
  2. 🔑 Secret & Credential Scanning: Catches hardcoded API keys (AWS, OpenAI, GitHub) before they ever leave your machine.
  3. 💰 Token Optimizer & FinOps: Compresses redundant whitespace and enforces maximum token budgets per request.
  4. 🏴‍☠️ Jailbreak Defense: Defends against 50+ known adversarial prompt patterns (e.g., "Ignore previous instructions", DAN payloads).
  5. 🌐 Universal Proxy: Works seamlessly with LangChain, Cursor IDE, Anthropic Claude Code, and any OpenAI-compatible client.
  6. 📊 Local Dashboard & Audit Logs: Tracks every intercepted threat and token cost locally in ~/.guardian_runtime/logs/ with a beautiful built-in web dashboard.
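
As a rough illustration of the jailbreak defense, the sketch below matches a prompt against a short list of adversarial phrases. The two patterns and the `looks_like_jailbreak` helper are hypothetical; the real list of 50+ patterns is not shown in this README.

```python
import re

# Hypothetical sample patterns; the real rule set is not shown in this README.
JAILBREAK_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"\byou\s+are\s+now\s+DAN\b", re.IGNORECASE),
]


def looks_like_jailbreak(prompt: str) -> bool:
    """Return True if any known adversarial phrase appears in the prompt."""
    return any(p.search(prompt) for p in JAILBREAK_PATTERNS)
```

Phrase matching like this is cheap but easy to evade with paraphrasing, so it is best read as a first filter rather than a complete defense.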

🏗 Architecture

       👤 USER INPUT / APP LOGIC
                 │
                 ▼
 ┌──────────────────────────────────────┐
 │ 🛡 GUARDIAN RUNTIME (Local Proxy)     │
 │                                      │
 │  1. Input Guard (PII/Secrets)        │ ──(Blocks Threats)
 │  2. Token Optimizer                  │ ──(Reduces Cost)
 │  3. FinOps Limits                    │ ──(Enforces Budgets)
 └───────────────┬──────────────────────┘
                 │ (Cleaned & Optimized)
                 ▼
      ☁️ LLM API (OpenAI/Anthropic)
                 │
                 ▼
 ┌──────────────────────────────────────┐
 │ 🛡 GUARDIAN RUNTIME (Local Proxy)     │
 │                                      │
 │  1. Output Guard (Auditor)           │ ──(Flags Leaks/PII)
 └───────────────┬──────────────────────┘
                 │ (Safe Response)
                 ▼
           💻 USER SCREEN
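
The flow in the diagram can be sketched as a single function. This is an illustration of the ordering only, not Guardian's internals: `Result`, `run_pipeline`, the `call_llm` callback, and the character-based budget are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

SECRET = re.compile(r"\bAKIA[0-9A-Z]{16}\b")  # illustrative AWS key ID pattern


@dataclass
class Result:
    content: str = ""
    blocked: bool = False
    violations: list[str] = field(default_factory=list)


def run_pipeline(prompt: str, call_llm: Callable[[str], str], max_chars: int = 2000) -> Result:
    # 1. Input guard: block before anything leaves the machine.
    if SECRET.search(prompt):
        return Result(blocked=True, violations=["input: secret detected"])
    # 2. Optimizer: collapse runs of whitespace.
    cleaned = " ".join(prompt.split())
    # 3. Budget limit (character count stands in for tokens here).
    if len(cleaned) > max_chars:
        return Result(blocked=True, violations=["input: budget exceeded"])
    # 4. Upstream call, then 5. output guard, which flags but does not drop.
    reply = call_llm(cleaned)
    flags = ["output: secret detected"] if SECRET.search(reply) else []
    return Result(content=reply, violations=flags)
```

Note the asymmetry: the input side returns early without calling the model, while the output side always returns the reply and only attaches flags.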

🚀 Quickstart

Installation

pip install guardian-runtime
guardian_runtime init

Integration Methods

Guardian can be used as a drop-in Python SDK or as a Local HTTP Proxy for tools you can't edit.

Case 1: Custom Python Application (SDK)

Replace your direct LLM calls with the GuardianRuntime wrapper.

import os
from guardian_runtime import GuardianRuntime, GuardianRuntimeBlockedError

# Placeholder key for illustration; in practice, set OPENAI_API_KEY in your shell
os.environ["OPENAI_API_KEY"] = "sk-proj-..."

# Loads FinOps and Security rules from policy.yaml
gr = GuardianRuntime.from_policy("policy.yaml")

try:
    response = gr.complete(
        messages=[{"role": "user", "content": "My AWS Key is AKIAIOSFODNN7EXAMPLE"}],
        raise_on_block=True
    )
    print(response.content)
except GuardianRuntimeBlockedError as e:
    print(f"Blocked Locally: {e.response.violations[0].detail}")

Case 2: Claude Code & CLI Assistants

For CLI tools like Anthropic's Claude Code, start the proxy and override the base URL.

# 1. Start the proxy in a background terminal
guardian_runtime proxy --port 8080

# 2. Tell Claude to route traffic through Guardian
export ANTHROPIC_BASE_URL=http://localhost:8080
claude

Case 3: Cursor IDE

Prevent accidental leaks of proprietary company secrets when using Cursor's AI Chat and Composer.

  1. Start the proxy: guardian_runtime proxy --port 8080
  2. Open Cursor Settings (Cmd+,)
  3. Go to Models > Override Base URL
  4. Set it to: http://localhost:8080

Case 4: Agentic Frameworks (LangChain / AutoGen)

Building autonomous agents? Guardian acts as a security middleware for any standard LLM client.

from langchain_openai import ChatOpenAI

# Point LangChain to the Guardian Proxy
llm = ChatOpenAI(
    model="gpt-4o",
    base_url="http://localhost:8080"
)
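
For clients without a base URL setting, the same routing can be done by hand. The sketch below builds, but does not send, an OpenAI-style chat request with the standard library. It assumes the proxy serves the `/chat/completions` route that OpenAI clients append to their base URL; `build_chat_request` and the `sk-test` key are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",  # assumed proxy route
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
        method="POST",
    )


req = build_chat_request("http://localhost:8080", "sk-test", "gpt-4o", "Hello")
# urllib.request.urlopen(req) would send it through the local proxy.
```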

Case 5: Document Analysis (RAG)

Heavy PDFs contain massive amounts of formatting bloat. Use the Document Converter to clean and compress them before the LLM sees them.

from guardian_runtime import convert_document

doc = convert_document("financial_report.pdf")
print(doc.token_count) # See exactly how much context it uses
print(doc.content)     # Feed pure Markdown to your RAG

⚙️ Configuration (policy.yaml)

Define your security thresholds and budget rules without touching your code.

version: "1.0"
name: "production"
interactive_mode: off

agents:
  default:
    llm:
      provider: openai
      default_model: gpt-4o

    input_guard:
      pii_detection: true
      jailbreak_detection: true
      pii_action: block 

    optimizer:
      enabled: true
      whitespace_normalization: true
      
    cost:
      max_input_tokens: 50000   # Instantly blocks massive context windows
      max_output_tokens: 4000
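
To show how the optimizer and cost settings could interact, here is a minimal sketch that normalizes whitespace and then checks an input budget. It is not Guardian's implementation: the `POLICY` dict simply mirrors two sections of the policy above, the helper names are hypothetical, and the 4-characters-per-token estimate is a rough assumption rather than a real tokenizer.

```python
import re

# Mirrors the optimizer and cost sections of the policy above as a plain dict.
POLICY = {
    "optimizer": {"enabled": True, "whitespace_normalization": True},
    "cost": {"max_input_tokens": 50000, "max_output_tokens": 4000},
}


def normalize_whitespace(text: str) -> str:
    """Collapse spaces/tabs and squeeze 3+ newlines down to one blank line."""
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return (len(text) + 3) // 4


def check_input(text: str, policy: dict = POLICY) -> tuple[bool, str, int]:
    """Return (allowed, possibly-normalized text, estimated tokens)."""
    opt = policy["optimizer"]
    if opt["enabled"] and opt["whitespace_normalization"]:
        text = normalize_whitespace(text)
    tokens = estimate_tokens(text)
    return tokens <= policy["cost"]["max_input_tokens"], text, tokens
```

Running the optimizer before the budget check means a prompt is measured after cleanup, so whitespace bloat alone does not trigger a block.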

🔍 Output Auditing (Non-Blocking)

By default, the Input Guard acts as a strict firewall, blocking requests containing secrets or PII before they cost you money.

The Output Guard, however, acts as an Auditor. If an LLM accidentally hallucinates an internal API key or PII in its response, Guardian will not drop the response. Instead, it passes the message back to your application but attaches a list of violations to the response object. This allows your application to handle the mistake gracefully on the frontend.
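
One way an application might act on those attached violations is sketched below. The `Response` and `Violation` classes are stand-ins built from the attribute names used in the SDK example (`content`, `violations`, `detail`); the real types may differ, and withholding the reply is just one possible policy.

```python
from dataclasses import dataclass, field


@dataclass
class Violation:  # stand-in; only `detail` appears in the SDK example
    detail: str


@dataclass
class Response:  # stand-in for the object returned by gr.complete(...)
    content: str
    violations: list[Violation] = field(default_factory=list)


def render_for_user(response: Response) -> str:
    """Show the reply as-is when clean; otherwise withhold it and explain why."""
    if not response.violations:
        return response.content
    reasons = "; ".join(v.detail for v in response.violations)
    return f"[response withheld: {reasons}]"
```

An application could just as well redact the offending span or log the event and show the reply unchanged; the point is that the decision stays with the caller.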


📈 CLI Tools & Dashboard

Guardian ships with built-in tools for local observability. All logs are stored strictly on your local machine in ~/.guardian_runtime/logs/.

# View live intercepted traffic
guardian_runtime logs --tail 20

# Check total session cost
guardian_runtime status

# Launch the local FinOps & Security dashboard
guardian_runtime dashboard

📜 License

Released under the MIT License: free to use, modify, and distribute. Zero tracking, zero cloud dependencies.
