
Local-first runtime governance layer for AI systems

Guardian Runtime

A Zero-Latency FinOps & Security Firewall for AI Applications.
Intercept every prompt and response locally. Stop data leaks and runaway token costs.

🌐 Website & Docs: https://ashp15205.github.io/guardian-runtime/
📦 Available on PyPI: https://pypi.org/project/guardian-runtime/



🛑 The Problem: Developers are Flying Blind

  1. The Cost Risk: CLI coding agents (Claude Code, Cursor, Aider) run autonomously. If they get stuck in an infinite retry loop or parse a massive log file, you wake up to a $50 API bill. You have zero visibility into session costs until the bill arrives.
  2. The Security Risk: Coding agents have full access to your workspace. If you accidentally leave an AWS_SECRET_KEY or .env credential in a file, the agent will silently upload it to a third-party LLM provider.

🟢 The Solution: A Developer-First Local Firewall

  • ML-Powered PII & Secret Scanning: Uses Microsoft Presidio for NLP-based scanning (emails, phones, SSNs) and regex fallbacks for secrets (AWS keys, OpenAI keys). Runs 100% locally, so scanning adds no network round trip.
  • Jailbreak Detection: Pre-emptively blocks DAN prompts and instruction-override injections.
  • High-Concurrency Threadpool Proxy: The local proxy seamlessly handles hundreds of simultaneous requests with zero event-loop blocking, making it perfect for multi-agent terminal systems.
  • Graceful Upstream Error Handling: Mid-stream LLM API outages are caught and reported cleanly, so your terminal bots keep running instead of crashing on 500 errors.
  • Session Analytics & Hard Budgets: Automatically tracks tokens and costs per session via the CLI. It sets a hard $10/day default limit so infinite loops never drain your credit card.
  • Local Secret Scanning: Instantly intercepts and blocks API keys, AWS credentials, and .env secrets from ever leaving your local machine.
  • Zero Config: No complex policies required. It protects your budget and secrets out of the box.
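
To make the regex fallback for secrets concrete, here is a minimal sketch of pattern-based scanning. The two patterns and the `find_secrets` helper are illustrative assumptions, not Guardian's actual rule set; real scanners usually carry many more patterns plus entropy checks.

```python
import re

# Illustrative patterns only (assumptions, not Guardian's real rules).
SECRET_PATTERNS = {
    # AWS access key IDs: a 4-letter prefix followed by 16 upper-case letters or digits.
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # OpenAI-style keys: "sk-" followed by a long token.
    "openai_api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
}


def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (kind, redacted_match) pairs for every pattern hit in text."""
    hits = []
    for kind, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group(0)
            # Keep only a short prefix so the report never stores the full secret.
            hits.append((kind, value[:7] + "..."))
    return hits


if __name__ == "__main__":
    print(find_secrets("My AWS Key is AKIAIOSFODNN7EXAMPLE"))
```

Redacting the match before it is logged keeps the scanner from becoming a second place where the secret is stored.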

⚡ Key Features

  1. 💰 Custom Hard Budgets: Configure a strict daily budget so runaway agents can't drain your API credits.
  2. 🔑 Secret & Credential Firewall: Catches hardcoded API keys (AWS, Stripe, OpenAI, GitHub) before they leave your laptop.
  3. 📉 Token Optimizer: Compresses redundant whitespace and reduces prompt bloat to passively save you money.
  4. 🌐 Universal Local Proxy: Works seamlessly with CLI agents like Anthropic Claude Code and Aider.
  5. 🏴‍☠️ Unsafe Command Defense: Stops adversarial prompts from hijacking your agent to run malicious CLI commands.
  6. 📊 Built-in Local Dashboard: Tracks every intercepted threat and every cent spent locally in ~/.guardian_runtime/logs/ with a beautiful offline dashboard.
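
As a rough illustration of what whitespace compression in a token optimizer can look like, the sketch below collapses redundant spaces and blank lines. `normalize_whitespace` is a hypothetical helper written for this example; Guardian's own optimizer may apply different or additional rules.

```python
import re


def normalize_whitespace(prompt: str) -> str:
    """Collapse redundant whitespace without touching the words themselves."""
    # Strip trailing spaces and tabs at the end of every line.
    text = re.sub(r"[ \t]+\n", "\n", prompt)
    # Collapse runs of two or more spaces/tabs into a single space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    # Collapse three or more newlines into one blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    raw = "Summarize   this  \n\n\n\nlog\t\tfile "
    print(repr(normalize_whitespace(raw)))
```

Note that this naive version also flattens indentation, so a production optimizer would need to leave code blocks and other whitespace-sensitive content alone.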

🏗 Architecture

       👤 USER INPUT / APP LOGIC
                 │
                 ▼
 ┌──────────────────────────────────────┐
 │   GUARDIAN RUNTIME (Local Proxy)     │
 │                                      │
 │  1. Input Guard (Secret Scanner)     │ ──(Blocks Threats)
 │  2. Token Optimizer                  │ ──(Reduces Cost)
 │  3. FinOps Limits                    │ ──(Enforces Budgets)
 └───────────────┬──────────────────────┘
                 │ (Cleaned & Optimized)
                 ▼
      ☁️ LLM API (OpenAI/Anthropic)
                 │
                 ▼
 ┌──────────────────────────────────────┐
 │   GUARDIAN RUNTIME (Local Proxy)     │
 │                                      │
 │  1. Output Guard (Auditor)           │ ──(Flags Secrets)
 └───────────────┬──────────────────────┘
                 │ (Safe Response)
                 ▼
           💻 USER SCREEN
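
The flow in the diagram can be sketched as a small pipeline. Everything here (`PipelineResult`, `run_pipeline`, and the stage callables) is a hypothetical stand-in used to show the ordering of the stages, not Guardian's internal code.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineResult:
    content: str = ""
    blocked: bool = False
    violations: list[str] = field(default_factory=list)


def run_pipeline(
    prompt: str,
    input_guard: Callable[[str], list[str]],
    optimize: Callable[[str], str],
    within_budget: Callable[[str], bool],
    call_llm: Callable[[str], str],
    output_guard: Callable[[str], list[str]],
) -> PipelineResult:
    # 1. Input Guard: any finding stops the request before it leaves the machine.
    findings = input_guard(prompt)
    if findings:
        return PipelineResult(blocked=True, violations=findings)
    # 2. Token Optimizer: shrink the prompt before it is costed and sent.
    prompt = optimize(prompt)
    # 3. FinOps Limits: refuse the call if it would break the budget.
    if not within_budget(prompt):
        return PipelineResult(blocked=True, violations=["BUDGET_EXCEEDED"])
    # Upstream call, then the Output Guard audits the reply without blocking it.
    reply = call_llm(prompt)
    return PipelineResult(content=reply, violations=output_guard(reply))
```

The asymmetry is the point of the design: the input side returns early on a finding, while the output side always returns the reply and only annotates it.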

🚀 Quickstart

Installation

# Core framework only
pip install guardian-runtime

# Or install with specific LLM providers:
pip install "guardian-runtime[openai]"
pip install "guardian-runtime[anthropic]"
pip install "guardian-runtime[google]"

# Or install everything (Providers, PII ML Scanner, Doc Converter):
pip install "guardian-runtime[all]"

Done. No signup, no Guardian account or license key, and no configuration required. You still supply your own LLM provider key.

Integration Methods

Guardian can be used as a drop-in Python SDK or as a Local HTTP Proxy for tools you can't edit.

Case 1: Custom Python Application (SDK)

Replace your direct LLM calls with the GuardianRuntime wrapper. Works instantly with zero configuration.

import os
from guardian_runtime import GuardianRuntime, GuardianRuntimeBlockedError

os.environ["OPENAI_API_KEY"] = "sk-proj-..."

# Zero-config initialization
gr = GuardianRuntime()

try:
    response = gr.complete(
        messages=[{"role": "user", "content": "My AWS Key is AKIAIOSFODNN7EXAMPLE"}],
        raise_on_block=True
    )
    print(response.content)
except GuardianRuntimeBlockedError as e:
    print(f"Blocked Locally: {e.response.violations[0].detail}")

Case 2: Claude Code & CLI Assistants

For CLI tools like Anthropic's Claude Code, start the proxy and override the base URL.

# 1. Start the proxy in a background terminal
guardian_runtime proxy --port 8080

# 2. Tell Claude to route traffic through Guardian
export ANTHROPIC_BASE_URL=http://localhost:8080
claude

Case 3: Cursor IDE (Coming Soon)

Full support for Cursor's AI Chat and Composer is still in progress. Until then, the following steps are experimental:

  1. Start the proxy: guardian_runtime proxy --port 8080
  2. Open Cursor Settings (Cmd+,)
  3. Go to Models > Override Base URL
  4. Set it to: http://localhost:8080 (Note: this may behave unstably in the current version)

Case 4: Agentic Frameworks (LangChain / AutoGen)

Building autonomous agents? Guardian acts as a security middleware for any standard LLM client.

from langchain_openai import ChatOpenAI

# Point LangChain to the Guardian Proxy
llm = ChatOpenAI(
    model="gpt-4o",
    base_url="http://localhost:8080"
)

Case 5: Document Analysis (RAG)

Heavy PDFs contain massive amounts of formatting bloat. Use the Document Converter to clean and compress them before the LLM sees them.

from guardian_runtime import convert_document

doc = convert_document("financial_report.pdf")
print(doc.token_count) # See exactly how much context it uses
print(doc.content)     # Feed pure Markdown to your RAG

🛑 What happens when Guardian blocks a request?

When Guardian detects a secret or a budget violation, it halts the request immediately.

Where will I see the block?

  • If using the Proxy: You will see the block in the terminal running guardian_runtime proxy, AND inside the UI of the tool you are using (e.g., Claude Code or Aider).
  • If using the Python SDK: It surfaces instantly in your standard Python server logs or terminal.

How is it blocked?

  • Proxy Mode: Guardian returns a graceful HTTP 400/403 error. This ensures CLI agents display a clean error message in their chat interface instead of crashing or freezing your session.
  • SDK Mode: Guardian raises a GuardianRuntimeBlockedError exception that can be cleanly caught in a standard try/except block.

What will I see?

You will see a transparent, actionable error message. No obscure stack traces.

  • Example (Budget): BadRequestError: 🚨 [BUDGET_EXCEEDED] Daily budget of $10.00 exceeded.
  • Example (Secret): Error: HTTP 403. 🚨 [SECRET_DETECTED] AWS key AKIAIOS... found.
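
If your own tooling talks to the proxy, it can tell a Guardian block apart from an ordinary upstream error by looking for the bracketed code shown in the examples above. The sketch below assumes the error text contains that `[CODE] detail` shape. The exact response body format is not documented here, so treat `parse_block` as a hypothetical helper.

```python
import re
from typing import Optional

# Matches the bracketed code in messages such as "[BUDGET_EXCEEDED] Daily budget ...".
BLOCK_TAG = re.compile(r"\[([A-Z_]+)\]\s*(.*)")


def parse_block(status: int, body: str) -> Optional[tuple[str, str]]:
    """Return (code, detail) if the response looks like a Guardian block, else None."""
    # Blocks are reported as HTTP 400 or 403 in proxy mode.
    if status not in (400, 403):
        return None
    match = BLOCK_TAG.search(body)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    print(parse_block(403, "[SECRET_DETECTED] AWS key AKIAIOS... found."))
```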

⚙️ Advanced Configuration (Optional)

Guardian Runtime works out of the box for independent developers. If you want to set stricter budgets or scan for custom secrets, you can create an optional policy.yaml:

guardian_runtime init

# policy.yaml
version: "1.0"
name: "production"
interactive_mode: off

agents:
  default:
    llm:
      provider: openai
      default_model: gpt-4o

    input_guard:
      scanner_enabled: true
      jailbreak_detection: true
      scanner_action: block 

    optimizer:
      enabled: true
      whitespace_normalization: true
      
    cost:
      daily_budget: 10.00       # Instantly blocks if daily spend exceeds $10.00
      max_input_tokens: 50000   # Instantly blocks massive context windows
      max_output_tokens: 4000
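
One way to read the cost section above is as a pre-flight check that runs before each request. The sketch below is an assumption about how such a check could work, not Guardian's implementation: `DailyBudget` and the `INPUT_TOO_LARGE` code are hypothetical, and only `BUDGET_EXCEEDED` appears in this README.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DailyBudget:
    limit_usd: float = 10.00        # mirrors daily_budget
    max_input_tokens: int = 50_000  # mirrors max_input_tokens
    spent_usd: float = 0.0
    day: date = date.min

    def check(self, input_tokens: int, today: date) -> Optional[str]:
        """Return a block code, or None if the request may proceed."""
        if today != self.day:
            # New day: reset the running total.
            self.day, self.spent_usd = today, 0.0
        if input_tokens > self.max_input_tokens:
            return "INPUT_TOO_LARGE"  # hypothetical code name
        if self.spent_usd > self.limit_usd:
            return "BUDGET_EXCEEDED"
        return None

    def record(self, cost_usd: float) -> None:
        """Add the cost of a completed request to today's total."""
        self.spent_usd += cost_usd
```

Passing `today` in explicitly, rather than calling `date.today()` inside `check`, keeps the day rollover easy to test.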

🔍 Output Auditing (Non-Blocking)

By default, the Input Guard acts as a strict firewall, blocking requests containing secrets before they cost you money.

The Output Guard, however, acts as an Auditor. If an LLM accidentally hallucinates an internal API key in its response, Guardian will not drop the response. Instead, it passes the message back to your application but attaches a list of violations to the response object. This allows your application to handle the mistake gracefully on the frontend.
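
A sketch of how an application might consume such an audited response follows. The `AuditedResponse` and `Violation` classes are stand-ins defined for this example. Only `content`, `violations`, and `detail` appear in this README; the `kind` and `matched_text` fields are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Violation:
    kind: str          # hypothetical field
    detail: str
    matched_text: str  # hypothetical field: the exact substring that was flagged


@dataclass
class AuditedResponse:
    content: str
    violations: list[Violation] = field(default_factory=list)


def render_for_user(response: AuditedResponse) -> str:
    """Mask every flagged substring before the text reaches the frontend."""
    text = response.content
    for violation in response.violations:
        text = text.replace(violation.matched_text, "[REDACTED]")
    return text


if __name__ == "__main__":
    resp = AuditedResponse(
        content="Use key AKIAIOSFODNN7EXAMPLE to connect.",
        violations=[Violation("secret", "AWS key found", "AKIAIOSFODNN7EXAMPLE")],
    )
    print(render_for_user(resp))
```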


📈 CLI Tools & Dashboard

Guardian ships with built-in tools for local observability. All logs are stored strictly on your local machine in ~/.guardian_runtime/logs/.

# View live intercepted traffic
guardian_runtime logs --tail 20

# View Session Analytics (Cost & Tokens per CLI tool)
guardian_runtime analytics

# Launch the full local FinOps & Security dashboard
guardian_runtime dashboard

Example Analytics Output:

  ⛨  GuardianRuntime Session Analytics (Today)
  ──────────────────────────────────────────────

  Claude Code
  Cost:       $2.3100
  Requests:   54
  Blocked:    3 (3 secret_detected)
  Tokens:     82,000

📜 License

Released under the MIT License: free to use, modify, and distribute. Zero tracking, zero cloud dependencies.
