
A lightweight, developer-friendly firewall for LLM prompts. Prevents sensitive data from reaching AI models.


🛡️ contextfirewall

A zero-dependency, ultra-fast Python firewall for LLM prompts.

contextfirewall sits between your application and Large Language Models (LLMs) like OpenAI, Claude, Mistral, or Llama. It natively scans user prompts, detects sensitive/confidential data (PII, Secrets, Financials), and decides instantly whether the prompt should be allowed, redacted, or blocked.



📖 Table of Contents

  1. Why ContextFirewall?
  2. Installation
  3. Quick Start
  4. LLM Integrations (OpenAI & Anthropic)
  5. Core Concepts
  6. Built-in Detectors & Entropy
  7. Compliance Presets
  8. Customization

⚡ Why ContextFirewall?

In production AI apps, sending raw user input directly to an LLM provider is a massive liability.

  • Security: Users may accidentally paste environment variables, private keys, or API tokens.
  • Privacy (PII): If you build B2B/B2C SaaS, passing user emails, SSNs, or credit card numbers to a third party violates compliance requirements.
  • Zero-Dependency Engine: Shannon entropy and regex matching are implemented in pure Python, so it runs fast and fully offline, with no ML models, heavy binaries, or numpy to install.

📦 Installation

pip install contextfirewall

🚀 Quick Start

The core function is check(text, **config). It evaluates your prompt and returns a Result object.

from contextfirewall import check

text = "My email is alice@example.com and my API key is sk-123abc456def789ghijklmnop"
result = check(text)

# Inspect what action to take:
print(result.action)   
# -> "BLOCK"

# See exactly what was flagged:
print(result.found)    
# -> ["EMAIL", "STRIPE_KEY"]

# See the mathematical risk score (1-10):
print(result.score)    
# -> 9

# Get a human-readable explanation:
print(result.reason)   
# -> "High-risk data detected (EMAIL, STRIPE_KEY), score 9 — blocked by STANDARD policy"

🌐 LLM Integrations

contextfirewall acts as middleware between your application and the provider. Here is how to safely wrap your AI API calls:

1. The Middleware Approach (OpenAI)

Catch dangerous data before it even leaves your servers. We utilize the .cleaned property to pass a safe, redacted version to the LLM.

from openai import OpenAI
from contextfirewall import check, Mode

client = OpenAI()

def safe_chat(user_prompt: str) -> str:
    # 1. Inspect the prompt
    result = check(user_prompt, mode=Mode.STANDARD)
    
    # 2. Hard block malicious or extremely sensitive inputs
    if result.action == "BLOCK":
        return f"System Error: Message blocked. Reason: {result.reason}"
    
    # 3. If it requires redaction, use the cleaned prompt. Otherwise use the original.
    safe_prompt = result.cleaned if result.action == "REDACT" else user_prompt
    
    # 4. Safely request the LLM
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": safe_prompt}],
    )
    return response.choices[0].message.content

2. The Decorator Approach (Anthropic / Claude)

You can protect entire functions automatically using our pythonic @firewall decorator.

import anthropic
from contextfirewall import firewall

client = anthropic.Anthropic()

# Automatically intercept and redact any positional `prompt` argument!
@firewall(mode="REDACT_ALL")
def safe_claude_call(prompt: str):
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

# You pass: "Send to alice@example.com"
# Claude actually sees: "Send to [EMAIL_REDACTED]"
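A decorator like this can be sketched in a few lines of plain Python. The sketch below is illustrative only and is not the library's source: it redacts just emails with a regex stand-in for the real scanner, and `firewall_sketch` is a hypothetical name.

```python
import functools
import re

# Stand-in for the library's scanner: here we only redact emails.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def firewall_sketch(func):
    """Redact sensitive data in the first positional string argument."""
    @functools.wraps(func)
    def wrapper(prompt, *args, **kwargs):
        safe = EMAIL_RE.sub("[EMAIL_REDACTED]", prompt)
        return func(safe, *args, **kwargs)
    return wrapper

@firewall_sketch
def echo(prompt: str) -> str:
    return prompt

print(echo("Send to alice@example.com"))
# -> "Send to [EMAIL_REDACTED]"
```

The wrapped function never sees the raw value, which is exactly the property the `@firewall` decorator provides for the Claude call above.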

🧠 Core Concepts

1. Scan Results

The output of check() is a rich Result dataclass that gives full visibility into what was detected:

| Property | Type | Description |
|----------|------|-------------|
| action | str | One of "ALLOW", "REDACT", "BLOCK", or "AUDIT". |
| score | int | The calculated aggregate risk score (between 1 and 10). |
| found | list[str] | A list of detected types (e.g. ["EMAIL", "IP_ADDRESS"]). |
| findings | list[Finding] | Raw metadata objects with the exact string matched and its surrounding context. |
| cleaned | str \| None | The fully redacted prompt (populated only if action is "REDACT"). |
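The shape described above corresponds roughly to dataclasses like these (an illustrative sketch, not the library's source; field names follow the table, everything else is assumed):

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    type: str      # e.g. "EMAIL"
    match: str     # the exact string that was matched
    context: str   # surrounding text for the report

@dataclass
class Result:
    action: str                           # "ALLOW" | "REDACT" | "BLOCK" | "AUDIT"
    score: int                            # aggregate risk, 1-10
    found: list[str] = field(default_factory=list)
    findings: list[Finding] = field(default_factory=list)
    cleaned: "str | None" = None          # only set when action == "REDACT"
    reason: str = ""

r = Result(action="BLOCK", score=9, found=["EMAIL", "STRIPE_KEY"])
print(r.action, r.score)
# -> BLOCK 9
```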

2. Classification Modes

Control how aggressively the firewall behaves using the mode parameter.

| Mode | Behavior |
|------|----------|
| Mode.STANDARD | (Default) Allows low-risk (e.g. public URLs), redacts mid-risk (e.g. emails), blocks high-risk (e.g. API keys). |
| Mode.STRICT | Blocks the prompt entirely if anything sensitive is found. |
| Mode.PERMISSIVE | Only blocks extreme risk (score >= 8); allows everything else through. |
| Mode.REDACT_ALL | Never blocks; simply redacts everything sensitive it finds. |
| Mode.AUDIT | Never blocks and never redacts; logs the findings and returns an "AUDIT" action. |
| Mode.PARANOID | Adds Shannon (log2) entropy scanning to catch generic, unknown secrets. |

from contextfirewall import check, Mode
res = check(text, mode=Mode.STRICT)

3. Redaction Styles

When returning a .cleaned string, how should the data be masked?

| Style | What it Looks Like |
|-------|--------------------|
| RedactStyle.PLACEHOLDER | Send to [EMAIL_REDACTED]. (Default) |
| RedactStyle.CATEGORY | Send to [PII_REDACTED]. |
| RedactStyle.MASK | Send to a***@example.com. (Intelligently keeps partial structure) |
| RedactStyle.HASH | Send to [HASH_8a92b3c1]. (One-way cryptographic hash) |

from contextfirewall import check, RedactStyle
res = check("Call 555-0192", mode="REDACT_ALL", redact_style=RedactStyle.MASK)
# -> "Call 5*******2"

4. Scoring Strategies

How the aggregate risk score (1-10) is calculated.

  • ScoringStrategy.WEIGHTED: (Default) Takes the maximum-risk finding, then adds small penalties for high density (many findings packed together) and high variety (many different types of secrets). This makes prompt-injection attempts score higher.
  • ScoringStrategy.MAX: Uses the score of the highest-risk item alone.
  • ScoringStrategy.MEAN: The average score across all findings.
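The WEIGHTED strategy can be approximated like this. The penalty thresholds and weights below are assumptions for illustration; the library's actual constants may differ.

```python
def weighted_score(findings: list[tuple[str, int]], text_len: int) -> int:
    """Max finding risk plus assumed density/variety penalties, capped at 10."""
    if not findings:
        return 0
    risks = [risk for _, risk in findings]
    base = max(risks)                                  # highest single risk
    per_1k = len(findings) / max(text_len / 1000, 1)   # findings per 1k chars
    density_penalty = 1 if per_1k > 5 else 0           # assumed threshold
    variety_penalty = 1 if len({t for t, _ in findings}) >= 3 else 0
    return min(10, base + density_penalty + variety_penalty)

print(weighted_score([("EMAIL", 5), ("JWT", 7)], text_len=80))
# -> 7
```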

🔍 Built-in Detectors

The library deterministically detects more than 25 entity types.

Personally Identifiable Information (PII)

  • Emails, Phone Numbers, Social Security Numbers (SSN), Passports, Date of Birth.

Secrets & Keys

  • Standard API Keys, JWT Tokens, GitHub Tokens, Slack Tokens, Google Keys, AWS Keys, RSA Private Keys.

Infrastructure

  • Local Internal URLs, Database Connection Strings (Postgres/Redis), Private IPs (10.x, 192.168.x).
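A private-IP check like the one above can be sketched with the standard-library `ipaddress` module (an independent sketch, not the library's internals; `private_ips` is a hypothetical helper):

```python
import ipaddress
import re

IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def private_ips(text: str) -> list[str]:
    """Return RFC 1918 private addresses found in the text."""
    hits = []
    for raw in IP_RE.findall(text):
        try:
            ip = ipaddress.ip_address(raw)
        except ValueError:
            continue  # e.g. 999.1.1.1 matches the regex but is not a valid IP
        if ip.is_private:
            hits.append(raw)
    return hits

print(private_ips("Prod DB is at 10.0.4.2, public mirror at 8.8.8.8"))
# -> ['10.0.4.2']
```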

Financial & Medical

  • Credit Cards, IBANs, Stripe Keys, Patient Medical Record Numbers (MRN).
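Credit-card detection typically pairs a digit-pattern regex with a Luhn checksum to cut false positives. A minimal sketch of the checksum (illustrative; not necessarily how the library validates):

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_valid("4242 4242 4242 4242"))  # classic test card number
# -> True
print(luhn_valid("1234 5678 9012 3456"))
# -> False
```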

Catching The Unknown (Paranoia Entropy)

If a user pastes a highly sensitive password or token that doesn't fit a standard API-key format, regex alone will miss it. Paranoid Mode adds a Shannon (log2) entropy scan that finds strings with high randomness (e.g. XyZ123!@LmnOpQrS456) and flags them.

res = check("Here is my deployment password: XyZ123!@LmnOpQrS456", mode=Mode.PARANOID)
# Blocked! Category: SECRET. SubType: HIGH_ENTROPY
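The Shannon entropy behind Paranoid Mode takes only a few lines to compute (a sketch; any flagging threshold the library applies on top of this is not shown here):

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits per character: -sum(p * log2(p)) over character frequencies."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A random-looking token scores near the maximum for its length...
print(round(shannon_entropy("XyZ123!@LmnOpQrS456"), 2))
# -> 4.25

# ...while ordinary English text scores noticeably lower.
print(round(shannon_entropy("the quick brown fox"), 2))
```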

🛡️ Compliance Presets

If you are building apps for heavily regulated industries, you don't need to configure rules by hand. Use our one-click presets:

from contextfirewall.presets import HIPAA, PCI_DSS, SOC2, GDPR

# HIPAA Preset enforces immediate Blocking on Medical Records & SSNs
# and strictly enforces high-redaction on all other data.
result = check(patient_text, preset=HIPAA)

🔌 Deep Customization

1. Add Custom Regex Policies

If you need to block internal company data (like employee IDs), build a rule:

from contextfirewall import add_rule

# Give it a name, a regex pattern, and a risk level between 1 and 10.
add_rule(name="EMPLOYEE_ID", pattern=r"EMP-\d+", risk=5)

2. Add Custom Python Evaluators

If you want to dynamically evaluate text against an external list without regex, you can write Python hooks:

from contextfirewall import add_detector

def detect_competitors(text):
    competitors = ["AcmeCorp", "GlobalTech"]
    found = [c for c in competitors if c in text]
    return found

add_detector(name="COMPETITOR_MENTION", func=detect_competitors, risk=3)

3. Global Configurations

Define your configurations once on application startup so you don't have to keep passing arguments:

from contextfirewall import set_mode, set_redact_style

set_mode(Mode.STRICT)
set_redact_style(RedactStyle.HASH)

📊 Analytics & Reporting

To see exactly why a prompt failed, right in your terminal, use .explain():

from contextfirewall import explain
print(explain("My JWT is eyJhbG..."))

Output:

ContextFirewall Report
----------------------
Action:     BLOCK
Mode:       STANDARD
Score:      7 (via WEIGHTED)
Density:    55.56 items/1k chars
Duration:   0.14 ms

Reason:     High-risk data detected (JWT), score 7 — blocked by STANDARD policy

Findings (1):
  1. [HIGH] JWT (risk: 7)
     Context: "...My JWT is eyJhbG..."

📄 License

Open-source under the MIT License — © Suhaib Bin Younis
