
Code similarity firewall - blocks dangerous code patterns before they reach execution tools


Code Firewall MCP

A structural similarity-based code security filter for MCP (Model Context Protocol). It blocks dangerous code patterns before they reach execution tools by comparing code structure against a blacklist of known-bad patterns.

How It Works

┌──────────┐    ┌─────────────┐    ┌───────────────────┐    ┌─────────────────────┐
│ Code     │───▶│ CST → Embed │───▶│ Similarity Check  │───▶│ Execution Tools     │
│ (file)   │    │ (tree-sitter│    │ vs blacklist      │    │ (rlm_exec, etc.)    │
└──────────┘    │  + Ollama)  │    │ (ChromaDB)        │    └─────────────────────┘
                └─────────────┘    └─────────┬─────────┘
                                             │
                                    ┌────────┴────────┐
                                    ▼                 ▼
                               [BLOCKED]         [ALLOWED]

  1. Parse code to Concrete Syntax Tree (CST) using tree-sitter
  2. Normalize by stripping identifiers and literals → structural skeleton
  3. Embed the normalized structure via Ollama
  4. Compare against blacklisted patterns in ChromaDB
  5. Block if similarity exceeds threshold, otherwise allow
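
The final two steps can be sketched in plain Python. This is a minimal illustration, not the package's implementation: it assumes cosine similarity (the source does not say which metric is used), the names `cosine_similarity` and `check_embedding` are hypothetical, and the vectors are made up. In the real pipeline the embeddings would come from Ollama and the nearest-neighbour lookup would be done by ChromaDB.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def check_embedding(embedding, blacklist, threshold=0.85):
    """Block when the closest blacklisted vector is more similar than the threshold."""
    best = max((cosine_similarity(embedding, v) for v in blacklist), default=0.0)
    blocked = best > threshold
    return {"allowed": not blocked, "blocked": blocked, "similarity": best}


# Made-up 2-dimensional "embeddings", for illustration only.
blacklist = [[1.0, 0.0]]
print(check_embedding([0.9, 0.1], blacklist)["blocked"])
print(check_embedding([0.0, 1.0], blacklist)["blocked"])
```

The default of 0.85 mirrors the documented SIMILARITY_THRESHOLD default.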

Key Insight

Calls such as os.system("rm -rf /") and os.system("ls") have identical structure; only the string argument differs. Normalizing away identifiers and literals lets the firewall match a dangerous pattern regardless of the specific arguments used.
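
One way to see this is to compare the two calls by node type only. The sketch below uses Python's built-in `ast` module as a stand-in for the tree-sitter CST described above, and `shape` is a hypothetical helper, not part of this package.

```python
import ast


def shape(source: str) -> list[str]:
    """Return the AST node type names in traversal order, ignoring names and literal values."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


dangerous = shape('os.system("rm -rf /")')
harmless = shape('os.system("ls")')

# Both calls reduce to the same sequence of node types.
print(dangerous == harmless)  # True
```

Because the comparison ignores the argument value, a blacklist entry for one call matches the other as well.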

Installation

# Via uvx (recommended)
uvx code-firewall-mcp

# Or install from source
pip install -e .

Requirements

  • Python 3.10+
  • Ollama (for embeddings)
  • ChromaDB (for vector storage)
  • tree-sitter (optional, for better parsing)

Pull an embedding model:

ollama pull nomic-embed-text

Tools

firewall_check

Check if a code file is safe to pass to execution tools.

result = await firewall_check(file_path="/path/to/script.py")
# Returns: {allowed: bool, blocked: bool, similarity: float, ...}

firewall_check_code

Check code string directly (no file required).

result = await firewall_check_code(
    code="import os; os.system('rm -rf /')",
    language="python"
)

firewall_blacklist

Add a dangerous pattern to the blacklist.

result = await firewall_blacklist(
    code="os.system(arbitrary_command)",
    reason="Arbitrary command execution",
    severity="critical"
)

firewall_record_delta

Record near-miss variants to sharpen the classifier.

result = await firewall_record_delta(
    code="subprocess.run(['ls', '-la'])",
    similar_to="abc123",
    notes="Legitimate use case for file listing"
)

firewall_list_patterns

List patterns in the blacklist or delta collection.

firewall_remove_pattern

Remove a pattern from blacklist or deltas.

firewall_status

Get firewall status and statistics.

Configuration

Environment variables:

Variable              Default                 Description
FIREWALL_DATA_DIR     /tmp/code-firewall      Data storage directory
OLLAMA_URL            http://localhost:11434  Ollama server URL
EMBEDDING_MODEL       nomic-embed-text        Ollama embedding model
SIMILARITY_THRESHOLD  0.85                    Block threshold (0-1)
NEAR_MISS_THRESHOLD   0.70                    Near-miss recording threshold
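
As an illustration of how these variables and defaults fit together, the sketch below reads them into a small config object. The variable names and defaults come from the table above; `FirewallConfig` and `load_config` are hypothetical names, and the package itself may read its settings differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallConfig:
    data_dir: str
    ollama_url: str
    embedding_model: str
    similarity_threshold: float
    near_miss_threshold: float


def load_config(env=None) -> FirewallConfig:
    """Read settings from a mapping (os.environ by default), falling back to the documented defaults."""
    env = os.environ if env is None else env
    return FirewallConfig(
        data_dir=env.get("FIREWALL_DATA_DIR", "/tmp/code-firewall"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text"),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.85")),
        near_miss_threshold=float(env.get("NEAR_MISS_THRESHOLD", "0.70")),
    )


# Override one value and keep the rest at their defaults.
config = load_config({"SIMILARITY_THRESHOLD": "0.9"})
print(config.similarity_threshold, config.embedding_model)
```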

Usage Pattern

Pre-filter for massive-context-mcp

Use code-firewall-mcp as a gatekeeper before passing code to rlm_exec:

async def run_if_allowed(user_code: str):
    # 1. Check code safety
    check = await firewall_check_code(code=user_code, language="python")

    if check["blocked"]:
        print(f"BLOCKED: {check['reason']}")
        return None

    # 2. If allowed, proceed with execution
    return await rlm_exec(code=user_code, context_name="my-context")

Building the Blacklist

The blacklist grows through use:

  1. Initial seeding: Add known dangerous patterns
  2. Audit feedback: When rlm_auto_analyze finds security issues, add patterns
  3. Delta sharpening: Record near-misses to improve classification boundaries

# After security audit finds issues
await firewall_blacklist(
    code=dangerous_code,
    reason="Command injection via subprocess",
    severity="critical"
)
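
The delta-sharpening step in the list above depends on telling a near-miss apart from a block. A minimal sketch follows, assuming a score between NEAR_MISS_THRESHOLD and SIMILARITY_THRESHOLD counts as a near-miss. The function name `classify` and the exact boundary handling are assumptions, not taken from the package.

```python
def classify(similarity: float, block_at: float = 0.85, near_miss_at: float = 0.70) -> str:
    """Map a similarity score to one of three outcomes using the two documented thresholds."""
    if similarity > block_at:
        return "blocked"
    if similarity >= near_miss_at:
        # Candidate for firewall_record_delta: close to a bad pattern, but not blocked.
        return "near_miss"
    return "allowed"


for score in (0.92, 0.78, 0.40):
    print(score, classify(score))
```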

Structural Normalization

The normalizer strips:

  • Identifiers: my_var → _
  • String literals: "hello" → "S"
  • Numbers: 42 → N
  • Comments: Removed entirely

Example:

# Original
subprocess.run(["curl", url, "-o", output_file])

# Normalized
_._(["S", _, "S", _])

Both subprocess.run(["curl", ...]) and subprocess.run(["wget", ...]) normalize to the same structure, so blacklisting one catches both.
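
The rules above can be approximated at the token level with Python's standard `tokenize` module. This is a rough sketch rather than the package's tree-sitter-based normalizer: `normalize` is a hypothetical name, keywords are kept as-is (an assumption), and newlines and indentation are dropped, so it suits single statements only.

```python
import io
import keyword
import tokenize


def normalize(source: str) -> str:
    """Replace identifiers with _, strings with "S", numbers with N, and drop comments."""
    pieces = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keep keywords (padded so they do not fuse with neighbours); mask other names.
            pieces.append(f" {tok.string} " if keyword.iskeyword(tok.string) else "_")
        elif tok.type == tokenize.STRING:
            pieces.append('"S"')
        elif tok.type == tokenize.NUMBER:
            pieces.append("N")
        elif tok.type == tokenize.OP:
            pieces.append(", " if tok.string == "," else tok.string)
        # COMMENT, NEWLINE, INDENT and similar layout tokens are dropped.
    return " ".join("".join(pieces).split())


curl = normalize('subprocess.run(["curl", url, "-o", output_file])')
wget = normalize('subprocess.run(["wget", url, "-O", output_file])')
print(curl)          # _._(["S", _, "S", _])
print(curl == wget)  # True
```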

License

MIT
