Code similarity firewall - blocks dangerous code patterns before they reach execution tools

Code Firewall MCP

A structural similarity-based code security filter for MCP (Model Context Protocol). Blocks dangerous code patterns before they reach execution tools by comparing code structure against a blacklist of known-bad patterns.

How It Works

┌──────────┐    ┌─────────────┐    ┌───────────────────┐    ┌─────────────────────┐
│ Code     │───▶│ CST → Embed │───▶│ Similarity Check  │───▶│ Execution Tools     │
│ (file)   │    │ (tree-sitter│    │ vs blacklist      │    │ (rlm_exec, etc.)    │
└──────────┘    │  + Ollama)  │    │ (ChromaDB)        │    └─────────────────────┘
                └─────────────┘    └─────────┬─────────┘
                                             │
                                    ┌────────┴────────┐
                                    ▼                 ▼
                               [BLOCKED]         [ALLOWED]

  1. Parse code to Concrete Syntax Tree (CST) using tree-sitter
  2. Normalize by stripping identifiers and literals → structural skeleton
  3. Embed the normalized structure via Ollama
  4. Compare against blacklisted patterns in ChromaDB
  5. Block if similarity exceeds threshold, otherwise allow
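
Steps 4 and 5 amount to a nearest-neighbour lookup followed by a threshold test. The sketch below is a minimal illustration, assuming cosine similarity and a plain dict in place of ChromaDB. The names cosine_similarity and check_embedding, the "matched" key, and the toy vectors are made up for illustration and are not the package's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def check_embedding(embedding, blacklist, threshold=0.85):
    """Compare one embedding against every blacklisted embedding.

    `blacklist` maps a hypothetical pattern ID to its embedding vector.
    The returned dict is loosely modelled on the documented result shape.
    """
    best_id, best_score = None, 0.0
    for pattern_id, vector in blacklist.items():
        score = cosine_similarity(embedding, vector)
        if score > best_score:
            best_id, best_score = pattern_id, score
    # Assumption: "exceeds threshold" is treated as >= here.
    blocked = best_score >= threshold
    return {
        "allowed": not blocked,
        "blocked": blocked,
        "similarity": best_score,
        "matched": best_id,
    }


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
blacklist = {"abc123": [1.0, 0.0, 0.0]}
near_identical = check_embedding([0.99, 0.05, 0.0], blacklist)
unrelated = check_embedding([0.0, 1.0, 0.0], blacklist)
print(near_identical["blocked"], unrelated["blocked"])
```

A vector database does the same comparison at scale, returning only the closest stored patterns instead of scanning every entry.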

Key Insight

Code patterns like os.system("rm -rf /") and os.system("ls") have identical structure. By normalizing away the specific commands/identifiers, we can detect dangerous patterns regardless of the specific arguments used.
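
One way to see this for Python code is with the standard-library ast module. This is only a stand-in for the tree-sitter CST the firewall describes, and the Skeleton class and skeleton function are hypothetical names used for illustration.

```python
import ast


class Skeleton(ast.NodeTransformer):
    """Overwrite identifiers and literals with fixed placeholders."""

    def visit_Name(self, node):
        node.id = "_"
        return node

    def visit_Attribute(self, node):
        self.generic_visit(node)  # normalize the object being accessed first
        node.attr = "_"
        return node

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            node.value = "S"
        elif isinstance(node.value, (int, float, complex)) and not isinstance(node.value, bool):
            node.value = 0
        return node


def skeleton(source):
    """Return a textual dump of the tree with names and literals erased."""
    tree = Skeleton().visit(ast.parse(source))
    return ast.dump(tree)


same = skeleton('os.system("rm -rf /")') == skeleton('os.system("ls")')
different = skeleton('os.system("ls")') != skeleton("print(1 + 2)")
print(same, different)
```

Both os.system calls reduce to the same dump (an attribute call with one string argument), while print(1 + 2) has a different tree shape and stays distinct.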

Installation

# Via uvx (recommended)
uvx code-firewall-mcp

# Or install from source
pip install -e .

Requirements

  • Python 3.10+
  • Ollama (for embeddings)
  • ChromaDB (for vector storage)
  • tree-sitter (optional, for better parsing)

Pull an embedding model:

ollama pull nomic-embed-text

Tools

firewall_check

Check if a code file is safe to pass to execution tools.

result = await firewall_check(file_path="/path/to/script.py")
# Returns: {allowed: bool, blocked: bool, similarity: float, ...}

firewall_check_code

Check code string directly (no file required).

result = await firewall_check_code(
    code="import os; os.system('rm -rf /')",
    language="python"
)

firewall_blacklist

Add a dangerous pattern to the blacklist.

result = await firewall_blacklist(
    code="os.system(arbitrary_command)",
    reason="Arbitrary command execution",
    severity="critical"
)

firewall_record_delta

Record near-miss variants to sharpen the classifier.

result = await firewall_record_delta(
    code="subprocess.run(['ls', '-la'])",
    similar_to="abc123",
    notes="Legitimate use case for file listing"
)

firewall_list_patterns

List patterns in the blacklist or delta collection.

firewall_remove_pattern

Remove a pattern from blacklist or deltas.

firewall_status

Get firewall status and statistics.

Configuration

Environment variables:

Variable              Default                 Description
FIREWALL_DATA_DIR     /tmp/code-firewall      Data storage directory
OLLAMA_URL            http://localhost:11434  Ollama server URL
EMBEDDING_MODEL       nomic-embed-text        Ollama embedding model
SIMILARITY_THRESHOLD  0.85                    Block threshold (0-1)
NEAR_MISS_THRESHOLD   0.70                    Near-miss recording threshold
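
The two thresholds split similarity scores into three bands. The following sketch shows how they might be read and applied. The function names load_thresholds and classify, the ordering check, and the use of >= at the boundaries are assumptions made for illustration, not documented behaviour.

```python
import os


def load_thresholds(env=os.environ):
    """Read both thresholds, falling back to the documented defaults."""
    block = float(env.get("SIMILARITY_THRESHOLD", "0.85"))
    near_miss = float(env.get("NEAR_MISS_THRESHOLD", "0.70"))
    # Assumption: a near-miss band only makes sense below the block threshold.
    if not 0.0 <= near_miss <= block <= 1.0:
        raise ValueError(
            "expected 0 <= NEAR_MISS_THRESHOLD <= SIMILARITY_THRESHOLD <= 1"
        )
    return block, near_miss


def classify(similarity, block, near_miss):
    """Map a similarity score to one of three outcomes."""
    if similarity >= block:
        return "blocked"
    if similarity >= near_miss:
        return "near_miss"
    return "allowed"


block, near_miss = load_thresholds({})  # empty mapping, so defaults apply
for score in (0.92, 0.75, 0.30):
    print(score, classify(score, block, near_miss))
```

Lowering SIMILARITY_THRESHOLD widens the blocked band and catches looser variants, at the cost of more false positives.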

Usage Pattern

Pre-filter for massive-context-mcp

Use code-firewall-mcp as a gatekeeper before passing code to rlm_exec:

async def run_if_allowed(user_code):
    # 1. Check code safety
    check = await firewall_check_code(code=user_code, language="python")

    if check["blocked"]:
        print(f"BLOCKED: {check['reason']}")
        return None

    # 2. If allowed, proceed with execution
    return await rlm_exec(code=user_code, context_name="my-context")
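
To try the gating flow without a running MCP server, the tools can be replaced with stand-in coroutines. In this sketch, fake_firewall_check_code and fake_rlm_exec are hypothetical stubs and guarded_exec is an illustrative wrapper. The stub's substring rule is a placeholder and does not reflect how the real firewall decides.

```python
import asyncio


async def fake_firewall_check_code(code, language="python"):
    """Stub: flags anything containing 'os.system'. The real tool compares structure."""
    blocked = "os.system" in code
    return {
        "allowed": not blocked,
        "blocked": blocked,
        "reason": "matches stub rule" if blocked else None,
    }


async def fake_rlm_exec(code, context_name):
    """Stub: pretends to execute and reports which context it received."""
    return {"executed": True, "context": context_name}


async def guarded_exec(code, context_name):
    """Run the check first; call the execution tool only when it passes."""
    check = await fake_firewall_check_code(code=code, language="python")
    if check["blocked"]:
        return {"executed": False, "reason": check["reason"]}
    return await fake_rlm_exec(code=code, context_name=context_name)


safe = asyncio.run(guarded_exec("print('hi')", "my-context"))
unsafe = asyncio.run(guarded_exec("import os; os.system('rm -rf /')", "my-context"))
print(safe["executed"], unsafe["executed"])
```

Keeping the check and the execution call inside one wrapper means no code path reaches the execution tool without going through the firewall.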

Building the Blacklist

The blacklist grows through use:

  1. Initial seeding: Add known dangerous patterns
  2. Audit feedback: When rlm_auto_analyze finds security issues, add patterns
  3. Delta sharpening: Record near-misses to improve classification boundaries

# After security audit finds issues
await firewall_blacklist(
    code=dangerous_code,
    reason="Command injection via subprocess",
    severity="critical"
)

Structural Normalization

The normalizer strips:

  • Identifiers: my_var → _
  • String literals: "hello" → "S"
  • Numbers: 42 → N
  • Comments: Removed entirely

Example:

# Original
subprocess.run(["curl", url, "-o", output_file])

# Normalized
_._(["S", _, "S", _])

Both subprocess.run(["curl", ...]) and subprocess.run(["wget", ...]) normalize to the same structure, so blacklisting one catches both.
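
A rough approximation of these rules can be written with Python's tokenize module. This is a token-level sketch, not the tree-sitter-based normalizer the package describes. The function name normalize is hypothetical, and the sketch drops whitespace and indentation, so it is only meant to show the substitution rules.

```python
import io
import keyword
import tokenize


def normalize(source):
    """Return a whitespace-free structural skeleton of Python source."""
    parts = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keep keywords (if, for, import, ...) since they carry structure.
            parts.append(tok.string if keyword.iskeyword(tok.string) else "_")
        elif tok.type == tokenize.STRING:
            parts.append('"S"')
        elif tok.type == tokenize.NUMBER:
            parts.append("N")
        elif tok.type == tokenize.OP:
            parts.append(tok.string)
        # Comments, newlines, indentation, and the end marker are dropped.
    return "".join(parts)


curl = normalize('subprocess.run(["curl", url, "-o", output_file])')
wget = normalize('subprocess.run(["wget", target, "-O", path])  # fetch')
print(curl)          # _._(["S",_,"S",_])
print(curl == wget)  # True
```

The two calls differ in command, variable names, and a trailing comment, yet produce the same skeleton string, which is what lets a single blacklist entry cover both.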

License

MIT
