Code similarity firewall - blocks dangerous code patterns before they reach execution tools
Code Firewall MCP
A structural similarity-based code security filter for MCP (Model Context Protocol). Blocks dangerous code patterns before they reach execution tools by comparing code structure against a blacklist of known-bad patterns.
How It Works
┌──────────┐    ┌─────────────┐    ┌───────────────────┐    ┌─────────────────────┐
│   Code   │───▶│ CST → Embed │───▶│ Similarity Check  │───▶│  Execution Tools    │
│  (file)  │    │ (tree-sitter│    │   vs blacklist    │    │  (rlm_exec, etc.)   │
└──────────┘    │  + Ollama)  │    │    (ChromaDB)     │    └─────────────────────┘
                └─────────────┘    └─────────┬─────────┘
                                             │
                                    ┌────────┴────────┐
                                    ▼                 ▼
                                [BLOCKED]         [ALLOWED]
- Parse code to Concrete Syntax Tree (CST) using tree-sitter
- Normalize by stripping identifiers and literals → structural skeleton
- Embed the normalized structure via Ollama
- Compare against blacklisted patterns in ChromaDB
- Block if similarity exceeds threshold, otherwise allow
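The final two steps (compare, then block or allow) can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the toy vectors stand in for Ollama embeddings, the dictionary stands in for the ChromaDB collection, and the names `cosine_similarity` and `decide` are hypothetical. It also assumes cosine similarity is the metric and that "exceeds" means strictly greater than the threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def decide(candidate, blacklist, threshold=0.85):
    """Compare a candidate embedding against every blacklisted embedding.

    `blacklist` maps a pattern id to its embedding (hypothetical layout;
    the real project stores these in ChromaDB).
    """
    best_id, best_sim = None, 0.0
    for pattern_id, vector in blacklist.items():
        sim = cosine_similarity(candidate, vector)
        if sim > best_sim:
            best_id, best_sim = pattern_id, sim
    blocked = best_sim > threshold  # assumption: strict comparison
    return {
        "allowed": not blocked,
        "blocked": blocked,
        "similarity": best_sim,
        "matched": best_id if blocked else None,
    }
```

A candidate that points in nearly the same direction as a blacklisted vector is blocked; an unrelated one is allowed with a low similarity score.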
Key Insight
Code patterns like os.system("rm -rf /") and os.system("ls") have identical structure. By normalizing away the specific commands/identifiers, we can detect dangerous patterns regardless of the specific arguments used.
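To make the claim concrete, the sketch below uses Python's standard `ast` module as a stand-in for tree-sitter (an assumption made only to keep the example dependency-free). The helper name `skeleton` is hypothetical. It blanks out names, attribute names, and constants, then compares the dumped trees.

```python
import ast


def skeleton(source: str) -> str:
    """Return a dump of the syntax tree with identifiers and literals blanked."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            node.id = "_"
        elif isinstance(node, ast.Attribute):
            node.attr = "_"
        elif isinstance(node, ast.Constant):
            if isinstance(node.value, str):
                node.value = "S"
            elif isinstance(node.value, (int, float, complex)) and not isinstance(
                node.value, bool
            ):
                node.value = 0
    # ast.dump omits line and column info by default, so only structure remains
    return ast.dump(tree)


same = skeleton('os.system("rm -rf /")') == skeleton('os.system("ls")')
different = skeleton('os.system("ls")') != skeleton("print(1)")
```

Here `same` is true because both calls reduce to the same attribute-call-with-one-string shape, while `different` is true because a bare function call with a number has another shape. The same property means a harmless call can match a dangerous one, which is the case the near-miss deltas are meant to record.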
Installation
# Via uvx (recommended)
uvx code-firewall-mcp
# Or install from source
pip install -e .
Requirements
- Python 3.10+
- Ollama (for embeddings)
- ChromaDB (for vector storage)
- tree-sitter (optional, for better parsing)
Pull an embedding model:
ollama pull nomic-embed-text
Tools
firewall_check
Check if a code file is safe to pass to execution tools.
result = await firewall_check(file_path="/path/to/script.py")
# Returns: {allowed: bool, blocked: bool, similarity: float, ...}
firewall_check_code
Check code string directly (no file required).
result = await firewall_check_code(
    code="import os; os.system('rm -rf /')",
    language="python"
)
firewall_blacklist
Add a dangerous pattern to the blacklist.
result = await firewall_blacklist(
    code="os.system(arbitrary_command)",
    reason="Arbitrary command execution",
    severity="critical"
)
firewall_record_delta
Record near-miss variants to sharpen the classifier.
result = await firewall_record_delta(
    code="subprocess.run(['ls', '-la'])",
    similar_to="abc123",
    notes="Legitimate use case for file listing"
)
firewall_list_patterns
List patterns in the blacklist or delta collection.
firewall_remove_pattern
Remove a pattern from blacklist or deltas.
firewall_status
Get firewall status and statistics.
Configuration
Environment variables:
| Variable | Default | Description |
|---|---|---|
| FIREWALL_DATA_DIR | /tmp/code-firewall | Data storage directory |
| OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| EMBEDDING_MODEL | nomic-embed-text | Ollama embedding model |
| SIMILARITY_THRESHOLD | 0.85 | Block threshold (0-1) |
| NEAR_MISS_THRESHOLD | 0.70 | Near-miss recording threshold |
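As an illustration of how these variables might be consumed, the sketch below reads them with the documented defaults and maps a similarity score onto three bands. The `FirewallConfig`, `load_config`, and `classify` names are hypothetical, and the banding rule (scores between the two thresholds count as near-misses) is an assumption about how the two thresholds interact, not documented behaviour.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallConfig:
    data_dir: str
    ollama_url: str
    embedding_model: str
    similarity_threshold: float
    near_miss_threshold: float


def _unit_float(env, name, default):
    """Parse a threshold and insist that it lies in [0, 1]."""
    value = float(env.get(name, default))
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return value


def load_config(env=None):
    """Build a config from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return FirewallConfig(
        data_dir=env.get("FIREWALL_DATA_DIR", "/tmp/code-firewall"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text"),
        similarity_threshold=_unit_float(env, "SIMILARITY_THRESHOLD", "0.85"),
        near_miss_threshold=_unit_float(env, "NEAR_MISS_THRESHOLD", "0.70"),
    )


def classify(similarity, cfg):
    """Assumed banding: above block threshold, above near-miss threshold, or neither."""
    if similarity > cfg.similarity_threshold:
        return "blocked"
    if similarity > cfg.near_miss_threshold:
        return "near_miss"
    return "allowed"
```

With the defaults, a score of 0.9 falls in the blocked band, 0.75 in the near-miss band, and 0.5 is allowed.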
Usage Pattern
Pre-filter for massive-context-mcp
Use code-firewall-mcp as a gatekeeper before passing code to rlm_exec:
# 1. Check code safety
check = await firewall_check_code(user_code)
if check["blocked"]:
    print(f"BLOCKED: {check['reason']}")
    return
# 2. If allowed, proceed with execution
result = await rlm_exec(code=user_code, context_name="my-context")
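The same gatekeeper pattern can be wrapped in a reusable helper. The sketch below is one possible shape, with hypothetical names throughout (`guarded_exec`, `BlockedByFirewall`, and the two fake coroutines that stand in for `firewall_check_code` and `rlm_exec`). It treats a verdict without a `blocked` key as blocked, which is a fail-closed design choice rather than documented behaviour.

```python
import asyncio


class BlockedByFirewall(Exception):
    """Raised when the firewall verdict says the code must not run."""


async def guarded_exec(code, check, execute):
    """Run `execute(code)` only if `check(code)` does not block it."""
    verdict = await check(code)
    # Fail closed: a verdict without a "blocked" key is treated as blocked.
    if verdict.get("blocked", True):
        raise BlockedByFirewall(verdict.get("reason", "no reason given"))
    return await execute(code)


# Stand-ins for the real tools, for demonstration only.
async def fake_check(code):
    blocked = "os.system" in code
    return {"blocked": blocked, "allowed": not blocked, "reason": "demo rule"}


async def fake_exec(code):
    return f"ran {len(code)} chars"


if __name__ == "__main__":
    print(asyncio.run(guarded_exec("print(1)", fake_check, fake_exec)))
```

Raising an exception instead of returning early keeps the blocked path from being silently ignored by callers.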
Building the Blacklist
The blacklist grows through use:
- Initial seeding: Add known dangerous patterns
- Audit feedback: When rlm_auto_analyze finds security issues, add patterns
- Delta sharpening: Record near-misses to improve classification boundaries
# After security audit finds issues
await firewall_blacklist(
    code=dangerous_code,
    reason="Command injection via subprocess",
    severity="critical"
)
Structural Normalization
The normalizer strips:
- Identifiers: my_var → _
- String literals: "hello" → "S"
- Numbers: 42 → N
- Comments: Removed entirely
Example:
# Original
subprocess.run(["curl", url, "-o", output_file])
# Normalized
_._(["S", _, "S", _])
Both subprocess.run(["curl", ...]) and subprocess.run(["wget", ...]) normalize to the same structure, so blacklisting one catches both.
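The stripping rules can be approximated with Python's standard `tokenize` module. This is a rough sketch for Python input only, not the project's tree-sitter normalizer; the function name `normalize` is hypothetical, and keeping keywords verbatim is an assumption.

```python
import io
import keyword
import tokenize


def normalize(source: str) -> str:
    """Replace identifiers, strings, and numbers with placeholders; drop comments."""
    parts = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keywords are kept (assumption); every other name becomes "_"
            parts.append(f" {tok.string} " if keyword.iskeyword(tok.string) else "_")
        elif tok.type == tokenize.STRING:
            parts.append('"S"')
        elif tok.type == tokenize.NUMBER:
            parts.append("N")
        elif tok.type == tokenize.OP:
            parts.append(", " if tok.string == "," else tok.string)
        elif tok.type == tokenize.NEWLINE:
            parts.append("\n")
        # Comments, indentation, and blank lines are dropped
    lines = "".join(parts).splitlines()
    return "\n".join(" ".join(line.split()) for line in lines)


curl = normalize('subprocess.run(["curl", url, "-o", output_file])')
wget = normalize('subprocess.run(["wget", url, "-O", output_file])')
```

Both `curl` and `wget` come out as `_._(["S", _, "S", _])`, matching the normalized form shown in the example above.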
License
MIT