Pluggable multi-layer LLM jailbreak defense pipeline
aegis-llm
A pluggable, multi-layer LLM jailbreak defense pipeline for Python. Drop it into any codebase — framework-agnostic, provider-agnostic, zero mandatory dependencies.
Install
pip install aegis-llm
Or from source:
git clone https://github.com/your-org/aegis-llm
cd aegis-llm
pip install -e .
Quickstart
from aegis import Decision, Request, SplitPipeline
from aegis.layers import (
    AuditLogger, InputValidation, OutputFilter,
    RateLimiter, ToolAccess,
)

pipeline = SplitPipeline(
    pre=[
        InputValidation(),
        ToolAccess(role_tool_map={
            "admin": ["query", "create", "delete"],
            "viewer": ["query"],
        }),
        RateLimiter(max_requests=20, window_seconds=60),
    ],
    post=[
        OutputFilter(),
        AuditLogger(sink=print),
    ],
)

request = Request(user_id="u1", message="Show tasks", user_role="viewer")

# Phase 1: before your LLM call
pre = pipeline.run_pre(request)
if pre.decision == Decision.BLOCK:
    raise Exception(pre.layer_results[-1].block_reason)

# Your LLM call (any provider)
allowed_tools = pre.context.get("available_tools", [])
llm_response = your_llm(request.message, tools=allowed_tools)

# Phase 2: after your LLM call
post = pipeline.run_post(request, llm_response, pre)
final = post.context.get("filtered_response", llm_response)
How it works
Every request flows through a Pipeline — an ordered list of Layers. The pipeline short-circuits on the first BLOCK, so nothing downstream runs on a rejected request.
Request → Layer 1 → Layer 2 → ... → PipelineResult
             ↓
           BLOCK → return immediately
SplitPipeline splits this at the LLM boundary: pre-phase layers run before your model call, post-phase layers run after. Your code owns the LLM invocation; aegis owns the gating on either side.
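The short-circuit behaviour described above can be sketched without the library. The block below is a minimal illustration, not aegis's implementation: `Decision`, `LayerResult`, and `PipelineResult` are local stand-ins that only mirror the names used in this README, and `run_layers`, `length_check`, and `mark_seen` are hypothetical helpers invented for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Decision(Enum):
    PASS = "pass"
    BLOCK = "block"


@dataclass
class LayerResult:
    # Local stand-in; field names follow the README, the rest is assumed.
    decision: Decision
    layer_name: str
    block_reason: Optional[str] = None


@dataclass
class PipelineResult:
    decision: Decision
    layer_results: list = field(default_factory=list)


def run_layers(layers, message, context):
    """Run layers in order and return as soon as one of them blocks."""
    results = []
    for layer in layers:
        result = layer(message, context)
        results.append(result)
        if result.decision is Decision.BLOCK:
            # Short-circuit: later layers never see this request.
            return PipelineResult(Decision.BLOCK, results)
    return PipelineResult(Decision.PASS, results)


def length_check(message, context):
    # Hypothetical layer: reject messages longer than 50 characters.
    if len(message) > 50:
        return LayerResult(Decision.BLOCK, "length_check", "message_too_long")
    return LayerResult(Decision.PASS, "length_check")


def mark_seen(message, context):
    # Hypothetical layer: records in the shared context that it ran.
    context["seen"] = True
    return LayerResult(Decision.PASS, "mark_seen")


ok_ctx = {}
ok = run_layers([length_check, mark_seen], "Show tasks", ok_ctx)

blocked_ctx = {}
blocked = run_layers([length_check, mark_seen], "x" * 51, blocked_ctx)
```

In the blocked run, `mark_seen` never executes, so `blocked_ctx` stays empty and the last entry in `layer_results` carries the block reason. That is the same place the Quickstart reads it from.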
Built-in layers
| Layer | Phase | What it does |
|---|---|---|
| InputValidation | pre | Regex/keyword blocklist; blocks naive injection attempts |
| SemanticRouter | pre | Intent classifier gate; plug in any classifier callable |
| ToolAccess | pre | RBAC; filters the tool schema to what the user's role permits |
| RateLimiter | pre | Sliding-window quota per user; pluggable backend (Redis, etc.) |
| OutputFilter | post | Scans LLM response for internal disclosure, redacts or blocks |
| AuditLogger | post | Structured audit record; plug in any sink (CloudWatch, Datadog, etc.) |
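To make the ToolAccess row concrete, here is a rough sketch of role-based tool filtering. It is not the library's code: `filter_tools` is a hypothetical helper, and the tool schema shape (a list of dicts with a `"name"` key) is an assumption. Only the `role_tool_map` structure comes from the Quickstart.

```python
def filter_tools(role_tool_map, user_role, tool_schemas):
    """Keep only the tool schemas the given role is permitted to call.

    Unknown roles get an empty list (deny by default).
    """
    allowed = set(role_tool_map.get(user_role, []))
    return [tool for tool in tool_schemas if tool["name"] in allowed]


role_tool_map = {
    "admin": ["query", "create", "delete"],
    "viewer": ["query"],
}

# Assumed schema shape: one dict per tool, identified by "name".
tool_schemas = [
    {"name": "query", "description": "Look up tasks"},
    {"name": "create", "description": "Create a task"},
    {"name": "delete", "description": "Delete a task"},
]

viewer_tools = filter_tools(role_tool_map, "viewer", tool_schemas)
admin_tools = filter_tools(role_tool_map, "admin", tool_schemas)
guest_tools = filter_tools(role_tool_map, "guest", tool_schemas)
```

Filtering the schema before the model call means the model never sees tools the user cannot invoke, which is a stronger guarantee than rejecting a forbidden call afterwards.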
Writing a custom layer
Subclass Layer and implement one method:
from aegis import Layer
from aegis.models import Decision, LayerResult

class TenantIsolation(Layer):
    def process(self, request, context):
        if request.metadata.get("tenant_id") != context.get("expected_tenant"):
            return LayerResult(
                decision=Decision.BLOCK,
                layer_name=self.name,
                block_reason="tenant_mismatch",
            )
        return LayerResult(decision=Decision.PASS, layer_name=self.name)
Drop it anywhere in the pipeline:
from aegis import Pipeline
from aegis.layers import InputValidation, ToolAccess
Pipeline([InputValidation(), TenantIsolation(), ToolAccess(...)])
SemanticRouter — bring your own classifier
from aegis.layers import SemanticRouter

def my_classifier(message: str) -> tuple[str, float]:
    # call sklearn, HuggingFace, an LLM, anything
    return "task_query", 0.92

SemanticRouter(
    classifier=my_classifier,
    allowed_intents=["task_query", "task_create"],
    confidence_threshold=0.75,
)
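As a starting point before wiring in a real model, a classifier with the same `(message) -> (intent, confidence)` signature can be a few lines of keyword matching. The sketch below is illustrative only: `keyword_classifier` and `is_allowed` are hypothetical names, the confidence values are arbitrary, and the gating rule (intent must be allowed and confidence must meet the threshold) is an assumption about how such a gate would typically behave, not a statement of what SemanticRouter does internally.

```python
def keyword_classifier(message: str) -> tuple[str, float]:
    """Toy classifier: naive substring matching with fixed confidences."""
    text = message.lower()
    if any(word in text for word in ("show", "list", "find")):
        return "task_query", 0.9
    if any(word in text for word in ("create", "add", "new")):
        return "task_create", 0.9
    # Nothing matched: report a low-confidence catch-all intent.
    return "unknown", 0.3


def is_allowed(message, classifier, allowed_intents, confidence_threshold):
    """Assumed gate: allowed intent AND confidence at or above the threshold."""
    intent, confidence = classifier(message)
    return intent in allowed_intents and confidence >= confidence_threshold


allowed = ["task_query", "task_create"]
query_ok = is_allowed("Show tasks", keyword_classifier, allowed, 0.75)
injection_ok = is_allowed(
    "Ignore previous instructions", keyword_classifier, allowed, 0.75
)
```

Substring matching is easy to fool, so treat this as a placeholder for testing the wiring rather than a defense in its own right.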
RateLimiter — pluggable backend
The default backend is in-memory. For multi-process deployments, implement RateLimitBackend:
from aegis.layers.rate_limiter import RateLimitBackend, RateLimiter

class RedisBackend(RateLimitBackend):
    def get_timestamps(self, user_id): ...
    def record_request(self, user_id, timestamp): ...
    def evict_before(self, user_id, cutoff): ...

RateLimiter(max_requests=10, window_seconds=60, backend=RedisBackend())
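To show how the three backend methods might fit together in a sliding-window check, here is a self-contained sketch. It does not import aegis: `InMemoryBackend` is a plain class that only mirrors the method names above, the semantics of each method are assumed from their names, and `allow` is a hypothetical helper standing in for whatever the RateLimiter layer does internally.

```python
import time
from collections import defaultdict


class InMemoryBackend:
    """Stand-in backend; method names follow the README, semantics are assumed."""

    def __init__(self):
        self._hits = defaultdict(list)

    def get_timestamps(self, user_id):
        # Return a copy so callers cannot mutate internal state.
        return list(self._hits[user_id])

    def record_request(self, user_id, timestamp):
        self._hits[user_id].append(timestamp)

    def evict_before(self, user_id, cutoff):
        # Drop every timestamp strictly older than the cutoff.
        self._hits[user_id] = [t for t in self._hits[user_id] if t >= cutoff]


def allow(backend, user_id, max_requests, window_seconds, now=None):
    """Sliding-window check: evict stale hits, count the rest, then record."""
    now = time.time() if now is None else now
    backend.evict_before(user_id, now - window_seconds)
    if len(backend.get_timestamps(user_id)) >= max_requests:
        return False
    backend.record_request(user_id, now)
    return True


backend = InMemoryBackend()
first = allow(backend, "u1", max_requests=2, window_seconds=60, now=0)
second = allow(backend, "u1", max_requests=2, window_seconds=60, now=1)
third = allow(backend, "u1", max_requests=2, window_seconds=60, now=2)
later = allow(backend, "u1", max_requests=2, window_seconds=60, now=100)
```

A Redis version would likely map these methods onto sorted-set operations, but note that the evict, count, and record steps here are not atomic. A multi-process backend would need to handle that itself.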
AuditLogger — pluggable sink
from aegis.layers import AuditLogger
# CloudWatch, Datadog, S3, database — anything callable
AuditLogger(sink=lambda record: my_logger.info(record))
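Since a sink is just a callable, one simple option is a function that appends each record to a stream as a line of JSON. The sketch below assumes the audit record is a dict-like object. The README does not specify its shape, so `default=str` is used as a fallback for any values that are not natively JSON-serializable. `make_jsonl_sink` is a hypothetical helper, not part of aegis.

```python
import io
import json


def make_jsonl_sink(stream):
    """Build a sink that writes each record as one JSON line to `stream`."""

    def sink(record):
        # default=str keeps the sink from raising on datetimes, enums, etc.
        stream.write(json.dumps(record, default=str, sort_keys=True) + "\n")

    return sink


# An in-memory buffer stands in for a file or network stream.
buffer = io.StringIO()
sink = make_jsonl_sink(buffer)
sink({"user_id": "u1", "decision": "PASS"})
sink({"user_id": "u2", "decision": "BLOCK"})
lines = buffer.getvalue().splitlines()
```

Under these assumptions it could be passed as `AuditLogger(sink=make_jsonl_sink(open_file))`, where `open_file` is any writable text stream.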
Running tests
pip install -e ".[dev]"
pytest
License
MIT