Pluggable multi-layer LLM jailbreak defense pipeline

aegis-llm

A pluggable, multi-layer LLM jailbreak defense pipeline for Python. Drop it into any codebase — framework-agnostic, provider-agnostic, zero mandatory dependencies.

Install

pip install aegis-llm

Or from source:

git clone https://github.com/your-org/aegis-llm
cd aegis-llm
pip install -e .

Quickstart

from aegis import Decision, Request, SplitPipeline
from aegis.layers import (
    AuditLogger, InputValidation, OutputFilter,
    RateLimiter, ToolAccess,
)

pipeline = SplitPipeline(
    pre=[
        InputValidation(),
        ToolAccess(role_tool_map={
            "admin":  ["query", "create", "delete"],
            "viewer": ["query"],
        }),
        RateLimiter(max_requests=20, window_seconds=60),
    ],
    post=[
        OutputFilter(),
        AuditLogger(sink=print),
    ],
)

request = Request(user_id="u1", message="Show tasks", user_role="viewer")

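# Phase 1: run the pre-LLM layers
pre = pipeline.run_pre(request)
if pre.decision == Decision.BLOCK:
    # The pipeline stops at the first BLOCK, so the last result is the blocking layer
    raise RuntimeError(pre.layer_results[-1].block_reason)

# Your LLM call, with any provider (your_llm is a placeholder for your own client code)
allowed_tools = pre.context.get("available_tools", [])
llm_response = your_llm(request.message, tools=allowed_tools)

# Phase 2: run the post-LLM layers on the model output
post = pipeline.run_post(request, llm_response, pre)
# Fall back to the raw response if no layer stored a filtered one
final = post.context.get("filtered_response", llm_response)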

How it works

Every request flows through a Pipeline — an ordered list of Layers. The pipeline short-circuits on the first BLOCK, so nothing downstream runs on a rejected request.

Request → Layer 1 → Layer 2 → ... → PipelineResult
              ↓
           BLOCK → return immediately

SplitPipeline splits this at the LLM boundary: pre-phase layers run before your model call, post-phase layers run after. Your code owns the LLM invocation; aegis owns the gating on either side.
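
The short-circuit behaviour described above can be illustrated without the library. The following is a minimal sketch, not aegis's actual implementation: Verdict, StepResult, RunResult, run_layers, and the two example layers are hypothetical stand-ins invented for this illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Verdict(Enum):  # hypothetical stand-in for aegis.Decision
    PASS = "pass"
    BLOCK = "block"


@dataclass
class StepResult:  # hypothetical stand-in for a layer result
    verdict: Verdict
    layer_name: str
    block_reason: Optional[str] = None


@dataclass
class RunResult:  # hypothetical stand-in for PipelineResult
    verdict: Verdict
    results: list = field(default_factory=list)


def run_layers(layers, message: str, context: dict) -> RunResult:
    """Run layers in order and stop at the first BLOCK."""
    results = []
    for layer in layers:
        result = layer(message, context)
        results.append(result)
        if result.verdict is Verdict.BLOCK:
            # Short-circuit: later layers never see a rejected request.
            return RunResult(Verdict.BLOCK, results)
    return RunResult(Verdict.PASS, results)


def length_check(message, context):
    # Example layer: reject overly long messages.
    if len(message) > 200:
        return StepResult(Verdict.BLOCK, "length_check", "message_too_long")
    return StepResult(Verdict.PASS, "length_check")


def mark_seen(message, context):
    # Example layer: record in the shared context that it ran.
    context["seen"] = True
    return StepResult(Verdict.PASS, "mark_seen")
```

Calling run_layers([length_check, mark_seen], "x" * 500, {}) returns a BLOCK result with a single entry, and mark_seen never runs. A split pipeline can be pictured as two such runs, one on each side of the model call, sharing the same context.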

Built-in layers

Layer            Phase  What it does
InputValidation  pre    Regex/keyword blocklist; blocks naive injection attempts
SemanticRouter   pre    Intent classifier gate; plug in any classifier callable
ToolAccess       pre    RBAC; filters the tool schema to what the user's role permits
RateLimiter      pre    Sliding-window quota per user; pluggable backend (Redis, etc.)
OutputFilter     post   Scans LLM response for internal disclosure, redacts or blocks
AuditLogger      post   Structured audit record; plug in any sink (CloudWatch, Datadog, etc.)
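
To make the RBAC row concrete, here is a rough sketch of role-based tool filtering. It is an illustration only: filter_tools is a hypothetical helper, and the tool schema shape (a list of dicts with a "name" key) is an assumption, not the format ToolAccess necessarily expects.

```python
def filter_tools(tool_schemas, role, role_tool_map):
    """Return only the tool schemas that the given role may use."""
    # Unknown roles map to an empty allow-list (deny by default).
    allowed = set(role_tool_map.get(role, []))
    return [tool for tool in tool_schemas if tool["name"] in allowed]


# Assumed schema shape: one dict per tool, identified by "name".
ALL_TOOLS = [
    {"name": "query", "description": "Read tasks"},
    {"name": "create", "description": "Create a task"},
    {"name": "delete", "description": "Delete a task"},
]

ROLE_TOOL_MAP = {
    "admin": ["query", "create", "delete"],
    "viewer": ["query"],
}

viewer_tools = filter_tools(ALL_TOOLS, "viewer", ROLE_TOOL_MAP)
```

Filtering the schema before the model call means the model is never offered a tool the user's role cannot use, instead of relying on a refusal after the fact.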

Writing a custom layer

Subclass Layer and implement one method:

from aegis import Layer
from aegis.models import Decision, LayerResult

class TenantIsolation(Layer):
    def process(self, request, context):
        if request.metadata.get("tenant_id") != context.get("expected_tenant"):
            return LayerResult(
                decision=Decision.BLOCK,
                layer_name=self.name,
                block_reason="tenant_mismatch",
            )
        return LayerResult(decision=Decision.PASS, layer_name=self.name)

Drop it anywhere in the pipeline:

from aegis import Pipeline
from aegis.layers import InputValidation, ToolAccess

Pipeline([InputValidation(), TenantIsolation(), ToolAccess(...)])

SemanticRouter — bring your own classifier

from aegis.layers import SemanticRouter

def my_classifier(message: str) -> tuple[str, float]:
    # call sklearn, HuggingFace, an LLM, anything
    return "task_query", 0.92

SemanticRouter(
    classifier=my_classifier,
    allowed_intents=["task_query", "task_create"],
    confidence_threshold=0.75,
)
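
As an illustration of the classifier contract (a message in, an (intent, confidence) pair out), the sketch below uses a toy keyword matcher. INTENT_KEYWORDS, keyword_classifier, and is_allowed are hypothetical names, the confidence formula is arbitrary, and the gating rule (intent must be allowed and confidence must meet the threshold) is an assumption about how such a router would typically decide, not a description of SemanticRouter's internals.

```python
INTENT_KEYWORDS = {
    "task_query": ("show", "list", "find"),
    "task_create": ("create", "add", "new"),
}


def keyword_classifier(message: str) -> tuple[str, float]:
    """Toy classifier: pick the intent with the most keyword hits."""
    words = message.lower().split()
    best_intent, best_hits = "unknown", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        hits = sum(1 for word in words if word in keywords)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    if best_hits == 0:
        return "unknown", 0.0
    # Arbitrary illustrative score: more keyword hits, higher confidence.
    return best_intent, min(1.0, 0.6 + 0.2 * best_hits)


def is_allowed(message, classifier, allowed_intents, confidence_threshold):
    """Assumed gate: allowed intent AND confidence at or above the threshold."""
    intent, confidence = classifier(message)
    return intent in allowed_intents and confidence >= confidence_threshold
```

A real deployment would replace keyword_classifier with a trained model; only the return shape matters to the caller.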

RateLimiter — pluggable backend

The default backend is in-memory. For multi-process deployments, implement RateLimitBackend:

from aegis.layers.rate_limiter import RateLimitBackend, RateLimiter

class RedisBackend(RateLimitBackend):
    def get_timestamps(self, user_id): ...
    def record_request(self, user_id, timestamp): ...
    def evict_before(self, user_id, cutoff): ...

RateLimiter(max_requests=10, window_seconds=60, backend=RedisBackend())
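
To show how the three backend methods could fit together, here is a self-contained sketch of a sliding-window check over an in-memory store. It does not import aegis: InMemoryBackend and allow_request are hypothetical, and the semantics given to each method (timestamps as floats, eviction of entries older than the cutoff) are assumptions inferred from the method names.

```python
import time
from collections import defaultdict, deque


class InMemoryBackend:
    """Stand-in exposing the three method names listed above."""

    def __init__(self):
        self._hits = defaultdict(deque)  # user_id -> timestamps, oldest first

    def get_timestamps(self, user_id):
        return list(self._hits[user_id])

    def record_request(self, user_id, timestamp):
        self._hits[user_id].append(timestamp)

    def evict_before(self, user_id, cutoff):
        # Drop timestamps that fall outside the window.
        hits = self._hits[user_id]
        while hits and hits[0] < cutoff:
            hits.popleft()


def allow_request(backend, user_id, max_requests, window_seconds, now=None):
    """Sliding-window check: evict old hits, count, then record if allowed."""
    now = time.time() if now is None else now
    backend.evict_before(user_id, now - window_seconds)
    if len(backend.get_timestamps(user_id)) >= max_requests:
        return False
    backend.record_request(user_id, now)
    return True
```

A Redis-backed version would commonly keep one sorted set per user, scored by timestamp, so that eviction and counting become range operations on that set.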

AuditLogger — pluggable sink

from aegis.layers import AuditLogger

# CloudWatch, Datadog, S3, a database: any callable that accepts one record works.
# my_logger is a placeholder for your own logging.Logger instance.
AuditLogger(sink=my_logger.info)
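
A sink is just a callable, so one possible shape is a function that turns each record into a JSON line. This is a sketch under assumptions: the source does not specify the record type, so to_payload (a hypothetical helper) handles a dict, an object with attributes, or anything else via str. The logger name "aegis.audit" is likewise made up for the example.

```python
import json
import logging

audit_log = logging.getLogger("aegis.audit")  # hypothetical logger name
audit_log.setLevel(logging.INFO)


def to_payload(record):
    """Best-effort conversion of an audit record to a dict (type is assumed)."""
    if isinstance(record, dict):
        return record
    if hasattr(record, "__dict__"):
        return vars(record)
    return {"record": str(record)}


def json_sink(record):
    """Write one audit record as a single JSON line."""
    # default=str keeps serialization from failing on datetimes, enums, etc.
    audit_log.info(json.dumps(to_payload(record), default=str, sort_keys=True))
```

json_sink could then be passed as sink=json_sink. The logger still needs a handler attached (a file, stream, or vendor handler) before anything is written.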

Running tests

pip install -e ".[dev]"
pytest
