Arc Sentry — prompt injection detection for LLMs. DR=100%, FPR=0% on Mistral-7B. Geometric detection via Fisher manifold (Nine 2026).
Project description
Arc Sentry v3.0.0
Pre-generation prompt injection detection for open source LLMs.
Blocks attacks before model.generate() is called.
Benchmark — v3.0.0
| Metric | Result |
|---|---|
| Detection rate | 100% |
| False positive rate | 0% |
| Session requests | 450 |
| Latency | 42ms/req |
| Layer SNR (Mistral 7B) | 2.053 |
| FR separation | 0.0787 |
450-request session benchmark on Mistral-7B-Instruct-v0.2.
270 normal requests, 180 injection attempts (dense + subtle roleplay/hypothetical).
Zero false positives across all safe blocks.
Also validated: Garak promptinject suite 192/192 blocked, Crescendo flagged Turn 3 (LLM Guard: 0/8).
Install
pip install arc-sentry
Usage
from arc_sentry import ArcSentryV3, MistralAdapter
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model = AutoModelForCausalLM.from_pretrained(
"mistralai/Mistral-7B-Instruct-v0.2",
torch_dtype=torch.float16, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained('mistralai/Mistral-7B-Instruct-v0.2')
adapter = MistralAdapter(model, tokenizer)
sentry = ArcSentryV3(adapter, route_id="my-deployment")
sentry.calibrate(warmup_prompts) # ~100 prompts from your deployment
response, result = sentry.observe_and_block(user_prompt)
if result["blocked"]:
pass # model.generate() was never called
How it works
Three detection layers:
- Phrase check — 80+ injection patterns, zero latency
- Geometric detection — mean-pooled hidden states at optimal layer, Fisher-Rao distance from calibrated centroid. Catches injections with no explicit language.
- Session D(t) monitor — stability scalar over rolling request history. Catches gradual campaigns (Crescendo-style) invisible to single-request detection.
Grounded in the second-order Fisher manifold (H2 x H2, R = -4, tau* = sqrt(3/2) ~= 1.2247).
Full theory: bendexgeometry.com/theory
Detection mechanism
1. Mean-pool hidden states at layer L (validated: L=16 on Mistral-7B)
2. L2-normalize: h = h / ||h||
3. Fisher-Rao distance to warmup centroid
4. Distance > threshold -> BLOCK (phrase check runs in parallel)
model.generate() is never called
Also available
- Arc Vigil — training stability monitor. 100% detection, 0% FP, 90% auto-recovery.
pip install arc-vigil
Bendex Geometry LLC · Patent Pending · 2026 Hannah Nine
bendexgeometry.com · PyPI
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file arc_sentry-3.1.0.tar.gz.
File metadata
- Download URL: arc_sentry-3.1.0.tar.gz
- Upload date:
- Size: 32.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1d788714bb3e7049e9cddb86c378e279010dc672c0cbaa36097ce9d57bf7c482
|
|
| MD5 |
174b32fff46be0b28a4f914a03b24842
|
|
| BLAKE2b-256 |
7647f615fd4fd4b3432774c828bbf93d15bebcc2147b5dcc710a8f54dd29c967
|
File details
Details for the file arc_sentry-3.1.0-py3-none-any.whl.
File metadata
- Download URL: arc_sentry-3.1.0-py3-none-any.whl
- Upload date:
- Size: 21.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f84942fe02971ec759d9ec1257bada44668b27ffefafd8101ec390e159bbb907
|
|
| MD5 |
b1995425e3915cf998f5efe53ee23a4f
|
|
| BLAKE2b-256 |
06a16d03fcb286380f0dfbddb1b0d4f524efc8a2224843427facd599c1008ad5
|