Skip to main content

Arc Sentry — prompt injection detection for LLMs. DR=100%, FPR=0% on Mistral-7B. Geometric detection via Fisher manifold (Nine 2026).

Project description

Arc Sentry v3.0.0

Pre-generation prompt injection detection for open source LLMs.
Blocks attacks before model.generate() is called.

PyPI


Benchmark — v3.0.0

Metric Result
Detection rate 100%
False positive rate 0%
Session requests 450
Latency 42ms/req
Layer SNR (Mistral 7B) 2.053
FR separation 0.0787

450-request session benchmark on Mistral-7B-Instruct-v0.2.
270 normal requests, 180 injection attempts (dense + subtle roleplay/hypothetical).
Zero false positives across all safe blocks.

Also validated: Garak promptinject suite 192/192 blocked, Crescendo flagged Turn 3 (LLM Guard: 0/8).


Install

pip install arc-sentry

Usage

from arc_sentry import ArcSentryV3, MistralAdapter
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.float16, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained('mistralai/Mistral-7B-Instruct-v0.2')

adapter = MistralAdapter(model, tokenizer)
sentry = ArcSentryV3(adapter, route_id="my-deployment")
sentry.calibrate(warmup_prompts)  # ~100 prompts from your deployment

response, result = sentry.observe_and_block(user_prompt)
if result["blocked"]:
    pass  # model.generate() was never called

How it works

Three detection layers:

  1. Phrase check — 80+ injection patterns, zero latency
  2. Geometric detection — mean-pooled hidden states at optimal layer, Fisher-Rao distance from calibrated centroid. Catches injections with no explicit language.
  3. Session D(t) monitor — stability scalar over rolling request history. Catches gradual campaigns (Crescendo-style) invisible to single-request detection.

Grounded in the second-order Fisher manifold (H2 x H2, R = -4, tau* = sqrt(3/2) ~= 1.2247).
Full theory: bendexgeometry.com/theory

Detection mechanism

1. Mean-pool hidden states at layer L (validated: L=16 on Mistral-7B)
2. L2-normalize: h = h / ||h||
3. Fisher-Rao distance to warmup centroid
4. Distance > threshold -> BLOCK (phrase check runs in parallel)
   model.generate() is never called

Also available

  • Arc Vigil — training stability monitor. 100% detection, 0% FP, 90% auto-recovery.
    pip install arc-vigil

Bendex Geometry LLC · Patent Pending · 2026 Hannah Nine
bendexgeometry.com · PyPI

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arc_sentry-3.0.1.tar.gz (32.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

arc_sentry-3.0.1-py3-none-any.whl (40.3 kB view details)

Uploaded Python 3

File details

Details for the file arc_sentry-3.0.1.tar.gz.

File metadata

  • Download URL: arc_sentry-3.0.1.tar.gz
  • Upload date:
  • Size: 32.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for arc_sentry-3.0.1.tar.gz
Algorithm Hash digest
SHA256 ede0a59d3c4d7db72d98e788c6f08d0422dbc4aaeb543620d7e7d789b6f9636c
MD5 3abb0808cf0d005701240b9d48cbbea1
BLAKE2b-256 3637aab98ef0c14ee5870db5cfca20f794e85c161f405b3862cb821965eea3f8

See more details on using hashes here.

File details

Details for the file arc_sentry-3.0.1-py3-none-any.whl.

File metadata

  • Download URL: arc_sentry-3.0.1-py3-none-any.whl
  • Upload date:
  • Size: 40.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for arc_sentry-3.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 c16b2741e8b637b3dae91835309940569155c0684b42098fc15b531bd4917f53
MD5 c1e95002c8966449d4faf16bd0d2bf91
BLAKE2b-256 8bae44baa906750f91f50b665a05f64f8de39a63aaccd64aa8c8ad624f9ab164

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page