Shared probe and scorer implementations for LLM degradation detection
NerfProbe Core
Shared probe and scorer implementations for scientifically-grounded LLM degradation detection.
Installation
pip install nerfprobe-core
Overview
nerfprobe-core provides the detection logic used by:
- nerfprobe - CLI tool for developers
- NerfStatus - Monitoring service
Probes
14 probes across 3 tiers, each grounded in peer-reviewed research:
Core Tier
| Probe | Detection | Research |
|---|---|---|
| MathProbe | Arithmetic reasoning degradation | 2504.04823 |
| StyleProbe | Vocabulary collapse (TTR) | 2403.06408 |
| TimingProbe | Latency fingerprinting | 2502.20589 |
| CodeProbe | Syntax collapse | 2512.08213 |
Advanced Tier
| Probe | Detection | Research |
|---|---|---|
| FingerprintProbe | Framework detection | 2407.15847 |
| ContextProbe | KV cache compression | 2512.12008 |
| RoutingProbe | Model routing detection | 2406.18665 |
| RepetitionProbe | Phrase looping | 2403.06408 |
| ConstraintProbe | Instruction adherence | 2409.11055 |
| LogicProbe | Reasoning drift | 2504.04823 |
| ChainOfThoughtProbe | CoT integrity | 2504.04823 |
Optional Tier
| Probe | Detection | Research |
|---|---|---|
| CalibrationProbe | Confidence calibration | 2511.07585 |
| ZeroPrintProbe | Mode collapse | 2407.01235 |
| MultilingualProbe | Cross-language asymmetry | EMNLP.935 |
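To make two of the detection signals in the tables concrete, here is a minimal sketch of a Type-Token Ratio (the "vocabulary collapse" signal) and an n-gram repetition measure (the "phrase looping" signal). It is written from the general definitions of these metrics, not taken from the nerfprobe-core source; the function names, the regex tokenizer, and the default n-gram size are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Unique tokens divided by total tokens; 0.0 for empty input."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Fraction of n-gram occurrences whose n-gram appears more than once."""
    tokens = tokenize(text)
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


healthy = "The quick brown fox jumps over the lazy dog near the quiet river bank."
looping = "I am sorry I am sorry I am sorry I am sorry I am sorry"

# A looping response has a low TTR and a high repeated n-gram fraction.
print(type_token_ratio(healthy), repeated_ngram_fraction(healthy))
print(type_token_ratio(looping), repeated_ngram_fraction(looping))
```

A low ratio combined with a high repetition fraction is the pattern a degraded, looping response would be expected to show; any pass/fail thresholds are left out here because the source does not state them.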
Scorers
10 scoring implementations:
- MathScorer - Expected answer matching
- TTRScorer - Type-Token Ratio calculation
- CodeScorer - Python syntax validation
- RepetitionScorer - N-gram repetition detection
- ConstraintScorer - Word count and forbidden word checks
- LogicScorer - Answer + reasoning validation
- ChainOfThoughtScorer - Step counting & circular detection
- CalibrationScorer - Confidence extraction
- EntropyScorer - Shannon entropy calculation
- MultilingualScorer - Cross-language consistency
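As a rough illustration of what three of these scorers compute, the sketch below implements Shannon entropy over tokens, Python syntax validation, and a word-count plus forbidden-word check using only the standard library. These are hypothetical stand-ins written from the one-line descriptions above; the function names and signatures are assumptions and are not the nerfprobe-core API.

```python
import ast
import math
import re
from collections import Counter


def shannon_entropy(tokens: list[str]) -> float:
    """Shannon entropy, in bits, of the empirical token distribution."""
    if not tokens:
        return 0.0
    total = len(tokens)
    # H = sum over distinct tokens of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in Counter(tokens).values())


def is_valid_python(source: str) -> bool:
    """True if the source parses as Python; the code is never executed."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        return False
    return True


def meets_constraints(text: str, max_words: int, forbidden: set[str]) -> bool:
    """True if the text stays within the word limit and uses no forbidden word."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(words) <= max_words and not forbidden.intersection(words)


print(shannon_entropy(["a", "b", "a", "b"]))
print(is_valid_python("def f(x):\n    return x + 1\n"))
print(is_valid_python("def f(x) return x"))
print(meets_constraints("The cat sat quietly.", max_words=5, forbidden={"dog"}))
```

Parsing with `ast.parse` checks syntax only, so it can flag a collapsed code response without running untrusted model output.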
Model Registry
Ships with entries for 10 SOTA models (as of Dec 2025), each with probe-relevant fields:
- context_window - For ContextProbe
- knowledge_cutoff - For TemporalProbe
from nerfprobe_core import get_model_info, RESEARCH_PROMPT
# Known model
info = get_model_info("gpt-5.2")
print(f"Context: {info.context_window:,}")
# Unknown model - get research prompt
prompt = RESEARCH_PROMPT.format(model_name="new-model", provider="provider")
Usage
from nerfprobe_core import ModelTarget
from nerfprobe_core.probes import MathProbe
from nerfprobe_core.probes.config import MathProbeConfig
# Configure probe
config = MathProbeConfig(
    prompt="What is 15 * 12 + 8 * 9?",
    expected_answer="252",
)
# Run probe. `await` requires an async context, and `gateway` must be
# supplied by the caller; it is not defined in this snippet.
target = ModelTarget(provider_id="openai", model_name="gpt-5.2")
probe = MathProbe(config)
result = await probe.run(target, gateway)
print(result.summary())  # math_probe: PASS (1.00) in 234ms
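The `expected_answer` in the example follows from 15 * 12 = 180 and 8 * 9 = 72, which sum to 252. How MathScorer matches a free-text response against that value is not documented here; the sketch below shows one plausible approach (compare the last number in the response), with `matches_expected` being a hypothetical helper and not part of nerfprobe-core.

```python
import re


def matches_expected(response: str, expected_answer: str) -> bool:
    """Compare the last number in the response with the expected answer.

    Assumption: the model states its final answer last. Commas are removed
    first so that thousands separators such as "1,252" parse as one number.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response.replace(",", ""))
    if not numbers:
        return False
    return float(numbers[-1]) == float(expected_answer)


# Sanity-check the arithmetic behind the example prompt.
assert 15 * 12 + 8 * 9 == 252

print(matches_expected("15 * 12 = 180 and 8 * 9 = 72, so the answer is 252.", "252"))
print(matches_expected("I think it is 250", "252"))
```

Taking the last number tolerates responses that show intermediate steps, at the cost of misreading answers followed by trailing numeric remarks.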
Dependencies
- pydantic>=2.0.0
- pyyaml>=6.0.0
License
Apache-2.0