Shared probe and scorer implementations for LLM degradation detection
NerfProbe Core
Scientifically grounded LLM degradation detection.
nerfprobe-core provides the essential detection logic, scorers, and probe definitions used by the NerfProbe CLI and NerfStatus. It is designed for developers who need rigorous, research-backed instruments to measure model quality, consistency, and alignment.
Installation
pip install nerfprobe-core
Features
- 17 Research-Backed Probes: Detection instruments grounded in academic papers on model collapse and quantization artifacts; 16 of the 17 cite a specific paper in the tables below (FactProbe lists none).
- 3-Tier Architecture: Organized into Core (Essential), Advanced (Structural), and Optional (Experimental) tiers.
- Universal Scoring: Reusable scorers (JSON schema validation, Type-Token Ratio, fact verification) that work independently of probe execution.
- Type-Safe: Fully typed for Python 3.11+, using Pydantic models for configuration and results.
Probes
Core Tier (Essential Signals)
| Probe | Detection Target | Paper |
|---|---|---|
| MathProbe | Arithmetic reasoning degradation | 2504.04823 |
| StyleProbe | Vocabulary collapse (Type-Token Ratio) | 2403.06408 |
| TimingProbe | Latency fingerprinting & time-to-first-token (TTFT) degradation | 2502.20589 |
| CodeProbe | Syntax collapse in generated code | 2512.08213 |
| FactProbe | Factual recall and hallucination checks | N/A |
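To make the Type-Token Ratio signal behind StyleProbe concrete, here is a minimal standalone sketch. It is not the library's scorer: the function name `type_token_ratio` and the simple regex tokenizer are assumptions chosen for illustration, and the real implementation may tokenize differently.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return unique tokens / total tokens, or 0.0 for text with no tokens.

    Hypothetical helper for illustration; not part of nerfprobe-core.
    """
    # Lowercase and keep runs of letters, digits, and apostrophes as tokens.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


varied = "The quick brown fox jumps over a lazy dog"
collapsed = "good good good good very good good good"

print(type_token_ratio(varied))     # 1.0 (9 distinct tokens out of 9)
print(type_token_ratio(collapsed))  # 0.25 (2 distinct tokens out of 8)
```

A falling ratio on comparable prompts suggests a shrinking vocabulary. Raw TTR is sensitive to text length, so comparisons are only meaningful between outputs of similar size.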
Advanced Tier (Structural Integrity)
| Probe | Detection Target | Paper |
|---|---|---|
| JsonProbe | JSON schema adherence & structure | 2402.16775 |
| ConsistencyProbe | Fact permanence & self-contradiction | 2504.04823 |
| FingerprintProbe | Underlying framework/model identity detection | 2407.15847 |
| ContextProbe | Key-Value cache compression artifacts | 2512.12008 |
| RoutingProbe | Mixture-of-Experts (MoE) routing path detection | 2406.18665 |
| RepetitionProbe | Loop detection & phrase repetition | 2403.06408 |
| ConstraintProbe | Negative constraint adherence | 2409.11055 |
| LogicProbe | Reasoning step validity | 2504.04823 |
| ChainOfThoughtProbe | CoT step integrity | 2504.04823 |
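As an illustration of the kind of signal a loop or phrase-repetition check can use, the sketch below measures how much of a response consists of n-grams that occur more than once. The name `repeated_ngram_fraction` and the whitespace tokenizer are assumptions; this is not claimed to be how RepetitionProbe works internally.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Fraction of n-gram occurrences whose n-gram appears more than once.

    Hypothetical helper for illustration; not part of nerfprobe-core.
    """
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Text shorter than n tokens cannot contain a repeated n-gram.
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(count for count in counts.values() if count > 1)
    return repeated / len(ngrams)


looping = "I am sorry I am sorry I am sorry"
clean = "the cat sat on the mat today"

print(repeated_ngram_fraction(looping))  # 1.0
print(repeated_ngram_fraction(clean))    # 0.0
```

Values near 1.0 indicate that the output is dominated by a repeating phrase; where to draw the failure threshold is a tuning decision this sketch leaves open.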
Optional Tier (Experimental)
| Probe | Detection Target | Paper |
|---|---|---|
| CalibrationProbe | Confidence score calibration | 2511.07585 |
| ZeroPrintProbe | Mode collapse via entropy measurement | 2407.01235 |
| MultilingualProbe | Cross-language performance asymmetry | EMNLP.935 |
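One simple way to put a number on mode collapse is the Shannon entropy of the distinct responses a model returns when the same prompt is sampled several times. The sketch below assumes exactly that setup; `response_entropy` is a hypothetical helper, and it is not claimed to match how ZeroPrintProbe measures entropy.

```python
import math
from collections import Counter


def response_entropy(responses: list[str]) -> float:
    """Shannon entropy in bits of the empirical distribution over responses.

    Hypothetical helper for illustration; not part of nerfprobe-core.
    """
    if not responses:
        return 0.0
    # Normalize lightly so trivial whitespace or case changes do not count as diversity.
    counts = Counter(r.strip().lower() for r in responses)
    total = len(responses)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


diverse = ["a", "b", "c", "d"]
collapsed = ["same answer"] * 4

print(response_entropy(diverse))    # 2.0
print(response_entropy(collapsed))  # 0.0
```

Entropy near zero across many samples at a non-zero temperature means the model keeps producing the same output, which is the pattern a mode-collapse check looks for.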
Usage
Basic Probe Execution
import asyncio

from nerfprobe_core import ModelTarget
from nerfprobe_core.probes import MathProbe
from nerfprobe_core.probes.config import MathProbeConfig

# 1. Configure the probe
config = MathProbeConfig(
    prompt="Calculate 15 * 12 + 8.",
    expected_answer="188",
)

# 2. Define the target
target = ModelTarget(provider_id="openai", model_name="gpt-4o")

# 3. Instantiate the probe
probe = MathProbe(config)

# 4. Run it from a coroutine; `gateway` is an LLM Gateway that you must supply.
# async def main():
#     result = await probe.run(target, gateway)
#     print(result.summary())
#     # > math_probe: PASS (1.00) in 234ms
# asyncio.run(main())
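The example above leaves open how a response is compared with `expected_answer`. As a rough, standalone sketch of one possible approach (an assumption, not MathProbe's actual scoring), the hypothetical function `score_math_answer` below takes the last number in the model's reply and compares it numerically with the expected value.

```python
import re


def score_math_answer(response: str, expected_answer: str) -> float:
    """Return 1.0 if the last number in the response equals the expected answer.

    Hypothetical helper for illustration; not part of nerfprobe-core.
    """
    # Drop thousands separators, then collect integers and decimals in order.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response.replace(",", ""))
    if not numbers:
        return 0.0
    return 1.0 if float(numbers[-1]) == float(expected_answer) else 0.0


print(score_math_answer("15 * 12 = 180, and 180 + 8 = 188.", "188"))  # 1.0
print(score_math_answer("The answer is 178.", "188"))                 # 0.0
```

Taking the last number tolerates replies that show intermediate steps, at the cost of misreading replies that end with an unrelated figure.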
Using Scorers Directly
You can use the scoring logic without the full probe infrastructure:
from nerfprobe_core.scorers import JsonScorer
scorer = JsonScorer(strict=True)
valid_json = '{"name": "NerfProbe"}'
invalid_json = '```json{"name": "NerfProbe"}```'
score, metadata = scorer.score(valid_json)
print(f"Score: {score}") # 1.0
score, metadata = scorer.score(invalid_json)
print(f"Score: {score}") # 0.0 (Strict mode rejects markdown blocks)
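To show why strict mode matters, the following standalone sketch contrasts strict and lenient handling of a reply wrapped in a markdown code fence, using only the standard library. The function `score_json` and its metadata keys are assumptions made for illustration; they are not the `JsonScorer` implementation, although the sketch mirrors its `(score, metadata)` return shape.

```python
import json
import re

# Build the fence marker programmatically so this example does not embed
# a literal triple backtick inside its own code block.
TICKS = "`" * 3
FENCE = re.compile(r"^" + TICKS + r"(?:json)?\s*(.*?)\s*" + TICKS + r"$", re.DOTALL)


def score_json(text: str, strict: bool = True) -> tuple[float, dict]:
    """Score 1.0 for parseable JSON, 0.0 otherwise (hypothetical helper)."""
    candidate = text.strip()
    fenced = FENCE.match(candidate)
    if fenced:
        if strict:
            # Strict mode treats any markdown wrapper as a failure.
            return 0.0, {"error": "wrapped in a markdown code fence"}
        # Lenient mode unwraps the fence and scores the inner payload.
        candidate = fenced.group(1)
    try:
        json.loads(candidate)
    except json.JSONDecodeError as exc:
        return 0.0, {"error": str(exc)}
    return 1.0, {"error": None}


wrapped = TICKS + 'json{"name": "NerfProbe"}' + TICKS

print(score_json('{"name": "NerfProbe"}')[0])   # 1.0
print(score_json(wrapped, strict=True)[0])      # 0.0
print(score_json(wrapped, strict=False)[0])     # 1.0
```

Strict scoring is the better fit when downstream code parses the raw response directly; lenient scoring separates formatting drift from genuinely malformed JSON.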
Contributing
We welcome contributions! Please see CONTRIBUTING.md for details on how to set up the development environment, run tests, and submit PRs.
License
Apache-2.0