Mathematical health monitor for LLMs — detect degradation and hallucination without NLP

LLM EKG

Is your AI getting dumber? Now you can prove it.


LLM EKG is a mathematical health monitor for Large Language Models. It analyzes LLM outputs as time series to detect degradation, hallucination, and behavioral drift — using pure mathematics, not NLP.

No embeddings. No tokenizers. No external AI. Just numpy.

RESULT: DEGRADED (74/100)
Trend: +0.1651
Hallucination risk: 22.96%
Mean persistence: 0.578

Why?

Every company runs LLMs in production. Nobody monitors their output quality mathematically.

  • GPT-4 getting lazier over time? LLM EKG detects it.
  • Claude hallucinating more after an update? LLM EKG catches it.
  • Your fine-tuned model degrading silently? LLM EKG raises the alarm.

The big labs will never build this — it exposes their problems. So we did.

Quick Start

pip install llm-ekg

One command

# Auto-detects format (ChatGPT, Claude, CSV, JSONL, plain text)
llm-ekg conversation.json

# Explicit format
llm-ekg --format chatgpt export.json -o report.html

Three lines of Python

from llm_ekg import LLMAnalyzer

analyzer = LLMAnalyzer()
for response in my_responses:
    result = analyzer.ingest(response["text"])

# Fetch the summary once, then print verdict and score in the same form as the sample output
summary = analyzer.get_summary()
print(f"{summary['verdict']} ({summary['global_score_100']}/100)")

Live Monitoring

Wrap your OpenAI or Anthropic client once. Your existing calls stay unchanged.

from llm_ekg import LiveMonitor

monitor = LiveMonitor()

# OpenAI
import openai
client = monitor.wrap_openai(openai.OpenAI())

# Use exactly as before — monitoring is automatic
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

# Check health anytime
print(f"Score: {monitor.score}/100 — {monitor.verdict}")

# Generate full HTML report
monitor.report("ekg.html")

Works with Anthropic too:

import anthropic
client = monitor.wrap_anthropic(anthropic.Anthropic())

What It Detects

| Signal | Meaning |
| --- | --- |
| Anomaly rising | Model quality is degrading |
| Drift spike | Sudden behavioral shift |
| Confidence mismatch | Hallucination (specific claims + zero hedging) |
| Assertion density up | Model becoming overconfident |
| Persistence > 0.5 | Degradation is trending, not random |
| Persistence < 0.5 | Model self-correcting |
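
The 0.5 cut-off for persistence follows the same convention as the Hurst exponent, where values above 0.5 indicate a trending series and values below 0.5 a mean-reverting one. The sketch below is an illustration of that idea only. It assumes a simple lagged-difference estimator, and the function name `persistence_exponent` and its `max_lag` parameter are hypothetical; the engine's actual persistence computation is not documented here.

```python
import numpy as np

def persistence_exponent(scores, max_lag=20):
    """Estimate a Hurst-style exponent H for a sequence of per-response scores."""
    scores = np.asarray(scores, dtype=float)
    # Integrate the mean-centred scores so uncorrelated noise behaves like a random walk.
    walk = np.cumsum(scores - scores.mean())
    lags = np.arange(2, max_lag + 1)
    # Spread of the walk's displacement at each lag; it scales roughly as lag**H.
    spread = np.array([np.std(walk[lag:] - walk[:-lag]) for lag in lags])
    slope, _intercept = np.polyfit(np.log(lags), np.log(spread), 1)
    return slope

rng = np.random.default_rng(0)
noise = rng.normal(size=2000)  # scores with no memory: H should land near 0.5

trending = np.empty(2000)      # AR(1) scores with positive memory: H above 0.5
trending[0] = noise[0]
for t in range(1, 2000):
    trending[t] = 0.8 * trending[t - 1] + noise[t]

h_noise = persistence_exponent(noise)
h_trending = persistence_exponent(trending)
```

On synthetic data like this, the memoryless series should score close to 0.5 and the autocorrelated one clearly higher, which is the distinction the table draws between random and trending degradation.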

How It Works

LLM EKG extracts 16 numerical features from each response — no NLP, no language models, no semantic analysis:

Degradation signals (0-11): response length, word count, vocabulary diversity, word length, sentence count, sentence length, punctuation density, hedge ratio, list usage, code ratio, repetition score, latency.

Hallucination signature (12-15): specificity score (concrete details density), confidence mismatch (specificity vs hedging gap), assertion density (certainty vs uncertainty ratio), self-consistency (internal contradiction score).
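
To make the feature list concrete, here is a minimal sketch of how a few of the degradation signals could be computed from raw text. It is not the library's implementation: the function name `extract_features`, the `HEDGE_WORDS` list, and the exact definitions (for example, vocabulary diversity as a type/token ratio) are assumptions chosen for illustration.

```python
import re
import numpy as np

# Hypothetical hedge vocabulary; the list LLM EKG actually uses is not documented here.
HEDGE_WORDS = {"may", "might", "could", "perhaps", "possibly", "likely", "seems"}

def extract_features(text: str) -> np.ndarray:
    """Return a small vector of surface statistics for one response."""
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = len(text)
    n_words = len(words)
    n_sentences = len(sentences)
    safe_words = max(n_words, 1)  # avoid division by zero on empty responses
    return np.array([
        n_chars,                                              # response length
        n_words,                                              # word count
        len(set(words)) / safe_words,                         # vocabulary diversity
        sum(len(w) for w in words) / safe_words,              # mean word length
        n_sentences,                                          # sentence count
        n_words / max(n_sentences, 1),                        # mean sentence length
        sum(c in ".,;:!?" for c in text) / max(n_chars, 1),   # punctuation density
        sum(w in HEDGE_WORDS for w in words) / safe_words,    # hedge ratio
    ], dtype=float)

features = extract_features("It might rain. It might not rain!")
```

Each response becomes one row of numbers, so a conversation turns into a multivariate time series that can be analyzed without any language model.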

These features feed into a proprietary behavioral state engine that computes anomaly scores, drift magnitude, and multi-scale persistence analysis.

All diagnostics are data-driven — zero hardcoded thresholds. Every metric is compared against its own distribution within the session.
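
One way to compare every metric against its own session distribution is a robust z-score built from the session median and median absolute deviation. The sketch below assumes that approach; the name `session_anomaly_scores` is hypothetical and the proprietary engine may score responses differently.

```python
import numpy as np

def session_anomaly_scores(feature_matrix: np.ndarray) -> np.ndarray:
    """Score each response by its distance from the session's own feature distribution.

    feature_matrix has shape (n_responses, n_features).
    """
    median = np.median(feature_matrix, axis=0)
    # Median absolute deviation: a spread estimate taken from the session itself.
    mad = np.median(np.abs(feature_matrix - median), axis=0)
    mad = np.where(mad == 0, 1.0, mad)  # constant features then contribute zero deviation
    robust_z = np.abs(feature_matrix - median) / mad
    return robust_z.mean(axis=1)

# Toy session: four similar responses, then one that is much longer and far less hedged.
session = np.array([
    [100, 0.50],
    [102, 0.52],
    [98, 0.48],
    [101, 0.51],
    [300, 0.10],
])
scores = session_anomaly_scores(session)
```

Because the reference point and the scale both come from the session, the same code adapts to terse models and verbose ones without a fixed cut-off.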

HTML Report

Self-contained HTML file. No JavaScript dependencies. Opens in any browser.

9 sections: Executive Summary, Hallucination Monitor, EKG Temporal, Behavioral Metrics (M0-M3), Drift Map, Multi-Scale Analysis, Trend Persistence, Feature Timeline, Diagnostic.

Run the demo

git clone https://github.com/iafiscal1212/llm-ekg.git
cd llm-ekg
pip install -e .
python demo.py
# Open demo_ekg_report.html

Supported Formats

| Format | Extension | Source |
| --- | --- | --- |
| ChatGPT | .json | Settings → Export data |
| Claude | .json | claude.ai export |
| API Log | .csv | CSV with response column |
| JSONL | .jsonl | One JSON per line |
| Plain Text | .txt | Blank-line separated |
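
If you need to prepare logs yourself, the two simplest formats can be read with the standard library. This is a sketch of the input shapes described in the table, not the package's own loader: the helper names are hypothetical, and the JSONL field name `response` is an assumption borrowed from the CSV column name.

```python
import json
import re

def split_plain_text(raw: str) -> list[str]:
    """Split a plain-text log into responses separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", raw.strip())
    return [block.strip() for block in blocks if block.strip()]

def parse_jsonl(raw: str, key: str = "response") -> list[str]:
    """Read one JSON object per line and return the text stored under `key`."""
    return [json.loads(line)[key] for line in raw.splitlines() if line.strip()]

plain = "First answer.\nStill first.\n\nSecond answer.\n\n\nThird answer.\n"
jsonl = '{"response": "First answer."}\n{"response": "Second answer."}\n'

plain_responses = split_plain_text(plain)
jsonl_responses = parse_jsonl(jsonl)
```

Either list can then be fed to the analyzer one response at a time.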

Dependencies

numpy + matplotlib. That's it.

Optional: openai and/or anthropic for live monitoring.

pip install llm-ekg[openai]     # OpenAI wrapper
pip install llm-ekg[anthropic]  # Anthropic wrapper
pip install llm-ekg[all]        # Both

License

Business Source License 1.1, the license created by MariaDB and later used by Sentry, among others.

  • Always free: personal use, internal monitoring, research, education, open source
  • Commercial license required: if you sell LLM monitoring as a service
  • Converts to Apache 2.0: March 28, 2030

Contact: carmen@iafiscal.es

Cite

If you use LLM EKG in your research, please cite:

@software{esteban2026llmekg,
  author    = {Esteban, Carmen},
  title     = {LLM EKG: A Mathematical Health Monitor for Large Language Models},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.19284461},
  url       = {https://doi.org/10.5281/zenodo.19284461}
}

Author

Carmen Esteban, IAFISCAL & PARTNERS
