Monitor, diagnose, and auto-correct LLM failures — with XGBoost failure classification, question-type routing, auto-calibrating thresholds, Wikidata/Serper ground truth, and production analytics

Failure Intelligence Engine (FIE)

Real-time LLM failure detection, diagnosis, and automatic correction.

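FIE sits between your LLM and your users. In `correct` mode, when the model gives a wrong answer, FIE catches it, looks up the correct answer from a trusted source, and returns the correction before the user sees the mistake. In `monitor` mode, the original answer is returned immediately and the check runs in the background.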

Quickstart — Use the SDK

pip install fie-sdk

from fie import monitor

@monitor(
    fie_url="https://failure-intelligence-system-800748790940.asia-south1.run.app",
    api_key="your-api-key",
    mode="correct",   # or "monitor"
)
def ask_ai(prompt: str) -> str:
    # your_llm_call is a placeholder for your own LLM client call
    return your_llm_call(prompt)

response = ask_ai("Who invented the telephone?")
# Returns the corrected answer if FIE detects a failure, otherwise the original answer

SDK Modes

Mode	Behavior
`local`	No server needed — rule-based heuristics run on your machine instantly
`monitor`	Non-blocking — FIE checks in background, original answer returned immediately
`correct`	Synchronous — FIE verifies and returns corrected answer if failure detected

# Try it instantly with no server or API key
from fie import monitor

@monitor(mode="local")
def ask_ai(prompt: str) -> str:
    # your_llm_call is a placeholder for your own LLM client call
    return your_llm_call(prompt)
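
To make the difference between the blocking and non-blocking modes concrete, here is a minimal sketch of a decorator with the same two behaviors. It is not the fie-sdk implementation: `sketch_monitor`, `check_answer`, and the `KNOWN` lookup table are hypothetical names invented for illustration, and the real SDK's internals may differ.

```python
import threading
from functools import wraps
from typing import Callable, Optional

# Hypothetical lookup table standing in for a real verification backend.
KNOWN = {"Who invented the telephone?": "Alexander Graham Bell"}


def check_answer(prompt: str, answer: str) -> Optional[str]:
    """Return a correction, or None when the answer looks fine (toy logic)."""
    expected = KNOWN.get(prompt)
    if expected is not None and expected.lower() not in answer.lower():
        return expected
    return None


def sketch_monitor(mode: str = "monitor") -> Callable:
    """Illustrative decorator: 'correct' blocks, 'monitor' does not."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        @wraps(fn)
        def wrapper(prompt: str) -> str:
            answer = fn(prompt)
            if mode == "correct":
                # Synchronous: wait for the check and swap in the correction.
                corrected = check_answer(prompt, answer)
                return corrected if corrected is not None else answer
            # Non-blocking: start the check in the background and
            # return the original answer right away.
            threading.Thread(
                target=check_answer, args=(prompt, answer), daemon=True
            ).start()
            return answer
        return wrapper
    return decorator


@sketch_monitor(mode="correct")
def ask_corrected(prompt: str) -> str:
    return "Thomas Edison invented the telephone."


@sketch_monitor(mode="monitor")
def ask_monitored(prompt: str) -> str:
    return "Thomas Edison invented the telephone."
```

In this sketch, `ask_corrected` returns the replacement answer while `ask_monitored` returns the model's original text unchanged, which mirrors the trade-off in the table: `correct` adds latency in exchange for a fixed answer, `monitor` adds none.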

Get an API Key

Sign in at https://failure-intelligence-system.pages.dev
Your API key is shown in the dashboard after login

How It Works

Your LLM answer → FIE
                   ├── Shadow ensemble (3 independent models cross-check)
                   ├── Failure Signal Vector (agreement, entropy, outlier detection)
                   ├── Diagnostic Jury (3 agents vote on root cause)
                   ├── Ground Truth Pipeline (Wikidata → Google Search → consensus)
                   └── Fix Engine (returns corrected answer or escalates)
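
The Failure Signal Vector step can be illustrated with a small sketch that turns a primary answer and a set of shadow answers into agreement, entropy, and outlier signals. This is an assumption-heavy simplification: `signal_vector` and `normalize` are hypothetical names, and answers are compared by exact match after whitespace and case normalization, which is almost certainly cruder than what FIE does.

```python
import math
from collections import Counter
from typing import Dict, List


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(answer.lower().split())


def signal_vector(primary: str, shadows: List[str]) -> Dict[str, float]:
    """Compute toy agreement, entropy, and outlier signals."""
    if not shadows:
        raise ValueError("at least one shadow answer is required")
    answers = [normalize(a) for a in [primary] + shadows]
    counts = Counter(answers)
    total = len(answers)

    # Agreement: fraction of shadow answers that match the primary answer.
    agreement = sum(1 for a in answers[1:] if a == answers[0]) / len(shadows)

    # Shannon entropy (in bits) of the distribution over distinct answers.
    entropy = sum(
        (c / total) * math.log2(total / c) for c in counts.values()
    )

    # Outlier: the primary answer stands alone while others share an answer.
    majority_count = counts.most_common(1)[0][1]
    outlier = float(counts[answers[0]] == 1 and majority_count > 1)

    return {"agreement": agreement, "entropy": entropy, "outlier": outlier}
```

For example, a primary answer of "Edison" against three shadow answers of "Bell" gives zero agreement, non-zero entropy, and an outlier flag of 1.0, which is the kind of pattern a downstream classifier could treat as a likely failure.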

Classifier: XGBoost v3 (AUC 0.728) backed by a 5-type question router. Factual questions go through full external verification; code/opinion questions skip it to avoid false positives.
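
The routing idea can be sketched as a classifier followed by a verification gate. Everything here is hypothetical: the keyword patterns, the function names, and the `temporal` label (inferred from the Serper note in the environment variables) are illustrative assumptions, and the fifth question type is not named in this document, so it is left out.

```python
import re

# Hypothetical keyword heuristics; a real router would be more robust.
CODE_HINTS = re.compile(
    r"\b(def|class|function|python|javascript|traceback|regex|compile)\b", re.I
)
OPINION_HINTS = re.compile(
    r"\b(should i|do you think|best|favorite|opinion|better)\b", re.I
)
TEMPORAL_HINTS = re.compile(
    r"\b(current|latest|today|this year|now|recent)\b", re.I
)

# Assumption: only these types go through external ground-truth lookup.
VERIFIED_TYPES = {"factual", "temporal"}


def classify_question(prompt: str) -> str:
    """Assign a coarse question type using keyword matching."""
    if CODE_HINTS.search(prompt):
        return "code"
    if OPINION_HINTS.search(prompt):
        return "opinion"
    if TEMPORAL_HINTS.search(prompt):
        return "temporal"
    return "factual"


def needs_external_verification(question_type: str) -> bool:
    """Code and opinion questions skip verification to avoid false positives."""
    return question_type in VERIFIED_TYPES
```

Skipping verification for code and opinion questions matters because there is no single ground-truth string to compare against, so a lookup-based check would flag many acceptable answers.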

Self-Hosting

Requirements

Python 3.11+
MongoDB Atlas (free tier works)
Groq API key — free at console.groq.com
Node.js 18+ (dashboard only)

1. Clone & Install

git clone https://github.com/AyushSingh110/Failure_Intelligence_System.git
cd Failure_Intelligence_System
python -m venv .venv
source .venv/bin/activate        # macOS/Linux
# .venv\Scripts\activate         # Windows
pip install -r requirements.txt

2. Environment Variables

Create .env in the project root:

MONGODB_URI=mongodb+srv://user:pass@cluster.mongodb.net/?retryWrites=true&w=majority
MONGODB_DB_NAME=fie_database

GROQ_API_KEY=gsk_your_groq_key
GROQ_ENABLED=true
GROQ_MODELS=["llama-3.3-70b-versatile","deepseek-r1-distill-llama-70b","qwen-qwq-32b"]

SERPER_API_KEY=your_serper_key     # optional — needed for temporal questions
SERPER_ENABLED=true

OLLAMA_ENABLED=false

GOOGLE_CLIENT_ID=your-google-oauth-client-id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your-google-oauth-client-secret
GOOGLE_REDIRECT_URI=http://localhost:5173

JWT_SECRET_KEY=replace-with-a-long-random-secret-minimum-32-chars
JWT_ALGORITHM=HS256
JWT_EXPIRE_HOURS=24
ADMIN_EMAIL=your@email.com
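
A few of these values need parsing or validation before use: `GROQ_MODELS` is a JSON list, the `*_ENABLED` flags are booleans, and `JWT_SECRET_KEY` should be at least 32 characters. The sketch below shows one way to check them with the standard library only. `Settings`, `parse_bool`, and `load_settings` are hypothetical names, and the project's own configuration loader may work differently.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    mongodb_uri: str
    groq_models: List[str]
    serper_enabled: bool
    jwt_secret_key: str
    jwt_expire_hours: int


def parse_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and validate a subset of the variables listed above."""
    secret = env["JWT_SECRET_KEY"]
    if len(secret) < 32:
        raise ValueError("JWT_SECRET_KEY must be at least 32 characters")

    # GROQ_MODELS is written as a JSON array in the .env file.
    models = json.loads(env.get("GROQ_MODELS", "[]"))
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise ValueError("GROQ_MODELS must be a JSON list of strings")

    return Settings(
        mongodb_uri=env["MONGODB_URI"],
        groq_models=models,
        serper_enabled=parse_bool(env.get("SERPER_ENABLED", "false")),
        jwt_secret_key=secret,
        jwt_expire_hours=int(env.get("JWT_EXPIRE_HOURS", "24")),
    )
```

Failing fast on a short JWT secret or a malformed model list surfaces configuration mistakes at startup instead of at the first request.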

3. Start Server

uvicorn app.main:app --reload
# Backend: http://localhost:8000
# API docs: http://localhost:8000/docs

4. Dashboard (optional)

cd Frontend
npm install
npm run dev
# Dashboard: http://localhost:5173

API Endpoints

Method	Path	Description
`POST`	`/api/v1/monitor`	Main endpoint — full detection + correction pipeline
`POST`	`/api/v1/diagnose`	Run diagnostic jury only
`POST`	`/api/v1/analyze`	Signal extraction only (no jury, no GT)
`POST`	`/api/v1/feedback/{id}`	Submit human feedback on an inference
`GET`	`/api/v1/monitor/model-info`	Active model version, thresholds, AUC
`GET`	`/api/v1/analytics/usage`	Request volume, failure rate, daily breakdown
`GET`	`/api/v1/analytics/model-performance`	XGBoost accuracy, per-question-type stats
`GET`	`/api/v1/analytics/calibration`	Confidence calibration curves + ECE score
`GET`	`/api/v1/analytics/question-breakdown`	Failure/fix/escalation rate per question type
`GET`	`/api/v1/analytics/paper-metrics`	All benchmark metrics in one call
`GET`	`/api/v1/analytics/sdk-telemetry`	Usage data from opted-in SDK users
`GET`	`/health`	Health check
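
The calibration endpoint reports an ECE (expected calibration error) score. For readers unfamiliar with the metric, the sketch below computes the standard equal-width-bin version: samples are grouped by confidence, and ECE is the sample-weighted average gap between mean confidence and accuracy in each bin. Whether FIE uses this exact binning is an assumption, and `expected_calibration_error` is a hypothetical helper, not part of the API.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE for confidences in [0, 1]."""
    if len(confidences) != len(correct) or len(confidences) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece
```

A score near zero means the reported confidence tracks the observed accuracy; a model that says 0.9 but is right only 75% of the time contributes a gap of 0.15 for that bin.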

Example Request

curl -X POST http://localhost:8000/api/v1/monitor \
  -H "Content-Type: application/json" \
  -H "X-API-Key: fie-your-key" \
  -d '{
    "prompt": "Who invented the telephone?",
    "primary_output": "Thomas Edison invented the telephone.",
    "primary_model_name": "gpt-4",
    "run_full_jury": true
  }'
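
The same request can be sent from Python using only the standard library. This is a sketch under assumptions: `build_monitor_request` and `send` are hypothetical helper names, and the response is assumed to be a JSON body, whose fields are not documented here.

```python
import json
import urllib.request


def build_monitor_request(
    base_url: str,
    api_key: str,
    prompt: str,
    primary_output: str,
    primary_model_name: str,
    run_full_jury: bool = True,
) -> urllib.request.Request:
    """Build the POST /api/v1/monitor request shown in the curl example."""
    payload = {
        "prompt": prompt,
        "primary_output": primary_output,
        "primary_model_name": primary_model_name,
        "run_full_jury": run_full_jury,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/monitor",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request; assumes the server replies with a JSON object."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a local server running, `send(build_monitor_request("http://localhost:8000", "fie-your-key", "Who invented the telephone?", "Thomas Edison invented the telephone.", "gpt-4"))` performs the call. Separating construction from sending lets the request be inspected or tested without a live server.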

Running Tests

# Offline unit tests — no server, no API key needed (28 tests)
pytest tests/test_core.py -v

# Covers: question classifier, XGBoost fallback, per-type thresholds,
#         SDK local predictor, entropy detector, SDK config

Opt-In Telemetry (SDK Users)

To share anonymized usage data (no prompts, no API keys):

FIE_TELEMETRY=true python your_app.py

This sends: SDK version, question type, failure detection rate, mode. Nothing else.
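
An opt-in gate of this kind can be sketched as an environment check plus an allow-list of fields, so that prompts and keys cannot leak even if they are present in the event. The field names and helper functions below are hypothetical; only the `FIE_TELEMETRY` variable and the four categories of data come from this document.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical field names for the four documented categories.
ALLOWED_FIELDS = ("sdk_version", "question_type", "failure_detection_rate", "mode")


def telemetry_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Telemetry is off unless FIE_TELEMETRY is explicitly set to true."""
    return env.get("FIE_TELEMETRY", "").strip().lower() == "true"


def build_telemetry_payload(
    event: Mapping[str, object],
    env: Mapping[str, str] = os.environ,
) -> Optional[Dict[str, object]]:
    """Return only allow-listed fields, or None when telemetry is disabled."""
    if not telemetry_enabled(env):
        return None
    return {key: event[key] for key in ALLOWED_FIELDS if key in event}
```

Using an allow-list instead of a deny-list means any new field added to an event stays private by default.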

Benchmark Results

Evaluated on TruthfulQA (817 adversarial questions).

Method	Recall	FPR	F1	AUC-ROC
POET rule-based (baseline)	56.4%	38.7%	58.7%	—
XGBoost v2	71.6%	53.9%	63.5%	0.728
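
For reference, the Recall, FPR, and F1 columns follow the usual confusion-matrix definitions, where a "positive" is an LLM answer flagged as a failure. The sketch below computes them from raw counts. The counts in the usage note are made up for illustration and are not the benchmark's actual confusion matrix.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Recall, false positive rate, precision, and F1 from confusion counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"recall": recall, "fpr": fpr, "precision": precision, "f1": f1}
```

With illustrative counts of 70 true positives, 50 false positives, 50 true negatives, and 30 false negatives, `classification_metrics(70, 50, 50, 30)` gives a recall of 0.70, an FPR of 0.50, and an F1 of about 0.636. This shows how a detector can gain recall while also flagging more correct answers, as in the table above.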

Required Services

Service	Required	Free Tier
Groq	Yes	14,400 req/day
MongoDB Atlas	Yes	512 MB
Wikidata	Yes	No key needed
Serper.dev	Optional	2,500 searches/month

License

Apache-2.0

Release history

Version	Release date
1.11.0	Jun 2, 2026
1.10.1	May 30, 2026
1.10.0	May 28, 2026
1.9.0	May 27, 2026
1.8.0	May 26, 2026
1.7.0	May 26, 2026
1.6.0	May 24, 2026
1.5.1	May 18, 2026
1.4.1	May 6, 2026
1.4.0	May 5, 2026
1.3.0	May 4, 2026
1.2.0 (this version)	Apr 30, 2026
1.1.0	Apr 29, 2026
0.3.0	Apr 8, 2026
0.2.0	Mar 27, 2026
0.1.0	Mar 20, 2026

Download files

Source Distribution

fie_sdk-1.2.0.tar.gz (36.1 MB), uploaded Apr 30, 2026, Source

Built Distribution

fie_sdk-1.2.0-py3-none-any.whl (16.5 kB), uploaded Apr 30, 2026, Python 3

File details

Details for the file fie_sdk-1.2.0.tar.gz.

File metadata

Download URL: fie_sdk-1.2.0.tar.gz
Upload date: Apr 30, 2026
Size: 36.1 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for fie_sdk-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`a65c8a7198c0d00f42bb6c7a69df695d8adab53436d84b560a704ef52c41316c`
MD5	`681d9264e043526f4916a81539303fa4`
BLAKE2b-256	`7ec9a0e8ceedd0a9a74f90ec36f7c61147371a444fb3d54d261f489ff098c3a3`

File details

Details for the file fie_sdk-1.2.0-py3-none-any.whl.

File metadata

Download URL: fie_sdk-1.2.0-py3-none-any.whl
Upload date: Apr 30, 2026
Size: 16.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for fie_sdk-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ce8ae22b420ac87d8de403dc32009869b1129572bf2080470852644433808197`
MD5	`a72f156ff538a046c446a10ff220e47a`
BLAKE2b-256	`f9a6b4ff341640832d70b1f4ec10460eb4a0c672acff45f6bf44e3b1a817bd38`
