Automated failure pattern detection and root-cause analysis for ML workflows

etsi-failprint

Automated Diagnostics. Root Cause Analysis. Actionable Insights.

Don't just measure accuracy. Understand failure.

etsi-failprint is a diagnostic tool designed to answer the question: "Why is my model failing?" It automatically isolates failure patterns across Tabular, NLP, and Computer Vision workflows, generating human-readable reports that pinpoint the root cause of errors.

Features · Installation · Quickstart · Multi-Modal Analysis · Counterfactuals


Why Failprint?

Standard metrics (Accuracy, F1) tell you how often you fail. Failprint tells you why.

  • Multi-Modal Native: Seamlessly analyze failures in structured DataFrames, raw Text, or Image datasets using a unified API.
  • Automated Segmentation: Automatically discovers weak spots (e.g., "Model fails 80% of the time when Income < 50k").
  • Lazy & Lightweight: Heavy dependencies (PyTorch, spaCy, Transformers) are lazy-loaded. If you only analyze tabular data, you never pay the memory cost of deep learning libraries.
  • Robust: Built with graceful degradation. If an optional dependency is missing or incompatible, Failprint adapts instead of crashing.
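
The lazy-loading and graceful-degradation behavior described above can be illustrated with a minimal sketch. This is not Failprint's actual code: `lazy_import` and `available_backends` are hypothetical helper names, and the only assumption is that optional packages are imported at call time rather than at module import time.

```python
import importlib


def lazy_import(module_name):
    """Import a module only when it is first needed.

    Returns the module, or None if it is not installed, so the caller
    can fall back to a lighter code path instead of crashing.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def available_backends(module_names):
    """Report which optional dependencies can be imported (hypothetical helper)."""
    return {name: lazy_import(name) is not None for name in module_names}


# "json" ships with Python; the second name is deliberately nonexistent.
status = available_backends(["json", "surely_not_an_installed_module_xyz"])
print(status)
```

With this pattern, a tabular-only run never touches the deep learning imports, and a missing package becomes a `False` entry rather than an exception.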

Key Features

  • Smart Segmentation: Identifies feature ranges or categories where error rates are statistically anomalous.
  • Semantic Clustering:
    • NLP: Groups failed texts by semantic meaning using Sentence Transformers.
    • CV: Clusters failed images using ResNet embeddings to find visual patterns.
  • Meta-Feature Extraction:
    • Text: Analyzes failures by length, sentiment, subjectivity, and named entities.
    • Vision: Analyzes failures by brightness, contrast, aspect ratio, and dimensions.
  • Counterfactuals: Suggests minimal changes to input data that would flip a failure to a success.
  • Actionable Reporting: Outputs detailed Markdown reports with visual insights directly to your workspace.
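
To make the segmentation idea concrete, here is a rough sketch of comparing failure rates across two segments of one feature. It is an illustration under stated assumptions, not the library's algorithm: the function name `segment_failure_rates`, the fixed threshold, and the toy data are all made up for this example, and no statistical test is applied.

```python
from collections import defaultdict


def segment_failure_rates(rows, y_true, y_pred, feature, threshold):
    """Return the failure rate for rows below vs. at-or-above a threshold."""
    counts = defaultdict(lambda: [0, 0])  # segment label -> [failures, total]
    for row, truth, pred in zip(rows, y_true, y_pred):
        if row[feature] < threshold:
            segment = f"{feature} < {threshold}"
        else:
            segment = f"{feature} >= {threshold}"
        counts[segment][0] += int(truth != pred)  # a failure is a mismatch
        counts[segment][1] += 1
    return {seg: fails / total for seg, (fails, total) in counts.items()}


# Toy data, invented for illustration only.
rows = [
    {"Income": 30_000},
    {"Income": 42_000},
    {"Income": 48_000},
    {"Income": 75_000},
    {"Income": 90_000},
]
y_true = [1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0]

rates = segment_failure_rates(rows, y_true, y_pred, "Income", 50_000)
for segment, rate in rates.items():
    print(f"{segment}: {rate:.0%} failure rate")
```

A segment whose rate sits far above the overall failure rate is the kind of weak spot a segmentation report would surface.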

Installation

Prerequisites

  • Python (3.8 or later)

From PyPI

pip install etsi-failprint

From Source

git clone https://github.com/etsi-failprint/etsi-failprint.git
cd etsi-failprint
pip install -e .

Quickstart

1. Tabular Data Analysis

Identify which features are driving your model's mistakes.

import pandas as pd
from etsi.failprint import analyze

# Load your data
df = pd.read_csv("loan_predictions.csv")
X = df.drop("target", axis=1)
y_true = df["target"]
y_pred = pd.Series([0, 1, 0, ...]) # Your model's predictions

# Run analysis
report = analyze(
    X, y_true, y_pred,
    cluster=True,       # Group similar failures into clusters
    output="markdown"   # Generate 'failprint_report.md'
)

print(report)

Output Insight: "Segment Age < 25 contributes to 40% of all failures."

Multi-Modal Analysis

Failprint isn't just for spreadsheets. It understands unstructured data too.

2. NLP Analysis (Text)

Lazy-loads spacy and sentence-transformers to find semantic and structural failure patterns.

from etsi.failprint import analyze_nlp

texts = [
    "I love this product!", 
    "Terrible service, very slow.", 
    "Product is okay but arrived late."
]
y_true = [1, 0, 0] # Sentiment labels
y_pred = [1, 1, 0] # Model predictions (Error on index 1)

report = analyze_nlp(texts, y_true, y_pred)

Output Insight: "Failures are highly correlated with Sentiment Polarity < -0.5 and Word Count < 5."
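
As a rough idea of what a structural meta-feature looks like, the sketch below computes a whitespace word count for each misclassified text. It is a simplified stand-in written for this example (`failed_word_counts` is a hypothetical name) and does not use spaCy or reproduce Failprint's feature extraction.

```python
def failed_word_counts(texts, y_true, y_pred):
    """Pair each misclassified text with its whitespace-delimited word count."""
    return [
        (text, len(text.split()))
        for text, truth, pred in zip(texts, y_true, y_pred)
        if truth != pred
    ]


texts = [
    "I love this product!",
    "Terrible service, very slow.",
    "Product is okay but arrived late.",
]
y_true = [1, 0, 0]
y_pred = [1, 1, 0]

failures = failed_word_counts(texts, y_true, y_pred)
print(failures)  # [('Terrible service, very slow.', 4)]
```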

3. CV Analysis (Images)

Lazy-loads torch and torchvision to find visual failure clusters (e.g., "Dark images" or "Blurry dogs").

from etsi.failprint import analyze_cv

images = ["img1.jpg", "img2.jpg", "img3.jpg"]
y_true = [0, 1, 0]
y_pred = [0, 0, 0]

analyze_cv(images, y_true, y_pred)

Output Insight: "Cluster 0 (Dark Images) accounts for 60% of false negatives."
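
Image meta-features such as brightness and contrast can be approximated very simply. The sketch below assumes brightness is the mean grayscale intensity and contrast is the population standard deviation of intensities; these definitions, the function name, and the tiny pixel grid are assumptions for illustration, not necessarily what Failprint computes.

```python
from statistics import mean, pstdev


def brightness_and_contrast(gray_pixels):
    """Return (mean intensity, standard deviation) for a 2D grid of 0-255 values."""
    flat = [value for row in gray_pixels for value in row]
    return mean(flat), pstdev(flat)


# A tiny 2x2 "dark image", invented for this example.
dark_image = [
    [10, 20],
    [30, 20],
]

brightness, contrast = brightness_and_contrast(dark_image)
print(f"brightness={brightness}, contrast={contrast:.2f}")
```

Grouping failed images by features like these is one way a label such as "Dark Images" could be attached to a cluster.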


Counterfactuals

Go beyond diagnostics. Ask "What should have happened?" This mode suggests the minimal change to an input that would flip a failed prediction to a successful one.

from etsi.failprint import analyze

# Run in counterfactual mode
# (X, y_true, y_pred as defined in the tabular quickstart)
analyze(
    X, y_true, y_pred,
    output="counterfactuals"
)

Example Output:

Original Input: {'Age': 22, 'Income': 35000, 'Education': 'High School'}
Suggested Change: Education to "Bachelor's"
Prediction: Success (Counterfactual)
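
One simple way to think about counterfactual search is to try changing a single feature at a time and keep the changes that flip the prediction. The sketch below does exactly that against a toy classifier. Everything in it is hypothetical (`toy_model`, `single_feature_counterfactuals`, and the candidate values); it shows the general idea, not how Failprint generates its suggestions.

```python
def toy_model(row):
    """Hypothetical stand-in classifier: predicts 1 when at least two checks pass."""
    score = 0
    score += row["Income"] >= 50_000
    score += row["Education"] in {"Bachelor's", "Master's"}
    score += row["Age"] >= 21
    return int(score >= 2)


def single_feature_counterfactuals(row, predict, candidates, target):
    """Return (feature, new_value) pairs where changing one feature yields target."""
    found = []
    for feature, values in candidates.items():
        for value in values:
            if value == row[feature]:
                continue  # not a change
            changed = {**row, feature: value}
            if predict(changed) == target:
                found.append((feature, value))
    return found


row = {"Age": 22, "Income": 35_000, "Education": "High School"}
candidates = {
    "Education": ["High School", "Bachelor's"],
    "Income": [35_000, 60_000],
}

suggestions = single_feature_counterfactuals(row, toy_model, candidates, target=1)
print(suggestions)  # [('Education', "Bachelor's"), ('Income', 60000)]
```

Restricting the search to one feature at a time keeps each suggestion minimal in the sense of touching as little of the input as possible.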

Contributing

Pull requests are welcome!

Please refer to CONTRIBUTING.md and CODE_OF_CONDUCT.md before submitting a Pull Request.


Join the Community

Connect with the etsi.ai team and other contributors on our Discord.


License

This project is distributed under the BSD-2-Clause License. See the LICENSE file for details.


Built with ❤️ by etsi.ai
