Automated failure pattern detection and root-cause analysis for ML workflows
etsi-failprint
Automated Diagnostics. Root Cause Analysis. Actionable Insights.
Don't just measure accuracy. Understand failure.
etsi-failprint is a diagnostic tool designed to answer the question: "Why is my model failing?" It automatically isolates failure patterns across Tabular, NLP, and Computer Vision workflows, generating human-readable reports that pinpoint the root cause of errors.
Why Failprint?
Standard metrics (Accuracy, F1) tell you how often you fail. Failprint tells you why.
- Multi-Modal Native: Analyze failures in structured DataFrames, raw text, or image datasets through parallel entry points (analyze, analyze_nlp, analyze_cv).
- Automated Segmentation: Automatically discovers weak spots (e.g., "Model fails 80% of the time when Income < 50k").
- Lazy & Lightweight: Heavy dependencies (Torch, SpaCy, Transformers) are lazy-loaded. If you only analyze tabular data, you never pay the memory cost of deep learning libraries.
- Robust: Built with graceful degradation. If an optional dependency is missing or incompatible, Failprint adapts instead of crashing.
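The lazy-loading and graceful-degradation behavior described above can be illustrated with a minimal sketch. This is not Failprint's internal code: `lazy_import` and `describe_text` are hypothetical helpers, and the only assumption is that an optional dependency is imported the first time a feature needs it rather than at package import time.

```python
import importlib
from typing import Any, Optional


def lazy_import(module_name: str) -> Optional[Any]:
    """Import a module on first use; return None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def describe_text(text: str, nlp_module: str = "spacy") -> dict:
    """Hypothetical helper: cheap features always, richer ones only if available."""
    features = {"word_count": len(text.split())}
    # The heavy dependency is only imported when this function actually runs.
    nlp = lazy_import(nlp_module)
    # Degrade gracefully: record that entity features are unavailable
    # instead of raising when the optional dependency is missing.
    features["entities_available"] = nlp is not None
    return features
```

Code that only touches tabular data never calls `describe_text`, so the optional module is never loaded in that case.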
Key Features
- Smart Segmentation: Identifies feature ranges or categories where error rates are statistically anomalous.
- Semantic Clustering:
- NLP: Groups failed texts by semantic meaning using Sentence Transformers.
- CV: Clusters failed images using ResNet embeddings to find visual patterns.
- Meta-Feature Extraction:
- Text: Analyzes failures by length, sentiment, subjectivity, and NER entities.
- Vision: Analyzes failures by brightness, contrast, aspect ratio, and dimensions.
- Counterfactuals: Suggests minimal changes to input data that would flip a failure to a success.
- Actionable Reporting: Outputs detailed Markdown reports with visual insights directly to your workspace.
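To make the idea behind Smart Segmentation concrete, here is a rough sketch that flags segments whose error rate is unusually high relative to the overall error rate. The z-score test, the `segment_error_rates` function, and the sample data are illustrative assumptions, not the statistic or API that Failprint uses.

```python
from collections import defaultdict
from math import sqrt


def segment_error_rates(segments, y_true, y_pred, z_threshold=2.0):
    """Return segments whose error rate is unusually high versus the overall rate."""
    errors = [int(t != p) for t, p in zip(y_true, y_pred)]
    overall = sum(errors) / len(errors)

    counts = defaultdict(lambda: [0, 0])  # segment -> [n_rows, n_errors]
    for seg, err in zip(segments, errors):
        counts[seg][0] += 1
        counts[seg][1] += err

    flagged = {}
    for seg, (n, k) in counts.items():
        rate = k / n
        # Standard error of the overall rate for a sample of size n.
        se = sqrt(overall * (1 - overall) / n)
        z = (rate - overall) / se if se > 0 else 0.0
        if z >= z_threshold:
            flagged[seg] = {"n": n, "error_rate": rate, "z": round(z, 2)}
    return flagged


# Made-up data: 10 low-income rows (8 errors) and 30 other rows (2 errors).
segments = ["Income < 50k"] * 10 + ["Income >= 50k"] * 30
y_true = [1] * 40
y_pred = [0] * 8 + [1] * 2 + [0] * 2 + [1] * 28

flagged = segment_error_rates(segments, y_true, y_pred)
```

With this sample, only the "Income < 50k" segment is flagged, with an error rate of 0.8 against an overall rate of 0.25.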
Installation
Prerequisites
- Python (3.8 or later)
From PyPI
pip install etsi-failprint
From Source
git clone https://github.com/etsi-failprint/etsi-failprint.git
cd etsi-failprint
pip install -e .
Quickstart
1. Tabular Data Analysis
Identify which features are driving your model's mistakes.
import pandas as pd
from etsi.failprint import analyze
# Load your data
df = pd.read_csv("loan_predictions.csv")
X = df.drop("target", axis=1)
y_true = df["target"]
y_pred = pd.Series([0, 1, 0, ...]) # Your model's predictions
# Run analysis
report = analyze(
    X, y_true, y_pred,
    cluster=True,       # Cluster similar failures?
    output="markdown"   # Generate 'failprint_report.md'
)
print(report)
Output Insight: "Segment Age < 25 contributes to 40% of all failures."
Multi-Modal Analysis
Failprint isn't just for spreadsheets. It understands unstructured data too.
2. NLP Analysis (Text)
Lazy-loads spacy and sentence-transformers to find semantic and structural failure patterns.
from etsi.failprint import analyze_nlp
texts = [
    "I love this product!",
    "Terrible service, very slow.",
    "Product is okay but arrived late."
]
y_true = [1, 0, 0] # Sentiment labels
y_pred = [1, 1, 0] # Model predictions (Error on index 1)
report = analyze_nlp(texts, y_true, y_pred)
Output Insight: "Failures are highly correlated with Sentiment Polarity < -0.5 and Word Count < 5."
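An insight such as "Word Count < 5" comes from comparing failure rates across a text meta-feature. The sketch below shows that comparison for word count only, reusing the three texts from the example above. The `failure_rate_by_length` helper and its whitespace tokenization are assumptions made for illustration and are not part of the Failprint API.

```python
from statistics import mean

texts = [
    "I love this product!",
    "Terrible service, very slow.",
    "Product is okay but arrived late.",
]
y_true = [1, 0, 0]
y_pred = [1, 1, 0]


def word_count(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def failure_rate_by_length(texts, y_true, y_pred, max_words=5):
    """Compare failure rates for short texts (< max_words) and the rest."""
    buckets = {"short": [], "long": []}
    for text, t, p in zip(texts, y_true, y_pred):
        key = "short" if word_count(text) < max_words else "long"
        buckets[key].append(int(t != p))
    # An empty bucket has no defined rate, so report None for it.
    return {k: (mean(v) if v else None) for k, v in buckets.items()}


rates = failure_rate_by_length(texts, y_true, y_pred)
```

Here the two four-word texts form the "short" bucket, one of which is misclassified, while the single six-word text is predicted correctly.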
3. CV Analysis (Images)
Lazy-loads torch and torchvision to find visual failure clusters (e.g., "Dark images" or "Blurry dogs").
from etsi.failprint import analyze_cv
images = ["img1.jpg", "img2.jpg", "img3.jpg"]
y_true = [0, 1, 0]
y_pred = [0, 0, 0]
analyze_cv(images, y_true, y_pred)
Output Insight: "Cluster 0 (Dark Images) accounts for 60% of false negatives."
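A label such as "Dark Images" can be derived from simple image meta-features. As a minimal sketch, assuming each image has already been decoded into a 2-D grid of grayscale values from 0 to 255 (decoding is left out), brightness can be taken as the mean intensity and contrast as the standard deviation. The function names and the darkness threshold of 60 are hypothetical choices, not Failprint's.

```python
from statistics import mean, pstdev


def brightness_contrast(pixels):
    """Return (mean intensity, population std dev) for a 2-D grid of 0-255 values."""
    flat = [value for row in pixels for value in row]
    return mean(flat), pstdev(flat)


def aspect_ratio(pixels):
    """Width divided by height for a rectangular pixel grid."""
    return len(pixels[0]) / len(pixels)


def is_dark(pixels, threshold=60):
    """Flag an image as dark when its mean intensity is below the threshold."""
    brightness, _ = brightness_contrast(pixels)
    return brightness < threshold


# Tiny made-up 2x2 "images" for illustration.
dark = [[10, 20], [30, 40]]
bright = [[200, 210], [220, 230]]
```

Grouping failed predictions by `is_dark` and comparing failure counts between the two groups gives the kind of statement shown in the output insight above.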
Counterfactuals
Go beyond diagnostics. Ask "What should have happened?" This mode suggests the minimal change required to fix a prediction.
from etsi.failprint import analyze
# Run in counterfactual mode
analyze(
    X, y_true, y_pred,
    output="counterfactuals"
)
Example Output:
Original Input: {'Age': 22, 'Income': 35000, 'Education': 'High School'}
Suggested Change: Education to 'Bachelor's'
Prediction: Success (Counterfactual)
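One simple way to think about a counterfactual like the one above is as a search over single-feature substitutions. The sketch below is only an illustration of that idea: `toy_model` is a made-up stand-in classifier, `single_feature_counterfactuals` is a hypothetical helper, and "minimal" here just means that exactly one feature is changed. How Failprint actually searches for counterfactuals is not described in this document.

```python
def toy_model(row):
    """Hypothetical stand-in classifier: predicts 1 when at least two rules hold."""
    score = 0
    score += row["Income"] >= 50000
    score += row["Education"] in {"Bachelor's", "Master's"}
    score += row["Age"] >= 21
    return int(score >= 2)


def single_feature_counterfactuals(row, predict, candidates, target=1):
    """Try every single-feature substitution and keep those that yield the target."""
    found = []
    for feature, values in candidates.items():
        for value in values:
            if value == row[feature]:
                continue  # not a change
            changed = {**row, feature: value}
            if predict(changed) == target:
                found.append((feature, value))
    return found


row = {"Age": 22, "Income": 35000, "Education": "High School"}
candidates = {
    "Education": ["High School", "Bachelor's"],
    "Income": [35000, 60000],
}
suggestions = single_feature_counterfactuals(row, toy_model, candidates)
```

For this toy model, changing either Education to "Bachelor's" or Income to 60000 flips the prediction, so both appear in `suggestions`.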
Contributing
Pull requests are welcome!
Please refer to CONTRIBUTING.md and CODE_OF_CONDUCT.md before submitting a Pull Request.
Join the Community
Connect with the etsi.ai team and other contributors on our Discord.
License
This project is distributed under the BSD-2-Clause License. See the LICENSE file for details.
Built with ❤️ by etsi.ai
Download files
File details
Details for the file etsi_failprint-0.2.0.tar.gz.
File metadata
- Download URL: etsi_failprint-0.2.0.tar.gz
- Upload date:
- Size: 17.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ca1b21a79e3ce94d281fffc80007a0ff80e2d968a1d07c803a4121475eac5533 |
| MD5 | 9673940f0df36f2f8c7618dd3dcc2ec5 |
| BLAKE2b-256 | 0f598768c24907829b2795ce507fdf375ba77adbef2f6afd42fbab4bb1bb87dc |
File details
Details for the file etsi_failprint-0.2.0-py3-none-any.whl.
File metadata
- Download URL: etsi_failprint-0.2.0-py3-none-any.whl
- Upload date:
- Size: 17.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa2e664a471e0ab919bc1f83f76921321bd00fe45eea465fc24e8aa81b557167 |
| MD5 | 6e3ad07115df8964486f05a7f47c87fc |
| BLAKE2b-256 | f4b1814ac8c0cfe8a5e2efc2b7ac816dde27a41c8899f3db41dd8ca06c91149f |