AI-Powered Root Cause Analysis for pytest — Dual-Agent (Analyzer→Critic) pipeline that automatically triages test failures
Failscope
AI-Powered Root Cause Analysis for pytest
Failscope is a zero-config pytest plugin that automatically triages test failures using a Dual-Agent AI pipeline. It deduplicates failures by fingerprint, runs parallel LLM analysis, and generates an HTML report your team can share — not just raw logs.
Features
- Dual-Agent RCA — Analyzer (creative, temp 0.4) → Critic (deterministic, temp 0.0) prevents hallucinations by cross-checking every claim against raw evidence
- Parallel async analysis — all unique failures analysed concurrently; no serial API blocking in CI
- Error fingerprinting — clusters identical failures, LLM sees only unique root causes
- PII & secrets sanitization — API keys, passwords, JWTs, and tokens are redacted before leaving your machine
- HTML report — self-contained single-file report, shareable in Slack or email
- A–F stability scoring — flakiness detection and trend analysis across the last 20 runs
- Local LLM support — run fully offline with Ollama (zero API cost, full data privacy)
- Multi-provider — Groq (free tier), OpenAI, Anthropic, or any Ollama model
- Offline fallback — rule-based analysis when no API key is available
Quick Start
pip install failscope
Cloud LLM (recommended for best results)
export GROQ_API_KEY=your-key # free at console.groq.com
pytest --failscope
Local LLM via Ollama (zero cost, full privacy)
ollama pull llama3.2
pytest --failscope --fs-provider=ollama
Offline / no API key
pytest --failscope --fs-offline
How It Works
        Test Failure
              │
              ▼
    ┌──────────────────────┐
    │   Log Preprocessor   │  Strip pytest noise, smart truncate
    │                      │  (first 10% + last 90%), sanitize PII
    └─────────┬────────────┘
              │
              ▼
    ┌──────────────────────┐
    │ Error Fingerprinting │  SHA-256 hash per unique error class
    │                      │  Deduplicates before reaching the LLM
    └─────────┬────────────┘
              │
              ▼  (parallel: all unique failures at once)
    ┌──────────────────────┐    ┌──────────────────────┐
    │ Analyzer (temp 0.4)  │    │ Analyzer (temp 0.4)  │  ...
    │ [Actor Agent]        │    │ [Actor Agent]        │
    └─────────┬────────────┘    └─────────┬────────────┘
              │                           │
              ▼                           ▼
    ┌──────────────────────┐    ┌──────────────────────┐
    │ Critic (temp 0.0)    │    │ Critic (temp 0.0)    │
    │ Validates claims     │    │ Validates claims     │
    │ overrides hallucin.  │    │ overrides hallucin.  │
    └─────────┬────────────┘    └─────────┬────────────┘
              └──────────┬────────────────┘
                         ▼
          HTML + JSON reports in .failscope/
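The fingerprinting stage in the diagram above can be sketched roughly as follows. This is not Failscope's actual implementation: the function names (`fingerprint`, `cluster_failures`) and the normalization rule are hypothetical, and the sketch assumes that a "unique error class" means the exception type plus its message with volatile details (numbers, hex addresses) masked.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical normalization: mask values that differ between otherwise identical failures.
_VOLATILE = re.compile(r"0x[0-9a-fA-F]+|\d+")


def fingerprint(exc_type: str, message: str) -> str:
    """Return a SHA-256 hex digest for a normalized (type, message) pair."""
    normalized = f"{exc_type}:{_VOLATILE.sub('N', message).strip().lower()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_failures(failures):
    """Group (test_id, exc_type, message) tuples by fingerprint."""
    clusters = defaultdict(list)
    for test_id, exc_type, message in failures:
        clusters[fingerprint(exc_type, message)].append(test_id)
    return dict(clusters)


failures = [
    ("test_auth.py::test_login", "AssertionError", "expected 200, got 401"),
    ("test_auth.py::test_profile", "AssertionError", "expected 200, got 401"),
    ("test_db.py::test_connect", "TimeoutError", "timed out after 30s"),
]
clusters = cluster_failures(failures)
print(len(clusters), "unique fingerprints for", len(failures), "failures")
```

Only one representative per cluster would then need to be sent to the LLM, which is what keeps the number of API calls proportional to root causes rather than to failing tests.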
Ollama note: For local models (3B–8B params), Failscope automatically switches to a single-pass prompt to stay within context window limits.
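As a rough illustration of the parallel Analyzer→Critic stage, the sketch below fans out one two-step chain per unique failure with `asyncio.gather`. It is an assumption about the general shape, not Failscope's code: `call_llm` is a stub that returns a canned string instead of calling a provider, and `analyze_one` / `analyze_all` are hypothetical names. Only the two temperatures come from this README.

```python
import asyncio

ANALYZER_TEMP = 0.4  # from the pipeline diagram
CRITIC_TEMP = 0.0


async def call_llm(prompt: str, temperature: float) -> str:
    """Hypothetical stand-in for a provider call; returns a canned string."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"[temp={temperature}] {prompt}"


async def analyze_one(fp: str, log: str) -> dict:
    # Step 1: the Analyzer proposes a root cause.
    draft = await call_llm(f"Analyze: {log}", ANALYZER_TEMP)
    # Step 2: the Critic checks the draft against the raw log.
    verdict = await call_llm(f"Verify against log: {draft}", CRITIC_TEMP)
    return {"fingerprint": fp, "draft": draft, "verdict": verdict}


async def analyze_all(unique_failures: dict) -> list:
    # One Analyzer -> Critic chain per unique failure; the chains run concurrently.
    tasks = [analyze_one(fp, log) for fp, log in unique_failures.items()]
    return await asyncio.gather(*tasks)


results = asyncio.run(
    analyze_all({"fp1": "expected 200, got 401", "fp2": "timed out"})
)
print(len(results), "results")
```

Within each chain the Critic still waits for its Analyzer, so the concurrency is across failures, not across the two agents.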
CLI Options
| Flag | Default | Description |
|---|---|---|
| --failscope | - | Enable Failscope analysis |
| --fs-offline | false | Rule-based analysis, no API key needed |
| --fs-report | false | Add stability report to output |
| --fs-provider | auto-detect | groq · openai · anthropic · ollama |
| --fs-model | provider default | Override model name (e.g. llama3.1:8b, gpt-4o-mini) |
| --fs-max-log-size | 80000 | Max log characters sent to LLM. Reduce for small local models |
| --fs-output | .failscope/ | Output directory for reports |
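To make the interaction between `--fs-max-log-size` and the "first 10% + last 90%" truncation more concrete, here is a minimal sketch. It assumes the percentages refer to shares of the character budget (10% taken from the start of the log, 90% from the end), which this README does not state explicitly. The function name `smart_truncate` and the marker text are hypothetical.

```python
MARKER = "\n...[truncated]...\n"  # hypothetical marker, not Failscope's actual text


def smart_truncate(log: str, max_chars: int = 80000, head_ratio: float = 0.1) -> str:
    """Keep the start and end of an oversized log and drop the middle."""
    if max_chars <= len(MARKER):
        raise ValueError("max_chars is too small to hold the truncation marker")
    if len(log) <= max_chars:
        return log
    budget = max_chars - len(MARKER)
    head_len = int(budget * head_ratio)
    tail_len = budget - head_len
    # Slice from an explicit index so a zero-length tail can never return the full log.
    return log[:head_len] + MARKER + log[len(log) - tail_len:]


long_log = "a" * 100000 + "AssertionError: END"
short = smart_truncate(long_log)
print(len(short))
```

Favouring the tail reflects where pytest usually prints the traceback and assertion message, while the small head keeps session setup context.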
LLM Providers
| Provider | Default model | Env variable | Cost |
|---|---|---|---|
| Groq (default) | llama-3.3-70b-versatile | GROQ_API_KEY | Free tier available |
| OpenAI | gpt-4o | OPENAI_API_KEY | Pay per token |
| Anthropic | claude-haiku-4-5-20251001 | ANTHROPIC_API_KEY | Pay per token |
| Ollama | llama3.2 | OLLAMA_HOST (optional) | Free, runs locally |
Auto-detection order: OLLAMA_HOST → GROQ_API_KEY → OPENAI_API_KEY → ANTHROPIC_API_KEY
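The detection order can be read as a simple first-match lookup over environment variables. The sketch below is illustrative only: `detect_provider` is a hypothetical name, and treating an empty value as unset and returning `None` when nothing matches are assumptions.

```python
import os

# Order taken from the auto-detection line above.
DETECTION_ORDER = [
    ("OLLAMA_HOST", "ollama"),
    ("GROQ_API_KEY", "groq"),
    ("OPENAI_API_KEY", "openai"),
    ("ANTHROPIC_API_KEY", "anthropic"),
]


def detect_provider(env=None):
    """Return the first provider whose variable is set and non-empty, else None."""
    env = os.environ if env is None else env
    for var, provider in DETECTION_ORDER:
        if env.get(var):
            return provider
    return None


print(detect_provider({"GROQ_API_KEY": "gsk-example", "OPENAI_API_KEY": "sk-example"}))
```

Note that under this order a set `OLLAMA_HOST` wins over any cloud key, so pass `--fs-provider` explicitly if both are present and you want a cloud provider.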
Override the model without changing provider:
pytest --failscope --fs-provider=openai --fs-model=gpt-4o-mini
pytest --failscope --fs-provider=ollama --fs-model=mistral:7b
Output
All reports are written to .failscope/ (configurable with --fs-output).
rca_report.html — interactive HTML report (always generated)
A self-contained file you can open in any browser or attach to a Slack message.
rca_report.json — machine-readable RCA
{
"root_cause": "API endpoint /login returns 401 due to expired test token",
"category": "assertion_failure",
"severity": "high",
"fix_suggestion": "Refresh auth token in conftest.py fixture before each test",
"confidence": 0.87,
"was_critic_override": false,
"affected_tests": ["test_auth.py::test_login", "test_auth.py::test_profile"],
"occurrence_count": 2
}
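Because the JSON report is machine-readable, it can drive a CI gate. The following sketch filters findings by severity and confidence using the fields from the example above. Whether the real file holds a single object or a list of them is not specified here, so the loader accepts both; `load_records`, `blocking_findings`, and the 0.8 threshold are hypothetical.

```python
import json

# Copy of the example record shown above.
SAMPLE = """
{
  "root_cause": "API endpoint /login returns 401 due to expired test token",
  "category": "assertion_failure",
  "severity": "high",
  "fix_suggestion": "Refresh auth token in conftest.py fixture before each test",
  "confidence": 0.87,
  "was_critic_override": false,
  "affected_tests": ["test_auth.py::test_login", "test_auth.py::test_profile"],
  "occurrence_count": 2
}
"""


def load_records(text: str) -> list:
    """Parse report text; accept either one record or a list of records."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def blocking_findings(records: list, min_confidence: float = 0.8) -> list:
    """Return high-severity findings at or above the confidence threshold."""
    return [
        r for r in records
        if r.get("severity") == "high" and r.get("confidence", 0.0) >= min_confidence
    ]


blocking = blocking_findings(load_records(SAMPLE))
for finding in blocking:
    print(finding["root_cause"], "->", ", ".join(finding["affected_tests"]))
```

In a pipeline, the same functions could be fed the contents of `.failscope/rca_report.json` and the job failed when `blocking` is non-empty.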
stability_report.json — A–F grading per test (requires --fs-report)
{
"test_name": "test_checkout.py::test_payment_flow",
"grade": "C",
"pass_rate": "72.0%",
"flakiness_score": 58,
"verdict": "Flaky",
"trend": "degrading"
}
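One detail worth noting in the stability record is that `pass_rate` is a percentage string, not a number. A small, hedged sketch of consuming such a record follows; the helper names and the 0.9 threshold are hypothetical, and only the field names and sample values come from the example above.

```python
def parse_pass_rate(value: str) -> float:
    """Convert a string such as '72.0%' to a fraction such as 0.72."""
    return float(value.strip().rstrip("%")) / 100.0


def needs_attention(entry: dict, min_pass_rate: float = 0.9) -> bool:
    """Flag tests marked Flaky or passing less often than the threshold."""
    return (
        entry.get("verdict") == "Flaky"
        or parse_pass_rate(entry["pass_rate"]) < min_pass_rate
    )


entry = {
    "test_name": "test_checkout.py::test_payment_flow",
    "grade": "C",
    "pass_rate": "72.0%",
    "flakiness_score": 58,
    "verdict": "Flaky",
    "trend": "degrading",
}
print(entry["test_name"], needs_attention(entry))
```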
Security
Failscope sanitizes the following before sending any data to an LLM API:
- API keys and tokens (generic patterns, GitHub PATs, OpenAI/Anthropic/Stripe prefixes)
- Passwords and secrets in assignment context (key=value, "key": "value", key: value)
- JWT tokens, Bearer tokens, AWS access keys
- Database connection strings containing credentials
- Email addresses and high-entropy hex strings
Redacted values appear as typed placeholders: [REDACTED:api_key], [REDACTED:password], etc.
A warning is printed to the terminal whenever a redaction occurs.
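The typed-placeholder approach can be sketched with a few regular expressions. The patterns below are simplified illustrations and are not Failscope's actual rules, which this README does not list; the placeholder kinds other than `password` are assumed names, and `sanitize` is a hypothetical function.

```python
import re

# Illustrative patterns only; real-world secret detection needs far more cases.
PATTERNS = [
    ("jwt", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("password", re.compile(r"(?i)(?<=password=)\S+")),
    ("email", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")),
]


def sanitize(text: str):
    """Replace matches with typed placeholders and report which kinds were hit."""
    hits = []
    for kind, pattern in PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{kind}]", text)
        if count:
            hits.append(kind)
    return text, hits


raw = "login failed password=hunter2 user=bob@example.com key=AKIAABCDEFGHIJKLMNOP"
clean, hits = sanitize(raw)
print(clean)
if hits:
    print("warning: redacted", ", ".join(hits))
```

Keeping the kind in the placeholder preserves enough context for the LLM to reason about the failure (for example, that a password was present) without exposing the value.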
Environment Variables
| Variable | Description |
|---|---|
| GROQ_API_KEY | Groq API key |
| OPENAI_API_KEY | OpenAI API key |
| ANTHROPIC_API_KEY | Anthropic API key |
| OLLAMA_HOST | Ollama server URL (default: http://localhost:11434) |
| OLLAMA_MODEL | Default Ollama model (default: llama3.2) |
License
MIT