AI-Powered Root Cause Analysis for pytest — Dual-Agent (Analyzer→Critic) pipeline that automatically triages test failures

Failscope

AI-Powered Root Cause Analysis for pytest

Failscope is a zero-config pytest plugin that automatically triages test failures using a Dual-Agent AI pipeline. It deduplicates failures by fingerprint, runs parallel LLM analysis, and generates an HTML report your team can share — not just raw logs.

Features

  • Dual-Agent RCA: an Analyzer (creative, temp 0.4) drafts the diagnosis, then a Critic (deterministic, temp 0.0) cross-checks every claim against the raw evidence to reduce hallucinations
  • Parallel async analysis: all unique failures are analysed concurrently, so CI is not blocked on serial API calls
  • Error fingerprinting: clusters identical failures so the LLM sees each unique root cause only once
  • PII & secrets sanitization: API keys, passwords, JWTs, and tokens are redacted before leaving your machine
  • HTML report: self-contained single-file report, shareable in Slack or email
  • A–F stability scoring: flakiness detection and trend analysis across the last 20 runs
  • Local LLM support: run fully offline with Ollama (zero API cost, full data privacy)
  • Multi-provider: Groq (free tier), OpenAI, Anthropic, or any Ollama model
  • Offline fallback: rule-based analysis when no API key is available

Quick Start

pip install failscope

Cloud LLM (recommended for best results)

export GROQ_API_KEY=your-key   # free at console.groq.com
pytest --failscope

Local LLM via Ollama (zero cost, full privacy)

ollama pull llama3.2
pytest --failscope --fs-provider=ollama

Offline / no API key

pytest --failscope --fs-offline

How It Works

Test Failure
    │
    ▼
┌──────────────────────┐
│   Log Preprocessor   │  Strip pytest noise, smart truncate
│                      │  (first 10% + last 90%), sanitize PII
└─────────┬────────────┘
          │
          ▼
┌──────────────────────┐
│  Error Fingerprinting│  SHA-256 hash per unique error class
│                      │  Deduplicates before reaching the LLM
└─────────┬────────────┘
          │
          ▼ (parallel — all unique failures at once)
┌──────────────────────┐    ┌──────────────────────┐
│  Analyzer (temp 0.4) │    │  Analyzer (temp 0.4) │  ...
│  [Actor Agent]       │    │  [Actor Agent]       │
└─────────┬────────────┘    └─────────┬────────────┘
          │                           │
          ▼                           ▼
┌──────────────────────┐    ┌──────────────────────┐
│  Critic (temp 0.0)   │    │  Critic (temp 0.0)   │
│  Validates claims    │    │  Validates claims    │
│  overrides hallucin. │    │  overrides hallucin. │
└─────────┬────────────┘    └─────────┬────────────┘
          └──────────┬────────────────┘
                     ▼
          HTML + JSON reports in .failscope/
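
The fingerprinting stage in the diagram above can be illustrated with a minimal sketch. The normalization rules and the helper names (fingerprint, cluster_failures) are assumptions made for this example, not Failscope's actual implementation; the only detail taken from the diagram is the use of a SHA-256 hash to group failures before they reach the LLM.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(exc_type: str, message: str) -> str:
    """Hash an error class plus a normalized message (illustrative scheme)."""
    # Replace volatile details so the same error raised with different
    # values maps to one fingerprint. Addresses go first, then plain numbers.
    normalized = re.sub(r"0x[0-9a-fA-F]+", "<addr>", message)
    normalized = re.sub(r"\d+", "<n>", normalized)
    return hashlib.sha256(f"{exc_type}:{normalized}".encode("utf-8")).hexdigest()


def cluster_failures(failures):
    """Group (test_id, exc_type, message) tuples by fingerprint."""
    clusters = defaultdict(list)
    for test_id, exc_type, message in failures:
        clusters[fingerprint(exc_type, message)].append(test_id)
    return dict(clusters)


failures = [
    ("test_auth.py::test_login", "AssertionError", "expected 200, got 401"),
    ("test_auth.py::test_profile", "AssertionError", "expected 200, got 401"),
    ("test_db.py::test_connect", "TimeoutError", "no response after 30 s"),
]
clusters = cluster_failures(failures)
for digest, tests in clusters.items():
    print(digest[:12], tests)
```

Under this scheme only one representative per cluster would need to be sent for analysis, which is the cost saving the deduplication step is aiming for.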

Ollama note: For local models (3B–8B params), Failscope automatically switches to a single-pass prompt to stay within context window limits.
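
The parallel Analyzer → Critic stage shown in the diagram can be sketched with asyncio. This is an illustration of the control flow only: call_llm is a hypothetical stand-in for a provider client, and the prompts and result shape are assumptions rather than Failscope internals.

```python
import asyncio


async def call_llm(prompt: str, temperature: float) -> str:
    """Hypothetical stand-in for a provider call; replace with a real client."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"[temp={temperature}] {prompt[:40]}"


async def analyze_one(log: str) -> dict:
    # Pass 1: the Analyzer proposes a root cause at a higher temperature.
    draft = await call_llm(f"Propose a root cause for:\n{log}", temperature=0.4)
    # Pass 2: the Critic checks the draft against the raw log at temperature 0.
    verdict = await call_llm(
        f"Check this claim against the log:\n{draft}\n{log}", temperature=0.0
    )
    return {"draft": draft, "verdict": verdict}


async def analyze_all(unique_logs):
    # gather() runs one Analyzer -> Critic chain per unique failure
    # concurrently and returns the results in input order.
    return await asyncio.gather(*(analyze_one(log) for log in unique_logs))


results = asyncio.run(analyze_all(["AssertionError: 401", "TimeoutError: db"]))
print(len(results), "failures analysed")
```

Within one failure the two passes stay sequential, because the Critic needs the Analyzer's draft; the concurrency is across failures.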

CLI Options

Flag               Default           Description
--failscope                          Enable Failscope analysis
--fs-offline       false             Rule-based analysis, no API key needed
--fs-report        false             Add stability report to output
--fs-provider      auto-detect       groq · openai · anthropic · ollama
--fs-model         provider default  Override model name (e.g. llama3.1:8b, gpt-4o-mini)
--fs-max-log-size  80000             Max log characters sent to LLM. Reduce for small local models
--fs-output        .failscope/       Output directory for reports
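
To make the effect of --fs-max-log-size concrete, here is a rough sketch of head-and-tail truncation. It assumes the "first 10% + last 90%" split from the pipeline diagram refers to shares of the character budget; that reading, the function name truncate_log, and the marker text are all assumptions for illustration.

```python
def truncate_log(log: str, max_chars: int = 80_000, head_share: float = 0.10) -> str:
    """Keep the head and tail of an oversized log, dropping the middle."""
    if len(log) <= max_chars:
        return log
    marker = "\n...[truncated]...\n"
    budget = max_chars - len(marker)
    if budget <= 0:
        # Budget too small to fit a marker: fall back to a hard cut.
        return log[:max_chars]
    head_len = int(budget * head_share)
    tail_len = budget - head_len
    # Slice from an explicit index so that tail_len == 0 yields an empty tail.
    return log[:head_len] + marker + log[len(log) - tail_len:]


sample = "a" * 100 + "b" * 900
shortened = truncate_log(sample, max_chars=120)
print(len(sample), "->", len(shortened))
```

Weighting the tail makes sense for test logs because the traceback and assertion message usually sit at the end, while the head carries setup context.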

LLM Providers

Provider        Default model              Env variable            Cost
Groq (default)  llama-3.3-70b-versatile    GROQ_API_KEY            Free tier available
OpenAI          gpt-4o                     OPENAI_API_KEY          Pay per token
Anthropic       claude-haiku-4-5-20251001  ANTHROPIC_API_KEY       Pay per token
Ollama          llama3.2                   OLLAMA_HOST (optional)  Free, runs locally

Auto-detection order: OLLAMA_HOST → GROQ_API_KEY → OPENAI_API_KEY → ANTHROPIC_API_KEY
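
As a sketch of what this detection order implies, the lookup below walks the environment variables in the listed sequence and returns the first provider whose variable is set. The function name detect_provider and the behaviour when nothing is set (returning None) are assumptions; Failscope's own logic may differ, for example by falling back to offline analysis.

```python
import os

# Order taken from the auto-detection line above.
DETECTION_ORDER = [
    ("OLLAMA_HOST", "ollama"),
    ("GROQ_API_KEY", "groq"),
    ("OPENAI_API_KEY", "openai"),
    ("ANTHROPIC_API_KEY", "anthropic"),
]


def detect_provider(env=None):
    """Return the first provider whose variable is set and non-empty, else None."""
    env = os.environ if env is None else env
    for variable, provider in DETECTION_ORDER:
        if env.get(variable):
            return provider
    return None


print(detect_provider({"OPENAI_API_KEY": "sk-example", "GROQ_API_KEY": "gsk-example"}))
```

A practical consequence is that a machine with OLLAMA_HOST exported would pick Ollama even when cloud keys are present, so passing --fs-provider explicitly is the safer choice in CI.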

Override the model without changing provider:

pytest --failscope --fs-provider=openai --fs-model=gpt-4o-mini
pytest --failscope --fs-provider=ollama --fs-model=mistral:7b

Output

All reports are written to .failscope/ (configurable with --fs-output).

rca_report.html — interactive HTML report (always generated)

A self-contained file you can open in any browser or attach to a Slack message.

rca_report.json — machine-readable RCA

{
  "root_cause": "API endpoint /login returns 401 due to expired test token",
  "category": "assertion_failure",
  "severity": "high",
  "fix_suggestion": "Refresh auth token in conftest.py fixture before each test",
  "confidence": 0.87,
  "was_critic_override": false,
  "affected_tests": ["test_auth.py::test_login", "test_auth.py::test_profile"],
  "occurrence_count": 2
}
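
Because the JSON report is machine-readable, it can drive follow-up steps in CI. The sketch below filters for high-severity findings using the field names from the example above. Whether the file holds a single object or a list of them is not stated here, so the parser accepts both; the helper names and the 0.8 confidence threshold are assumptions.

```python
import json
from pathlib import Path


def parse_report(text: str) -> list:
    """Parse report JSON and normalize it to a list of entries."""
    data = json.loads(text)
    # Assumption: the file holds either one object or a list of them.
    return data if isinstance(data, list) else [data]


def needs_attention(entries, min_confidence=0.8):
    """Select high-severity entries with at least the given confidence."""
    return [
        entry for entry in entries
        if entry.get("severity") == "high"
        and entry.get("confidence", 0.0) >= min_confidence
    ]


report_path = Path(".failscope") / "rca_report.json"
if report_path.exists():
    entries = parse_report(report_path.read_text(encoding="utf-8"))
    for entry in needs_attention(entries):
        print(f"{entry['root_cause']} ({entry['occurrence_count']} tests)")
```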

stability_report.json — A–F grading per test (requires --fs-report)

{
  "test_name": "test_checkout.py::test_payment_flow",
  "grade": "C",
  "pass_rate": "72.0%",
  "flakiness_score": 58,
  "verdict": "Flaky",
  "trend": "degrading"
}

Security

Failscope sanitizes the following before sending any data to an LLM API:

  • API keys and tokens (generic patterns, GitHub PATs, OpenAI/Anthropic/Stripe prefixes)
  • Passwords and secrets in assignment context (key=value, "key": "value", key: value)
  • JWT tokens, Bearer tokens, AWS access keys
  • Database connection strings containing credentials
  • Email addresses and high-entropy hex strings

Redacted values appear as typed placeholders: [REDACTED:api_key], [REDACTED:password], etc. A warning is printed to the terminal whenever a redaction occurs.
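
The typed-placeholder approach can be sketched with a handful of regular expressions. These patterns are illustrative only and much narrower than a production rule set; they are not Failscope's actual rules. The [REDACTED:password] placeholder matches the format described above, while the other labels are made up for this example.

```python
import re

# Illustrative patterns only; the real rule set is not reproduced here.
PATTERNS = [
    ("jwt", re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+")),
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("bearer_token", re.compile(r"(?<=Bearer )[\w.~+/-]+=*")),
    ("password", re.compile(r"(?i)(?<=password=)\S+")),
    ("email", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]


def sanitize(text: str):
    """Replace matches with typed placeholders and count the redactions."""
    total = 0
    for label, pattern in PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        total += count
    return text, total


line = "password=hunter2 user=alice@example.com key=AKIAABCDEFGHIJKLMNOP"
clean, redactions = sanitize(line)
print(clean)
if redactions:
    print(f"warning: {redactions} value(s) redacted")
```

Keeping the secret's type in the placeholder preserves enough context for a diagnosis (for example, that a token was present in the request) without exposing the value itself.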

Environment Variables

Variable           Description
GROQ_API_KEY       Groq API key
OPENAI_API_KEY     OpenAI API key
ANTHROPIC_API_KEY  Anthropic API key
OLLAMA_HOST        Ollama server URL (default: http://localhost:11434)
OLLAMA_MODEL       Default Ollama model (default: llama3.2)

License

MIT
