Root cause in seconds. Evidence, not intuition. Analyzes Playwright, Jest, Cypress, Newman, k6, and JUnit test failures by tracing git history, logs, and config. No guesses, no fixture noise.

🩻 ai-test-failure-analyzer


Feed it a real test result file — Playwright, Jest, Cypress, Newman, k6, or JUnit — and it traces back through your real git history, application logs, and config to surface the actual root cause, with a cited evidence chain and file:line precision. No guesses. No fixture noise. No repeating the obvious.

[Figure: ai-analyze running the 8-phase analysis. Evidence comes from git, app.log, and .env, with no guesses and no fixture noise.]


Why ai-test-failure-analyzer

Investigating a test failure by hand can take 30–60 minutes: open the test output, grep through logs, dig through git history, check recent deploys, ask on Slack. Even then you can point at the wrong thing, especially when the test file itself has an "intentional failure" comment or a fixture designed to trigger the analyzer.

This tool does it automatically in seconds:

  • Parses the test result file to extract failing tests with HTTP details
  • Scans git history for high-risk commits (endpoint renames, migrations, auth changes)
  • Scans application logs for ERROR/FATAL lines
  • Reads config files (.env, docker-compose)
  • Cross-correlates all evidence into clusters
  • Forms ranked, evidence-cited hypotheses with file:line precision
  • Never points to test fixtures or "intentional failure" comments as root causes
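As an illustration of the log-scanning step in the list above, here is a minimal sketch of how ERROR/FATAL lines might be collected with a file:line citation attached. It is not the package's actual code: the names `scan_log` and `LogHit`, the regular expression, and the sample log lines are all hypothetical.

```python
import re
from typing import NamedTuple

# Assumption: severity appears as a standalone ERROR or FATAL token on the line.
SEVERITY = re.compile(r"\b(ERROR|FATAL)\b")


class LogHit(NamedTuple):
    citation: str  # "file:line" reference, e.g. "app.log:2"
    severity: str  # "ERROR" or "FATAL"
    text: str      # the matching line, stripped of surrounding whitespace


def scan_log(name: str, lines: list[str]) -> list[LogHit]:
    """Return one LogHit for every line that carries an ERROR or FATAL token."""
    hits = []
    for number, line in enumerate(lines, start=1):  # 1-based, like editors
        match = SEVERITY.search(line)
        if match:
            hits.append(LogHit(f"{name}:{number}", match.group(1), line.strip()))
    return hits


# Hypothetical sample data, invented for this sketch.
sample = [
    "INFO  server started",
    "ERROR route POST /api/clips not found",
    "FATAL database connection lost",
]

for hit in scan_log("app.log", sample):
    print(hit.citation, hit.severity)
```

Keeping the citation next to the matched text is what lets a later step quote its evidence instead of paraphrasing it.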

How it's different

                            ai-test-failure-analyzer   Manual triage   Generic LLM
Evidence source             Real git/logs/config       Human memory    Training data
Fixture noise               Blocked by Tier-1 gate     No protection   No protection
file:line precision                                    Sometimes       No
Works without source code   ✅ API-only mode
Repeatable
CI-integrated

Supported frameworks

Framework              Format             Command
Playwright             JSON reporter      playwright test --reporter=json
Jest / Vitest          JSON               jest --json --outputFile=results.json
Cypress                Mochawesome JSON   cypress run --reporter mochawesome
pytest                 JUnit XML          pytest --junit-xml=results.xml
Newman (Postman)       JSON               newman run col.json --reporters json --reporter-json-export results.json
k6                     Summary JSON       k6 run --summary-export=results.json script.js
REST Assured           JUnit XML          standard Maven Surefire output
Any JUnit-compatible   XML                TestNG, Karate, Insomnia CLI
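Several of the formats above are JUnit XML, so a rough sketch of extracting failing tests from such a file may help show what a parser has to do. It uses only the standard library and is not taken from the package's `parsers/` module. The function name `failing_tests`, the returned dictionary keys, and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def failing_tests(junit_xml: str) -> list[dict]:
    """Collect every <testcase> that contains a <failure> or <error> child."""
    root = ET.fromstring(junit_xml)
    failures = []
    # iter() walks the whole tree, so a <testsuites> wrapper and a bare
    # <testsuite> root are both handled.
    for case in root.iter("testcase"):
        problem = case.find("failure")
        if problem is None:  # explicit None check: childless elements are falsy
            problem = case.find("error")
        if problem is None:
            continue
        failures.append({
            "name": case.get("name", ""),
            "classname": case.get("classname", ""),
            "kind": problem.tag,
            "message": problem.get("message", ""),
        })
    return failures


# Hypothetical sample report, invented for this sketch.
SAMPLE = """<testsuite name="api" tests="2" failures="1">
  <testcase classname="tests.test_clips" name="test_list"/>
  <testcase classname="tests.test_clips" name="test_create">
    <failure message="expected 201, got 404">AssertionError</failure>
  </testcase>
</testsuite>"""

for failure in failing_tests(SAMPLE):
    print(failure["classname"], failure["name"], failure["message"])
```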

Install

npm (global — JS/CI devs):

npm install -g ai-test-failure-analyzer
ai-analyze analyze playwright-report.json

npx (zero install):

npx ai-test-failure-analyzer analyze playwright-report.json

pipx (Python devs):

pipx install ai-test-failure-analyzer
analyzer analyze playwright-report.json

Claude Code skill:

/plugin install ai-test-failure-analyzer

Install the skill for all agents (Claude, Cursor, Codex, Gemini, Windsurf):

ai-analyze install

Usage

CLI

ai-analyze analyze results.json
ai-analyze analyze results.json --mode api-only    # force API-only (no source scan)
ai-analyze analyze results.json --out report.md    # write report to file
ai-analyze analyze results.json --create-issue     # file GitHub issue for top hypothesis

MCP server (Claude Code / Cursor)

Add to your MCP config:

{
  "mcpServers": {
    "ai-test-failure-analyzer": {
      "command": "ai-analyze",
      "args": ["serve-stdio"]
    }
  }
}

Then ask Claude: "Analyze the failures in playwright-report.json"

MCP HTTP (OpenAI / Gemini)

ai-analyze serve-http --port 8765

API-only mode

No source code? No problem. When your workspace has no src/, app/, lib/, or api/ directory — or when you pass --mode api-only — the tool switches to API-only mode.

It analyzes HTTP contract evidence directly from the test results:

ai-analyze analyze newman-results.json
# > API_ONLY mode — no workspace source detected, analyzing HTTP contract only
# Root Cause [95%] — POST /api/clips → 404 Not Found
#   Endpoint moved or removed. Check API changelog or versioning.
#   Evidence: response status 404 + URL /api/clips

Supports Newman, k6, Playwright (API tests), Jest, and any framework that records HTTP status codes.
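The mode switch described in this section can be pictured with a short sketch. It assumes the check is a plain directory test against the four names listed above. The function name `detect_mode` and the `"full"` label for the non-API-only mode are hypothetical, and the real logic in `workspace_scanner.py` may differ.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Directory names taken from the description above.
SOURCE_DIRS = ("src", "app", "lib", "api")


def detect_mode(workspace: Path, forced_mode: Optional[str] = None) -> str:
    """Pick an analysis mode for a workspace.

    An explicit "api-only" request always wins; otherwise the mode depends
    on whether any known source directory exists in the workspace root.
    """
    if forced_mode == "api-only":
        return "api-only"
    if any((workspace / name).is_dir() for name in SOURCE_DIRS):
        return "full"  # hypothetical label for the source-aware mode
    return "api-only"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    print(detect_mode(root))                          # api-only
    (root / "src").mkdir()
    print(detect_mode(root))                          # full
    print(detect_mode(root, forced_mode="api-only"))  # api-only
```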

CI integration

# .github/workflows/analyze-failures.yml
- name: Analyze test failures
  if: failure()
  run: |
    npx ai-test-failure-analyzer analyze test-results/results.json \
      --non-interactive \
      --out failure-analysis.md
- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: failure-analysis
    path: failure-analysis.md

Security

  • No shell injection: all subprocess calls use explicit argument lists
  • Path traversal protection: all paths resolved relative to workspace root
  • Size caps: 5 MB/file, 50 MB/scan, 200 commits max
  • Secrets redacted: .env token/secret/key/password values masked in reports
  • No outbound network from core analysis (GitHub issue creation is opt-in)
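To make the secrets-redaction point concrete, here is a minimal sketch of masking sensitive `.env` values before they reach a report. It assumes redaction is keyed on the variable name containing token, secret, key, or password, as the list above suggests. The function name `redact_env`, the `***` mask, and the sample values are hypothetical.

```python
import re

# Name fragments taken from the list above; matching is case-insensitive.
SENSITIVE = re.compile(r"token|secret|key|password", re.IGNORECASE)


def redact_env(text: str, mask: str = "***") -> str:
    """Mask the value of every KEY=value line whose key looks sensitive."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments; only KEY=value lines are candidates.
        if stripped and not stripped.startswith("#") and "=" in line:
            name, _, value = line.partition("=")
            if SENSITIVE.search(name) and value:
                line = f"{name}={mask}"
        out.append(line)
    return "\n".join(out)


# Hypothetical sample values, invented for this sketch.
sample = "API_URL=https://example.test\nAPI_KEY=abc123\n# comment\nDB_PASSWORD=hunter2"
print(redact_env(sample))
```

Matching on name fragments errs toward over-redaction (a variable such as `KEYBOARD_LAYOUT` would also be masked), which is usually the safer failure mode for a report that may be uploaded as a CI artifact.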

See SECURITY.md for the full threat model.
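The path traversal protection listed above can be sketched as follows. This is one common way to confine paths to a workspace root, not necessarily the approach SECURITY.md documents. The function name `resolve_inside` is hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def resolve_inside(workspace: Path, candidate: str) -> Path:
    """Resolve candidate against the workspace root and reject escapes.

    Both paths are fully resolved first, so "..", absolute paths, and
    symlinks cannot be used to reach outside the workspace.
    """
    root = workspace.resolve()
    target = (root / candidate).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"path escapes workspace: {candidate}")
    return target


with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    print(resolve_inside(ws, "logs/app.log").name)  # app.log
    try:
        resolve_inside(ws, "../outside.txt")
    except ValueError as exc:
        print("rejected:", exc)
```

Resolving before comparing matters: a plain string-prefix check on the unresolved path would accept `logs/../../outside.txt`.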

Repository layout

analyzer/                   Python package (MCP server + CLI + analysis)
  parsers/                  Framework parsers (Playwright, Jest, Cypress, Newman, k6, JUnit)
  evidence/                 Evidence collection (git, logs, config)
  render/                   Report rendering (Markdown, ANSI)
  ui/                       User interfaces (CLI, TUI, Web)
  workspace_scanner.py      Phase 0 — mode detection, noise path discovery
  noise_filter.py           Evidence filtering and hypothesis deduplication
  orchestrator.py           8-phase analysis pipeline
  hypothesis.py             Confidence scoring and hypothesis formation
bin/cli.js                  Zero-dep Node wrapper (ai-analyze command)
skills/ai-test-failure-analyzer/SKILL.md  Claude Code agent skill
.claude-plugin/             Claude marketplace manifests
tests/analyzer/             pytest test suite
.github/workflows/          CI/CD (ci, release, publish, codeql)

Testing

pytest tests/analyzer -q    # Python: parsers, correlator, noise filter, workspace scanner
npm test                    # Node: CLI smoke tests

Contributing

See CONTRIBUTING.md.
