Root cause in seconds. Evidence, not intuition. Analyzes Playwright, Jest, Cypress, Newman, k6, and JUnit test failures by tracing git history, logs, and config. No guesses, no fixture noise.

🩻 ai-test-failure-analyzer


Root cause in seconds. Evidence, not intuition.

Feed it a real test result file — Playwright, Jest, Cypress, Newman, k6, or JUnit — and it traces back through your real git history, application logs, and config to surface the actual root cause, with a cited evidence chain and file:line precision. No guesses. No fixture noise. No repeating the obvious.

License: MIT

Figure: ai-analyze running the 8-phase analysis on a real failure, with evidence drawn from git, app.log, and .env.


Why ai-test-failure-analyzer

Investigating a test failure by hand can take 30–60 minutes: open the test output, grep through logs, dig through git history, check recent deploys, ask on Slack. Even then you can point at the wrong thing, especially when the test file itself has an "intentional failure" comment or a fixture designed to trigger the analyzer.

This tool does it automatically in seconds:

  • Parses the test result file to extract failing tests with HTTP details
  • Scans git history for high-risk commits (endpoint renames, migrations, auth changes)
  • Scans application logs for ERROR/FATAL lines
  • Reads config files (.env, docker-compose)
  • Cross-correlates all evidence into clusters
  • Forms ranked, evidence-cited hypotheses with file:line precision
  • Never points to test fixtures or "intentional failure" comments as root causes
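
As an illustration of the log-scanning step, here is a minimal sketch of how ERROR/FATAL lines could be turned into file:line evidence. It is not the analyzer's actual code: the `LogEvidence` class, the `scan_log_lines` function, and the sample log lines are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List

# Match the severity as a whole word so "ERRORS_TOTAL=0" is not a hit.
SEVERITY = re.compile(r"\b(ERROR|FATAL)\b")


@dataclass
class LogEvidence:
    """One log line worth citing (hypothetical structure)."""
    path: str
    line_no: int
    severity: str
    text: str

    def cite(self) -> str:
        # file:line form, as used in the report's evidence chain
        return f"{self.path}:{self.line_no}"


def scan_log_lines(path: str, lines: Iterable[str]) -> List[LogEvidence]:
    """Collect every ERROR/FATAL line together with its 1-based line number."""
    found = []
    for line_no, raw in enumerate(lines, start=1):
        match = SEVERITY.search(raw)
        if match:
            found.append(LogEvidence(path, line_no, match.group(1), raw.strip()))
    return found


# Made-up log content, for illustration only.
sample = [
    "2024-05-01T10:00:00 INFO server started",
    "2024-05-01T10:00:05 ERROR route /api/clips not registered",
    "2024-05-01T10:00:06 FATAL database connection lost",
]
hits = scan_log_lines("app.log", sample)
for hit in hits:
    print(hit.cite(), hit.severity, hit.text)
```

Keeping the line number next to each hit is what lets a later correlation step cite `app.log:2` rather than just "the logs".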

How it's different

| | ai-test-failure-analyzer | Manual triage | Generic LLM |
|---|---|---|---|
| Evidence source | Real git/logs/config | Human memory | Training data |
| Fixture noise | Blocked by Tier-1 gate | No protection | No protection |
| file:line precision | | Sometimes | No |
| Works without source code | ✅ API-only mode | | |
| Repeatable | | | |
| CI-integrated | | | |

Supported frameworks

| Framework | Format | Command |
|---|---|---|
| Playwright | JSON reporter | playwright test --reporter=json |
| Jest / Vitest | JSON | jest --json --outputFile=results.json |
| Cypress | Mochawesome JSON | cypress run --reporter mochawesome |
| pytest | JUnit XML | pytest --junit-xml=results.xml |
| Newman (Postman) | JSON | newman run col.json --reporters json --reporter-json-export results.json |
| k6 | Summary JSON | k6 run --summary-export=results.json script.js |
| REST Assured | JUnit XML | standard Maven Surefire output |
| Any JUnit-compatible | XML | TestNG, Karate, Insomnia CLI |
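
To make the JUnit XML row concrete, the sketch below pulls failing test cases out of a JUnit-style report with Python's standard library. It assumes the common `<testsuite>/<testcase>/<failure>` layout. The `failing_tests` function and the sample report are hypothetical and not taken from the project's parsers.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def failing_tests(junit_xml: str) -> List[Dict[str, str]]:
    """Return one record per <testcase> that contains a <failure> or <error>."""
    root = ET.fromstring(junit_xml)
    failures = []
    # iter() walks the whole tree, so a <testsuites> wrapper or a bare
    # <testsuite> root are both handled.
    for case in root.iter("testcase"):
        for kind in ("failure", "error"):
            node = case.find(kind)
            if node is not None:
                failures.append({
                    "name": case.get("name", ""),
                    "classname": case.get("classname", ""),
                    "kind": kind,
                    # Prefer the message attribute, fall back to the element text.
                    "message": node.get("message", "") or (node.text or "").strip(),
                })
                break
    return failures


# Made-up report, for illustration only.
SAMPLE = """<testsuites>
  <testsuite name="api" tests="2" failures="1">
    <testcase classname="tests.test_clips" name="test_create_clip">
      <failure message="expected 201, got 404">assert 404 == 201</failure>
    </testcase>
    <testcase classname="tests.test_clips" name="test_list_clips"/>
  </testsuite>
</testsuites>"""

result = failing_tests(SAMPLE)
for item in result:
    print(item["classname"], item["name"], item["message"])
```

Because TestNG, Karate, and Maven Surefire all emit this shape, a single parser along these lines can cover every JUnit-compatible row in the table.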

Install

npm (global — JS/CI devs):

npm install -g ai-test-failure-analyzer
ai-analyze analyze playwright-report.json

npx (zero install):

npx ai-test-failure-analyzer analyze playwright-report.json

pipx (Python devs):

pipx install ai-test-failure-analyzer
analyzer analyze playwright-report.json

Claude Code skill:

/plugin install ai-test-failure-analyzer

Install the skill for all agents (Claude, Cursor, Codex, Gemini, Windsurf):

ai-analyze install

Usage

CLI

ai-analyze analyze results.json
ai-analyze analyze results.json --mode api-only    # force API-only (no source scan)
ai-analyze analyze results.json --out report.md    # write report to file
ai-analyze analyze results.json --create-issue     # file GitHub issue for top hypothesis

MCP server (Claude Code / Cursor)

Add to your MCP config:

{
  "mcpServers": {
    "ai-test-failure-analyzer": {
      "command": "ai-analyze",
      "args": ["serve-stdio"]
    }
  }
}

Then ask Claude: "Analyze the failures in playwright-report.json"

MCP HTTP (OpenAI / Gemini)

ai-analyze serve-http --port 8765

API-only mode

No source code? No problem. When your workspace has no src/, app/, lib/, or api/ directory — or when you pass --mode api-only — the tool switches to API-only mode.

It analyzes HTTP contract evidence directly from the test results:

ai-analyze analyze newman-results.json
# > API_ONLY mode — no workspace source detected, analyzing HTTP contract only
# Root Cause [95%] — POST /api/clips → 404 Not Found
#   Endpoint moved or removed. Check API changelog or versioning.
#   Evidence: response status 404 + URL /api/clips

Supports Newman, k6, Playwright (API tests), Jest, and any framework that records HTTP status codes.
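
The mode switch described above can be sketched as a directory check. This is a minimal illustration under stated assumptions, not the logic in workspace_scanner.py: the `detect_mode` function and the `"source"` label for the non-API mode are hypothetical, while the directory names and the `api-only` value come from the text above.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Directory names the README lists as signs of application source.
SOURCE_DIRS = ("src", "app", "lib", "api")


def detect_mode(workspace: Path, forced_mode: Optional[str] = None) -> str:
    """Return "api-only" or "source" (the "source" label is an assumption)."""
    if forced_mode == "api-only":
        # Mirrors passing --mode api-only on the command line.
        return "api-only"
    has_source = any((workspace / name).is_dir() for name in SOURCE_DIRS)
    return "source" if has_source else "api-only"


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    empty_mode = detect_mode(workspace)        # no source directories yet
    (workspace / "src").mkdir()
    source_mode = detect_mode(workspace)       # src/ now exists
    forced = detect_mode(workspace, forced_mode="api-only")
    print(empty_mode, source_mode, forced)
```

Checking the explicit flag first means a forced API-only run never touches the source tree, even when one is present.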

CI integration

# .github/workflows/analyze-failures.yml
- name: Analyze test failures
  if: failure()
  run: |
    npx ai-test-failure-analyzer analyze test-results/results.json \
      --non-interactive \
      --out failure-analysis.md
- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: failure-analysis
    path: failure-analysis.md

Security

  • No shell injection: all subprocess calls use explicit argument lists
  • Path traversal protection: all paths resolved relative to workspace root
  • Size caps: 5 MB/file, 50 MB/scan, 200 commits max
  • Secrets redacted: .env token/secret/key/password values masked in reports
  • No outbound network from core analysis (GitHub issue creation is opt-in)

See SECURITY.md for the full threat model.
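
Two of the safeguards listed above, secret redaction and path containment, can be sketched in a few lines. This is a rough illustration only: `redact_env_line`, `resolve_inside`, and the `***` mask are hypothetical, and the real behaviour is whatever SECURITY.md and the source define.

```python
import re
import tempfile
from pathlib import Path

# Variable names containing any of these words are treated as sensitive.
# This is deliberately broad: a name like MONKEY would also match "key".
SENSITIVE = re.compile(r"token|secret|key|password", re.IGNORECASE)


def redact_env_line(line: str) -> str:
    """Mask the value of a sensitive NAME=value line, leave others untouched."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return line
    name, _, value = line.partition("=")
    if SENSITIVE.search(name) and value.strip():
        return f"{name}=***"
    return line


def resolve_inside(root: Path, candidate: str) -> Path:
    """Resolve candidate against root and reject anything that escapes it."""
    root = root.resolve()
    target = (root / candidate).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes workspace root: {candidate}") from None
    return target


print(redact_env_line("API_KEY=abc123"))
print(redact_env_line("PORT=8080"))

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    inside = resolve_inside(root, "logs/app.log")
    try:
        resolve_inside(root, "../outside.txt")
        escaped = True
    except ValueError:
        escaped = False
    print(inside.name, escaped)
```

Resolving both the root and the target before comparing them is what catches `..` segments and absolute paths, which a plain string prefix check would miss.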

Repository layout

analyzer/                   Python package (MCP server + CLI + analysis)
  parsers/                  Framework parsers (Playwright, Jest, Cypress, Newman, k6, JUnit)
  evidence/                 Evidence collection (git, logs, config)
  render/                   Report rendering (Markdown, ANSI)
  ui/                       User interfaces (CLI, TUI, Web)
  workspace_scanner.py      Phase 0 — mode detection, noise path discovery
  noise_filter.py           Evidence filtering and hypothesis deduplication
  orchestrator.py           8-phase analysis pipeline
  hypothesis.py             Confidence scoring and hypothesis formation
bin/cli.js                  Zero-dep Node wrapper (ai-analyze command)
skills/ai-test-failure-analyzer/SKILL.md  Claude Code agent skill
.claude-plugin/             Claude marketplace manifests
tests/analyzer/             pytest test suite
.github/workflows/          CI/CD (ci, release, publish, codeql)

Testing

pytest tests/analyzer -q    # Python: parsers, correlator, noise filter, workspace scanner
npm test                    # Node: CLI smoke tests

Contributing

See CONTRIBUTING.md.
