Root cause in seconds. Evidence, not intuition. Analyzes Playwright, Jest, Cypress, Newman, k6, and JUnit test failures by tracing git history, logs, and config. No guesses, no fixture noise.
🩻 ai-test-failure-analyzer
Root cause in seconds. Evidence, not intuition.
Feed it a real test result file — Playwright, Jest, Cypress, Newman, k6, or JUnit —
and it traces back through your real git history, application logs, and config
to surface the actual root cause, with a cited evidence chain and file:line precision.
No guesses. No fixture noise. No repeating the obvious.
🩻 A real analysis — evidence from git, app.log, and .env — no guesses, no fixture noise.
Why ai-test-failure-analyzer
Investigating a test failure by hand takes 30–60 minutes: open the test output, grep through logs, dig through git history, check recent deploys, ask in Slack. And you can still blame the wrong thing, especially when the test file itself has an "intentional failure" comment or a fixture designed to trigger the analyzer.
This tool does it automatically in seconds:
- Parses the test result file to extract failing tests with HTTP details
- Scans git history for high-risk commits (endpoint renames, migrations, auth changes)
- Scans application logs for ERROR/FATAL lines
- Reads config files (.env, docker-compose)
- Cross-correlates all evidence into clusters
- Forms ranked, evidence-cited hypotheses with file:line precision
- Never points to test fixtures or "intentional failure" comments as root causes
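Two of the scanning steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: the function names `scan_log_lines` and `flag_risky_commits`, the regular expression, and the keyword list are all invented for this example.

```python
import re

# Hypothetical pattern: matches ERROR or FATAL as a whole word in a log line.
SEVERITY_RE = re.compile(r"\b(ERROR|FATAL)\b")

# Hypothetical keywords hinting at the high-risk commit types listed above
# (endpoint renames, migrations, auth changes).
RISK_KEYWORDS = ("rename", "migration", "migrate", "auth")


def scan_log_lines(lines):
    """Return (line_number, severity, text) for each ERROR/FATAL line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SEVERITY_RE.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits


def flag_risky_commits(subjects):
    """Return the commit subjects that mention any risk keyword."""
    return [s for s in subjects if any(k in s.lower() for k in RISK_KEYWORDS)]


# Illustrative input only; these lines are made up for the example.
sample_log = [
    "INFO server started\n",
    "ERROR POST /api/clips returned 404\n",
]
sample_subjects = ["Rename /api/clips endpoint", "Fix typo in README"]

print(scan_log_lines(sample_log))
print(flag_risky_commits(sample_subjects))
```

Plain substring matching on commit subjects is deliberately crude (a subject containing "author" would also match "auth"); a real scanner would likely inspect the changed paths as well.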
How it's different
| | ai-test-failure-analyzer | Manual triage | Generic LLM |
|---|---|---|---|
| Evidence source | Real git/logs/config | Human memory | Training data |
| Fixture noise | Blocked by Tier-1 gate | No protection | No protection |
| file:line precision | ✅ | Sometimes | No |
| Works without source code | ✅ API-only mode | ✅ | ✅ |
| Repeatable | ✅ | ❌ | ❌ |
| CI-integrated | ✅ | ❌ | ❌ |
Supported frameworks
| Framework | Format | Command |
|---|---|---|
| Playwright | JSON reporter | playwright test --reporter=json |
| Jest / Vitest | JSON | jest --json --outputFile=results.json |
| Cypress | Mochawesome JSON | cypress run --reporter mochawesome |
| pytest | JUnit XML | pytest --junit-xml=results.xml |
| Newman (Postman) | JSON | newman run col.json --reporters json --reporter-json-export results.json |
| k6 | Summary JSON | k6 run --summary-export=results.json script.js |
| REST Assured | JUnit XML | standard Maven Surefire output |
| Any JUnit-compatible | XML | TestNG, Karate, Insomnia CLI |
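To give a sense of what "JUnit-compatible" means in practice, here is a minimal sketch that pulls failing test cases out of a JUnit XML report using only the Python standard library. The function name `failing_tests` and the sample report are assumptions made for this illustration; the package's own parsers may work differently.

```python
import xml.etree.ElementTree as ET


def failing_tests(junit_xml):
    """Return (classname, name, message) for each failed or errored testcase."""
    root = ET.fromstring(junit_xml)
    failures = []
    # iter() walks the whole tree, so a <testsuites> wrapper and a bare
    # <testsuite> root are both handled.
    for case in root.iter("testcase"):
        for tag in ("failure", "error"):
            problem = case.find(tag)
            if problem is not None:
                failures.append(
                    (
                        case.get("classname", ""),
                        case.get("name", ""),
                        problem.get("message", ""),
                    )
                )
                break  # report each testcase at most once
    return failures


# Made-up report for illustration only.
SAMPLE = """<testsuite name="api" tests="2" failures="1">
  <testcase classname="tests.test_clips" name="test_create_clip">
    <failure message="assert 404 == 201">traceback text</failure>
  </testcase>
  <testcase classname="tests.test_clips" name="test_list_clips"/>
</testsuite>"""

print(failing_tests(SAMPLE))
```

Because pytest, Maven Surefire, and TestNG all emit this same `testsuite`/`testcase`/`failure` shape, one parser of this kind can cover every JUnit XML row in the table.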
Install
npm (global — JS/CI devs):
npm install -g ai-test-failure-analyzer
ai-analyze analyze playwright-report.json
npx (zero install):
npx ai-test-failure-analyzer analyze playwright-report.json
pipx (Python devs):
pipx install ai-test-failure-analyzer
analyzer analyze playwright-report.json
Claude Code skill:
/plugin install ai-test-failure-analyzer
Install skill to all agents (Claude, Cursor, Codex, Gemini, Windsurf):
ai-analyze install
Usage
CLI
ai-analyze analyze results.json
ai-analyze analyze results.json --mode api-only # force API-only (no source scan)
ai-analyze analyze results.json --out report.md # write report to file
ai-analyze analyze results.json --create-issue # file GitHub issue for top hypothesis
MCP server (Claude Code / Cursor)
Add to your MCP config:
{
  "mcpServers": {
    "ai-test-failure-analyzer": {
      "command": "ai-analyze",
      "args": ["serve-stdio"]
    }
  }
}
Then ask Claude: "Analyze the failures in playwright-report.json"
MCP HTTP (OpenAI / Gemini)
ai-analyze serve-http --port 8765
API-only mode
No source code? No problem. When your workspace has no src/, app/, lib/, or api/ directory — or when you pass --mode api-only — the tool switches to API-only mode.
It analyzes HTTP contract evidence directly from the test results:
ai-analyze analyze newman-results.json
# > API_ONLY mode — no workspace source detected, analyzing HTTP contract only
# Root Cause [95%] — POST /api/clips → 404 Not Found
# Endpoint moved or removed. Check API changelog or versioning.
# Evidence: response status 404 + URL /api/clips
Supports Newman, k6, Playwright (API tests), Jest, and any framework that records HTTP status codes.
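The mode switch described above can be pictured with a short sketch. It assumes a hypothetical `detect_mode` helper and uses `"full"` as a placeholder label for the non-API-only mode; neither name comes from the package itself. Only the directory names and the `api-only` value are taken from the text above.

```python
from pathlib import Path

# Directory names listed in the API-only mode description.
SOURCE_DIRS = ("src", "app", "lib", "api")


def detect_mode(workspace, forced_mode=None):
    """Return "api-only" or "full" (placeholder label) for a workspace path."""
    # An explicit --mode api-only style override wins over detection.
    if forced_mode == "api-only":
        return "api-only"
    root = Path(workspace)
    has_source = any((root / name).is_dir() for name in SOURCE_DIRS)
    return "full" if has_source else "api-only"


print(detect_mode("."))
```

Checking for directories rather than individual files keeps the detection cheap: it needs at most four filesystem lookups regardless of workspace size.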
CI integration
# .github/workflows/analyze-failures.yml
- name: Analyze test failures
  if: failure()
  run: |
    npx ai-test-failure-analyzer analyze test-results/results.json \
      --non-interactive \
      --out failure-analysis.md
- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: failure-analysis
    path: failure-analysis.md
Security
- No shell injection: all subprocess calls use explicit argument lists
- Path traversal protection: all paths resolved relative to workspace root
- Size caps: 5 MB/file, 50 MB/scan, 200 commits max
- Secrets redacted: .env token/secret/key/password values masked in reports
- No outbound network from core analysis (GitHub issue creation is opt-in)
See SECURITY.md for the full threat model.
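As a rough illustration of two of the guarantees above (secret redaction, and subprocess calls built as argument lists with a commit cap), consider the sketch below. The helpers `redact_env` and `git_log_args`, the `***` mask, and the chosen `git log` format are assumptions for this example and are not taken from the package; SECURITY.md remains the authoritative description.

```python
import re

# Variable names containing any of these words are treated as sensitive.
SENSITIVE_RE = re.compile(r"token|secret|key|password", re.IGNORECASE)


def redact_env(text, mask="***"):
    """Mask values of sensitive-looking variables in .env-style text."""
    redacted = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blanks and comments; only KEY=VALUE lines are candidates.
        if stripped and not stripped.startswith("#") and "=" in line:
            name, _, _value = line.partition("=")
            if SENSITIVE_RE.search(name):
                line = f"{name}={mask}"
        redacted.append(line)
    return "\n".join(redacted)


def git_log_args(max_commits=200):
    """Build a git log command as an argument list, never a shell string."""
    return ["git", "log", f"--max-count={max_commits}", "--pretty=format:%h %s"]


# Made-up .env content for illustration only.
print(redact_env("API_KEY=abc123\nPORT=8080\nDB_PASSWORD=hunter2"))
print(git_log_args())
```

The list returned by `git_log_args` would be passed to `subprocess.run(args, capture_output=True, text=True)` without `shell=True`, so no part of it is ever interpreted by a shell.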
Repository layout
analyzer/                                  Python package (MCP server + CLI + analysis)
  parsers/                                 Framework parsers (Playwright, Jest, Cypress, Newman, k6, JUnit)
  evidence/                                Evidence collection (git, logs, config)
  render/                                  Report rendering (Markdown, ANSI)
  ui/                                      User interfaces (CLI, TUI, Web)
  workspace_scanner.py                     Phase 0: mode detection, noise path discovery
  noise_filter.py                          Evidence filtering and hypothesis deduplication
  orchestrator.py                          8-phase analysis pipeline
  hypothesis.py                            Confidence scoring and hypothesis formation
bin/cli.js                                 Zero-dep Node wrapper (ai-analyze command)
skills/ai-test-failure-analyzer/SKILL.md   Claude Code agent skill
.claude-plugin/                            Claude marketplace manifests
tests/analyzer/                            pytest test suite
.github/workflows/                         CI/CD (ci, release, publish, codeql)
Testing
pytest tests/analyzer -q # Python: parsers, correlator, noise filter, workspace scanner
npm test # Node: CLI smoke tests
Contributing
See CONTRIBUTING.md.