CLI tool to ingest test reports, detect flaky/regression patterns, and predict failures.

TestMind

A CLI tool and Python library for ingesting test reports, detecting patterns (flaky tests, regressions, spikes), and predicting failures based on historical execution data.

Supports JUnit XML today, with the parser interface open to CSV, HTML, and other formats.

Installation

CLI tool (recommended)

# With uv (recommended)
uv tool install testmind

# With pip
pip install testmind

testmind --help

As a library in your project

# With uv
uv add testmind

# With pip
pip install testmind

For development

git clone https://github.com/Slaaayer/testmind
cd testmind
uv sync

Quick start

# First time: bulk-load historical reports to get meaningful analysis immediately
testmind ingest reports/history/*.xml --project my-service

# Day-to-day: ingest the latest run
testmind ingest reports/junit.xml --project my-service

# Check which projects you are tracking
testmind projects

# Re-run analysis on the latest stored run
testmind analyze my-service

# Browse the run history
testmind history my-service

By default the database lives at ~/.testmind/testmind.db. Override it with --db <path> or the TESTMIND_DB environment variable.

Commands

`ingest` — parse, store, analyse

testmind ingest <FILE> [FILE ...] --project <NAME> [OPTIONS]

Accepts one or more JUnit XML files. Each file is parsed, stored, and counted. After all files are processed a single analysis summary is printed, covering the full available history.

This makes it possible to bootstrap a project on the first run by pointing at an archive of historical reports — patterns like flaky tests or regressions are only detectable once enough history exists, so bulk-loading is the recommended first step.

Each file is processed independently: a parse error on one file prints a warning and moves on; the command only exits with code 1 if every file fails. Duplicate reports (same content hash) are silently skipped, so running the same command twice is always safe.
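The per-file behaviour described above can be sketched as a small loop. This is an illustrative sketch, not TestMind's actual code: `ingest_all`, `exit_code`, `toy_parse`, and the `dedup_key` callback are hypothetical names, and the zero-files case is an assumption.

```python
import sys


def ingest_all(files, parse, dedup_key, seen):
    """Process each file independently.

    files: mapping of file name -> raw text (hypothetical input shape).
    Returns (stored, skipped, failed) counts.
    """
    stored = skipped = failed = 0
    for name, text in files.items():
        try:
            report = parse(text)
        except ValueError as exc:
            # A parse error prints a warning and moves on to the next file.
            print(f"warning: {name}: {exc}", file=sys.stderr)
            failed += 1
            continue
        key = dedup_key(report)
        if key in seen:
            skipped += 1  # duplicate report: skipped silently
            continue
        seen.add(key)
        stored += 1
    return stored, skipped, failed


def exit_code(total, failed):
    """Exit 1 only when every file failed (assumes at least one file)."""
    return 1 if total > 0 and failed == total else 0


def toy_parse(text):
    """Stand-in parser: accepts anything that looks like a <testsuite> document."""
    if not text.startswith("<testsuite"):
        raise ValueError("not a JUnit XML document")
    return text


files = {
    "a.xml": "<testsuite name='a'/>",
    "b.xml": "garbage",
    "c.xml": "<testsuite name='a'/>",  # same content as a.xml
}
counts = ingest_all(files, toy_parse, dedup_key=lambda report: report, seen=set())
print(counts, exit_code(len(files), counts[2]))  # (1, 1, 1) 0
```

One file is stored, one is skipped as a duplicate, one fails to parse, and the exit code stays 0 because at least one file succeeded.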

Option	Default	Description
`--project / -p`	required	Project name to track the run under
`--format / -f`	`text`	Output format: `text` or `json`
`--db`	`~/.testmind/testmind.db`	SQLite database file
`--limit / -n`	`30`	Max historical reports loaded for analysis

# First run: load a full archive to seed history
testmind ingest reports/history/*.xml --project payments-service

# Day-to-day: ingest the latest CI run
testmind ingest build/reports/TEST-suite.xml --project payments-service

# JSON output — useful in CI pipelines
testmind ingest reports/junit.xml --project auth-service --format json

# Project-scoped database
testmind ingest reports/*.xml --project orders --db ./data/orders.db

# Override DB via env var
TESTMIND_DB=./ci.db testmind ingest reports/junit.xml --project api

Example output for a bulk ingest:

Ingesting 5 reports for project 'payments-service'...
  [1/5] TEST-2024-01-01.xml           stored 'nightly-2024-01-01'  [87✓  3✗  2⊘  0!]
  [2/5] TEST-2024-01-02.xml           stored 'nightly-2024-01-02'  [90✓  0✗  2⊘  0!]
  [3/5] TEST-2024-01-03.xml           stored 'nightly-2024-01-03'  [88✓  2✗  2⊘  0!]
  [4/5] TEST-2024-01-04.xml           stored 'nightly-2024-01-04'  [91✓  0✗  1⊘  0!]
  [5/5] TEST-2024-01-05.xml           stored 'nightly-2024-01-05'  [85✓  5✗  2⊘  0!]

5 stored.

TestMind Report — project: payments-service
Run: nightly-2024-01-05  |  2024-01-05 10:00:00 UTC  |  Duration: 12.34s
...

`analyze` — re-run analysis on the latest run

testmind analyze <PROJECT> [OPTIONS]

Runs the full analysis pipeline against the most recent stored run without re-parsing anything. Useful when you want to re-inspect results after changing thresholds or after more history has accumulated.

Option	Default	Description
`--format / -f`	`text`	`text` or `json`
`--db`	`~/.testmind/testmind.db`	SQLite database file
`--limit / -n`	`30`	Max historical reports loaded

testmind analyze payments-service
testmind analyze payments-service --format json | jq '.flaky'

`projects` — list tracked projects

testmind projects [--db <path>]

Prints a table of all projects with their run count and the timestamp of the most recent run.

Project                                  Reports  Latest run
----------------------------------------------------------------------
auth-service                                  12  2024-06-15 09:45
orders-service                                 8  2024-06-14 22:10
payments-service                              31  2024-06-15 10:00

`history` — browse run history

testmind history <PROJECT> [--limit N] [--db <path>]

Prints a table of the stored runs for a project, newest first; --limit N caps the number of rows shown.

History for 'payments-service'  (showing 5 run(s))

Run                                  Timestamp               Pass   Fail   Skip    Err   Duration
--------------------------------------------------------------------------------------------------
nightly-2024-06-15                   2024-06-15 10:00:00       87      3      2      0     12.34s
nightly-2024-06-14                   2024-06-14 10:00:01       90      0      2      0     11.90s
nightly-2024-06-13                   2024-06-13 10:00:00       88      2      2      0     12.01s

testmind history payments-service --limit 5
testmind history payments-service --limit 100 --db ./archive.db

Output formats

Text (default)

The text report is structured in sections, printed only when there is something to show.

TestMind Report — project: payments-service
Run: nightly-2024-06-15  |  2024-06-15 10:00:00 UTC  |  Duration: 12.34s
────────────────────────────────────────────────────────────
OVERVIEW
  Total: 92   Passed: 87   Failed: 3   Skipped: 2   Errors: 0
  Pass rate: 94.6%   Fail rate: 3.3%

FLAKY TESTS  (2)
  test_process_refund                               flip=70.0%  fail=40.0%  runs=10
  test_currency_conversion                          flip=60.0%  fail=30.0%  runs=10

REGRESSIONS  (1)
  test_checkout_timeout                             ref_pass=100.0%  recent_fail=66.7%

STABILITY INDEX  (worst 10 of 87 tests)
  Test                                               Score  Pass    Consist  Flips
  test_process_refund                                 61.0  60.0%  95.0%   70.0%
  test_currency_conversion                            68.4  70.0%  92.0%   60.0%
  ...

FAILURE PREDICTIONS  (top 10 by risk)
  Test                                               Prob    Trend       Confidence
  test_checkout_timeout                              78.0%  degrading   55.0%
  test_process_refund                                45.0%  stable      50.0%
  ...

ISSUES: 2 flaky  |  1 regression(s)  |  0 spike(s)

A spike banner is injected at the top when a sudden suite-wide failure surge is detected:

  FAILURE SPIKE DETECTED
  Current fail rate : 48.0%
  Baseline          : 3.2% ± 1.1%
  Z-score           : 40.73

JSON

Pass --format json to get a machine-readable object. Useful for piping into jq, posting to Slack, or feeding downstream tools.

{
  "project": "payments-service",
  "report": {
    "id": "a3f9c...",
    "name": "nightly-2024-06-15",
    "timestamp": "2024-06-15T10:00:00+00:00",
    "duration": 12.34,
    "passed": 87,
    "failed": 3,
    "skipped": 2,
    "errors": 0,
    "total": 92,
    "pass_rate": 0.9457,
    "fail_rate": 0.0326
  },
  "issues": {
    "flaky_count": 2,
    "regression_count": 1,
    "spike_detected": false
  },
  "flaky": [
    {
      "test_name": "test_process_refund",
      "is_flaky": true,
      "flip_rate": 0.7,
      "pass_rate": 0.6,
      "fail_rate": 0.4,
      "run_count": 10,
      "insufficient_data": false
    }
  ],
  "regressions": [ ... ],
  "spike": null,
  "stability": [ ... ],
  "predictions": [ ... ]
}
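The JSON object can also be consumed from Python rather than jq. The sketch below assumes only the field names shown in the example above; the `sample` payload is a trimmed, illustrative copy, and `flaky_summary` and `ci_exit_code` are hypothetical helper names.

```python
import json

# Trimmed, illustrative payload using the field names from the example above.
sample = """
{
  "project": "payments-service",
  "issues": {"flaky_count": 1, "regression_count": 1, "spike_detected": false},
  "flaky": [
    {"test_name": "test_process_refund", "flip_rate": 0.7, "fail_rate": 0.4}
  ]
}
"""


def flaky_summary(summary):
    """Keep only the test name and flip rate of each flaky entry."""
    return [
        {"test": item["test_name"], "flip_rate": item["flip_rate"]}
        for item in summary.get("flaky", [])
    ]


def ci_exit_code(summary):
    """Return 1 when at least one regression was reported, else 0."""
    return 1 if summary["issues"]["regression_count"] > 0 else 0


summary = json.loads(sample)
flaky = flaky_summary(summary)
code = ci_exit_code(summary)
print(flaky, code)
```

In a pipeline, the payload would come from the command's standard output instead of a string literal.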

# Extract only flaky tests from a CI run
testmind ingest reports/junit.xml --project api --format json \
  | tail -n +2 \
  | jq '[.flaky[] | {test: .test_name, flip_rate: .flip_rate}]'

# Fail CI if regressions are detected
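COUNT=$(testmind analyze my-service --format json | jq '.issues.regression_count')
# Use an explicit if: a bare `[ ... ] && exit 1` leaves a non-zero status when COUNT is 0
if [ "$COUNT" -gt 0 ]; then exit 1; fi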

Python library usage

Every component is importable and composable independently.

Parse a report

from testmind.parsers.junit_parser import JUnitParser

parser = JUnitParser()
report = parser.parse("reports/junit.xml", project="my-service")

print(report.name, report.pass_rate, report.fail_rate)
for test in report.tests:
    print(test.name, test.status, test.duration)

Store and retrieve history

from testmind.storage.sqlite_store import SQLiteStore

store = SQLiteStore("~/.testmind/my-service.db")
store.save_report(report)

# All runs for a project, newest first
reports = store.get_reports("my-service", limit=20)

# Per-test history across runs: list[(datetime, TestResult)]
history = store.get_test_history("my-service", "test_checkout", limit=30)

store.close()

Run individual analysers

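from testmind.analysis.flaky import FlakyDetector
from testmind.analysis.regression import RegressionDetector, SpikeDetector
from testmind.analysis.stability import StabilityAnalyzer
from testmind.analysis.predictor import FailurePredictor
from testmind.storage.sqlite_store import SQLiteStore

# Open the store again if it was closed after the previous example
store = SQLiteStore("~/.testmind/my-service.db")
history = store.get_test_history("my-service", "test_checkout", limit=30)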

flaky   = FlakyDetector().analyze("test_checkout", history)
regr    = RegressionDetector().analyze("test_checkout", history)
stable  = StabilityAnalyzer().analyze("test_checkout", history)
pred    = FailurePredictor().analyze("test_checkout", history)

print(flaky.is_flaky, flaky.flip_rate)
print(regr.is_regression)
print(stable.score)         # 0–100
print(pred.failure_probability, pred.trend)

Generate a full summary

from testmind.reports.summary import Summarizer
from testmind.reports.formatters import TextFormatter, JsonFormatter

# Assumes `store` is an open SQLiteStore and the report is already saved in it
summarizer = Summarizer(history_limit=30)
summary = summarizer.summarize("my-service", store)

print(TextFormatter().format(summary))
print(JsonFormatter().format(summary))

Under the hood

Storage

All data is persisted in a SQLite database (stdlib sqlite3, no ORM). Two tables:

reports — one row per ingested run (name, project, timestamp, pass/fail/skip/error counts, duration)
test_results — one row per test case, linked to its report

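Reports are deduplicated by a SHA-256 hash derived from project name, duration, timestamp, and test count. Because the hash covers only these four fields, two reports that agree on all of them are treated as the same run, and re-ingesting an unchanged file is a no-op.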
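A fingerprint of this kind could be computed roughly as follows. The four input fields come from the description above; the field order, separator, number formatting, and the name `report_fingerprint` are assumptions made for illustration and may differ from what TestMind does.

```python
import hashlib
from datetime import datetime, timezone


def report_fingerprint(project, duration, timestamp, test_count):
    """SHA-256 over the four identifying fields (layout is an assumption)."""
    raw = "|".join(
        [project, f"{duration:.6f}", timestamp.isoformat(), str(test_count)]
    )
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


ts = datetime(2024, 6, 15, 10, 0, 0, tzinfo=timezone.utc)
first = report_fingerprint("payments-service", 12.34, ts, 92)
again = report_fingerprint("payments-service", 12.34, ts, 92)
other = report_fingerprint("payments-service", 12.34, ts, 93)

print(first == again)  # True: same fields, same fingerprint
print(first == other)  # False: test count differs
```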

Pattern detection

All analysers operate on the per-test history: a list of (timestamp, TestResult) pairs retrieved from the store. They require a minimum number of runs before drawing conclusions (insufficient_data=True is returned otherwise).

Flaky test

A test is flaky when it produces mixed results without a clear directional trend.

is_flaky = fail_rate ∈ (0.10, 0.90)   # not consistently passing or failing
         AND flip_rate > 0.15          # consecutive outcomes differ often

flip_rate = |{consecutive pairs that differ}| / (n - 1)

Default minimum: 5 runs.
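The rule above can be written out in a few lines. This is a sketch of the stated thresholds, not the `FlakyDetector` implementation: the function names are hypothetical, and outcomes are simplified to booleans (True means the run failed).

```python
def flip_rate(failed):
    """Share of consecutive run pairs whose outcomes differ."""
    if len(failed) < 2:
        return 0.0
    flips = sum(a != b for a, b in zip(failed, failed[1:]))
    return flips / (len(failed) - 1)


def is_flaky(failed, min_runs=5):
    """Return None when there is too little history, else True/False."""
    if len(failed) < min_runs:
        return None  # corresponds to insufficient_data
    fail_rate = sum(failed) / len(failed)
    return 0.10 < fail_rate < 0.90 and flip_rate(failed) > 0.15


# Mixed results with frequent flips: 4 failures in 10 runs, 7 flips in 9 pairs.
mixed = [False, True, False, False, True, False, True, False, False, True]
# A single step change: 5 passes then 5 failures, 1 flip in 9 pairs.
step = [False] * 5 + [True] * 5

print(is_flaky(mixed))  # True
print(is_flaky(step))   # False
```

The second series has a 50% fail rate but only one flip, so it reads as a directional change rather than flakiness.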

Regression

A test is a regression when it was stable and has recently broken.

reference window = all runs except the last 3
recent window    = last 3 runs

is_regression = reference_pass_rate >= 0.90   # was stable
              AND recent_fail_rate  >= 0.60   # now failing

Default minimum: 6 runs total.
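As a sketch, the two-window rule looks like this. The function name is hypothetical, outcomes are simplified to booleans (True means the run passed), and how skipped runs are counted is left out.

```python
def is_regression(passed, recent=3, min_runs=6):
    """passed: outcomes oldest first. Returns None with too little history."""
    if len(passed) < min_runs:
        return None
    reference, latest = passed[:-recent], passed[-recent:]
    reference_pass_rate = sum(reference) / len(reference)
    recent_fail_rate = latest.count(False) / len(latest)
    return reference_pass_rate >= 0.90 and recent_fail_rate >= 0.60


broken = [True] * 7 + [False, True, False]   # 2 of the last 3 runs fail
blip = [True] * 7 + [True, True, False]      # 1 of the last 3 runs fails

print(is_regression(broken))  # True
print(is_regression(blip))    # False
```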

Spike

A spike is a sudden suite-wide increase in failure rate in the latest run compared to the rolling baseline.

baseline        = fail_rate of all previous runs in the window
z_score         = (current_fail_rate - baseline_mean) / baseline_std

is_spike        = z_score >= 2.0 AND current_fail_rate > baseline_mean

Requires at least 3 baseline reports.
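A minimal sketch of the z-score check follows. The use of the population standard deviation and the handling of a zero-variance baseline are assumptions, since the text does not specify either; `detect_spike` is a hypothetical name.

```python
from statistics import mean, pstdev


def detect_spike(baseline_rates, current_rate, threshold=2.0, min_baseline=3):
    """Return None with too few baseline runs, else (is_spike, z_score)."""
    if len(baseline_rates) < min_baseline:
        return None
    mu = mean(baseline_rates)
    sigma = pstdev(baseline_rates)  # assumption: population standard deviation
    if sigma == 0:
        # Assumption: with a flat baseline, any increase counts as infinitely far out.
        z = float("inf") if current_rate > mu else 0.0
    else:
        z = (current_rate - mu) / sigma
    return (z >= threshold and current_rate > mu), z


baseline = [0.02, 0.03, 0.04, 0.03, 0.04]  # fail rates of earlier runs

print(detect_spike(baseline, 0.48)[0])   # True
print(detect_spike(baseline, 0.035)[0])  # False
```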

Stability index (0 – 100)

A composite score per test:

score = pass_rate            × 60
      + duration_consistency × 20
      + (1 − flip_rate)      × 20

duration_consistency = 1 − min(CV, 1)
  where CV = std(durations) / mean(durations)

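A perfectly stable test (always passes, consistent timing, never flips) scores 100. A consistently failing test with stable timing scores 40. A maximally flaky test, one that alternates on every run, has a pass rate near 0.5 and a flip rate of 1, so it scores between about 30 and 50 depending on how consistent its timing is.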
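The formula can be checked with a short sketch. The name `stability_score` is hypothetical, outcomes are simplified to booleans (True means passed), and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def stability_score(passed, durations):
    """Composite 0-100 score from pass rate, timing consistency, and flips."""
    pass_rate = sum(passed) / len(passed)
    flips = sum(a != b for a, b in zip(passed, passed[1:]))
    flip_rate = flips / (len(passed) - 1) if len(passed) > 1 else 0.0
    avg = mean(durations)
    cv = pstdev(durations) / avg if avg > 0 else 0.0  # coefficient of variation
    consistency = 1 - min(cv, 1)
    return pass_rate * 60 + consistency * 20 + (1 - flip_rate) * 20


steady = [1.2] * 10  # identical durations, so CV = 0

always_pass = stability_score([True] * 10, steady)
always_fail = stability_score([False] * 10, steady)
alternating = stability_score([True, False] * 5, steady)

print(always_pass, always_fail, alternating)  # 100.0 40.0 50.0
```

With identical durations, an alternating test keeps the full 20 timing points and half of the 60 pass-rate points, which gives 50 under the formula as written.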

Failure prediction

A lightweight trend model — no external dependencies, no ML framework.

1. Encode each run as 1.0 (fail/error) or 0.0 (pass/skip).
2. Fit an OLS linear regression on the sequence (index → outcome).
3. Predict next value = mean(last 3 outcomes) + slope.
4. Clamp to [0, 1].

slope > +0.05  → DEGRADING
slope < −0.05  → IMPROVING
otherwise      → STABLE

confidence = min(run_count / 20, 1.0)
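The four steps and the trend thresholds can be sketched with the standard library only. This is not the `FailurePredictor` source: the function names are hypothetical and outcomes are simplified to booleans (True means fail or error).

```python
def ols_slope(ys):
    """Least-squares slope of ys against their indices 0..n-1."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(ys))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den if den else 0.0


def predict_failure(failed):
    """Return (probability, trend, confidence) for the next run."""
    ys = [1.0 if f else 0.0 for f in failed]           # step 1: encode
    slope = ols_slope(ys)                              # step 2: fit
    recent = ys[-3:]
    raw = sum(recent) / len(recent) + slope            # step 3: predict
    prob = min(1.0, max(0.0, raw))                     # step 4: clamp
    if slope > 0.05:
        trend = "DEGRADING"
    elif slope < -0.05:
        trend = "IMPROVING"
    else:
        trend = "STABLE"
    confidence = min(len(ys) / 20, 1.0)
    return prob, trend, confidence


# Ten runs, failing on the 8th and 10th.
runs = [False] * 7 + [True, False, True]
prob, trend, confidence = predict_failure(runs)
print(round(prob, 3), trend, confidence)
```

For this series the slope is 7 / 82.5, roughly 0.085, so the trend is DEGRADING, and ten runs give a confidence of 0.5.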

Architecture

src/testmind/
├── domain/
│   └── models.py          TestResult, TestReport, TestStatus
├── parsers/
│   ├── base.py            Abstract ReportParser
│   └── junit_parser.py    JUnit XML parser
├── storage/
│   ├── base.py            Abstract Store
│   └── sqlite_store.py    SQLite implementation
├── analysis/
│   ├── models.py          Result dataclasses + Trend enum
│   ├── flaky.py           FlakyDetector
│   ├── regression.py      RegressionDetector, SpikeDetector
│   ├── stability.py       StabilityAnalyzer
│   └── predictor.py       FailurePredictor
├── reports/
│   ├── summary.py         RunSummary, Summarizer
│   └── formatters.py      TextFormatter, JsonFormatter
└── cli/
    └── app.py             Typer CLI (ingest, analyze, projects, history)

Contributing

Clone the repo and run the test suite:

git clone https://github.com/Slaaayer/testmind
cd testmind
uv sync
uv run pytest              # all 173 tests
uv run pytest --cov=src/testmind --cov-report=term-missing

Built with

This tool was built with the help of Claude — who wrote the tests, questioned every design decision, and occasionally suggested variable names that were suspiciously too good. The bugs are mine. The clean abstractions are probably Claude's.

Configuration reference

Env var	CLI flag	Default	Description
`TESTMIND_DB`	`--db`	`~/.testmind/testmind.db`	Path to the SQLite database
—	`--format`	`text`	Output format for `ingest` and `analyze`
—	`--limit`	`30`	Max historical reports loaded per analysis

Project details

Maintainers

Slaaayer

Release history

0.1.3: Mar 11, 2026
0.1.2: Mar 8, 2026
0.1.1: Mar 8, 2026 (this version)
0.1.0: Mar 7, 2026

Download files

Source Distribution

testmind-0.1.1.tar.gz (53.7 kB)

Uploaded Mar 8, 2026 Source

Built Distribution

testmind-0.1.1-py3-none-any.whl (28.1 kB)

Uploaded Mar 8, 2026 Python 3

File details

Details for the file testmind-0.1.1.tar.gz.

File metadata

Download URL: testmind-0.1.1.tar.gz
Upload date: Mar 8, 2026
Size: 53.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for testmind-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`cc482d78abf55534ff5dfbcdef0369ed57e6d46a4a5bf963c813b2f0fc166a39`
MD5	`56a87114394574315655fa8145900273`
BLAKE2b-256	`adfeb29fab0799e8358160c6c38fb1258a8bd4738c3b3147ddc202a98bc1945b`

Provenance

The following attestation bundles were made for testmind-0.1.1.tar.gz:

Publisher: publish.yml on Slaaayer/testmind

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: testmind-0.1.1.tar.gz
- Subject digest: cc482d78abf55534ff5dfbcdef0369ed57e6d46a4a5bf963c813b2f0fc166a39
- Sigstore transparency entry: 1058663091
- Sigstore integration time: Mar 8, 2026
Source repository:
- Permalink: Slaaayer/testmind@38f229a2211bef43db4afe116f6258e967d037c8
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Slaaayer
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@38f229a2211bef43db4afe116f6258e967d037c8
- Trigger Event: push

File details

Details for the file testmind-0.1.1-py3-none-any.whl.

File metadata

Download URL: testmind-0.1.1-py3-none-any.whl
Upload date: Mar 8, 2026
Size: 28.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for testmind-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`80ff40d1b3a077ad4d6a792bfeca4ef4a4338f4291fe7da8d4b0db5c534fc3fe`
MD5	`279f1cea7922144b01a7065e3222de60`
BLAKE2b-256	`bf5c8991d4d6f3571fca8d3e284d8d24b9541acec8b22b2966be6a64a0d9a52f`

Provenance

The following attestation bundles were made for testmind-0.1.1-py3-none-any.whl:

Publisher: publish.yml on Slaaayer/testmind

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: testmind-0.1.1-py3-none-any.whl
- Subject digest: 80ff40d1b3a077ad4d6a792bfeca4ef4a4338f4291fe7da8d4b0db5c534fc3fe
- Sigstore transparency entry: 1058663092
- Sigstore integration time: Mar 8, 2026
Source repository:
- Permalink: Slaaayer/testmind@38f229a2211bef43db4afe116f6258e967d037c8
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Slaaayer
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@38f229a2211bef43db4afe116f6258e967d037c8
- Trigger Event: push
