Budget-aware, crash-survivable, resumable test supervisor that drives pytest from the outside. Federation-wide sibling of mgf-common under the mgf.* namespace.
mgf-test-supervisor
A sibling of mgf-common under the mgf.* namespace. Federation-standard test infrastructure (TS-22..TS-27).
Conformance: L2 per mgf-common/docs/standards/. Tracker: docs/inprogress/MGF_STANDARDS_CONFORMANCE.md. Runtime LG / CF / OB rules are declined per the TS-23 stdlib-only constraint; the full rationale is in §3 of the tracker.
Why a supervisor (not a pytest plugin)
A pytest plugin lives inside pytest. If pytest crashes — interpreter deadlock, segfault in a C extension, a hook that infinite-loops — the plugin dies with it. The supervisor runs pytest as a subprocess. A pytest crash becomes just one more outcome to record.
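A minimal sketch of the out-of-process idea, not the supervisor's actual implementation: run a command as a child process, escalate from terminate to kill when it exceeds a wall-clock limit, and map whatever happened to an outcome label. The helper name `run_chunk`, the `passed` / `failed` labels, and the grace period are assumptions made for illustration.

```python
import subprocess
import sys


def run_chunk(cmd, wall_clock_s, grace_s=2.0):
    """Run cmd as a child process and classify how it ended.

    Hypothetical helper for illustration; not the supervisor's real API.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        returncode = proc.wait(timeout=wall_clock_s)
    except subprocess.TimeoutExpired:
        proc.terminate()  # polite request first (SIGTERM on POSIX)
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # forceful stop (SIGKILL on POSIX)
            proc.wait()
        return "hung"
    if returncode == 0:
        return "passed"
    if returncode < 0:
        # On POSIX a negative return code means the child died from a signal.
        return "crashed"
    return "failed"


if __name__ == "__main__":
    # In real use cmd would be something like [sys.executable, "-m", "pytest", ...].
    print(run_chunk([sys.executable, "-c", "raise SystemExit(1)"], wall_clock_s=10))
```

Because the parent only ever looks at the child's exit status, a segfault or a deadlock in the child is just another branch in the classification rather than something that stops the run.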
What it gives you:
- Hang detection. A per-test pytest-timeout (in-process) backed by an outer wall-clock SIGTERM → SIGKILL (out-of-process). Tests that genuinely hang are reported hung, not "still running."
- Crash isolation. One chunk's segfault doesn't take the run with it. Subsequent chunks run. The crashed chunk's in-flight test is recorded crashed.
- Budget-aware selection. Pass --budget 5m and the supervisor packs a greedy knapsack of tests under that wall-clock, biased to tier-0 + recently-failing tests.
- Resumability. A run interrupted by SIGINT (or by your laptop sleeping) can be resumed with --resume. Already-completed tests aren't repeated.
- Quarantine. Flaky tests auto-quarantine after 3-of-5 fails. They evict after 3 consecutive passes. No more "this test is flaky, ignore it" Slack pings.
- Per-failure diagnostic bundles. Each failure writes one self-contained .txt under .gates/last-run/failures/: pytest stdout/stderr, traceback, durations, environment fingerprint, daemon state (if configured). One file = one diagnostic.
- JSONL outcomes. .gates/last-run/run.jsonl is append-only and parseable. Read via mgf.common.testing.TestRecord.from_jsonl_line(...) or any JSON-aware tool.
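The quarantine rule in the list above can be sketched as a small state machine. This is one reading of "3-of-5 in, 3 consecutive out", assuming the window is the last five recorded outcomes for a single test. The class name, its attributes, and the choice to clear the window on eviction are illustrative assumptions, not the supervisor's code.

```python
from collections import deque


class QuarantineTracker:
    """Illustrative 3-of-5-in / 3-consecutive-out rule for a single test."""

    def __init__(self, window=5, fails_to_enter=3, passes_to_leave=3):
        self.recent = deque(maxlen=window)  # True = passed, False = failed
        self.fails_to_enter = fails_to_enter
        self.passes_to_leave = passes_to_leave
        self.pass_streak = 0
        self.quarantined = False

    def record(self, passed):
        """Record one outcome and return the updated quarantine status."""
        self.recent.append(passed)
        self.pass_streak = self.pass_streak + 1 if passed else 0
        if self.quarantined:
            if self.pass_streak >= self.passes_to_leave:
                self.quarantined = False
                self.recent.clear()  # assumption: start from a clean window
        elif list(self.recent).count(False) >= self.fails_to_enter:
            self.quarantined = True
        return self.quarantined
```

Feeding it fail, pass, fail, pass, fail puts the test into quarantine on the fifth outcome; three passes in a row afterwards release it.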
Install
uv add --dev mgf-test-supervisor # once published to PyPI
Until the package is published to PyPI, install from Codeberg:
uv add --dev "mgf-test-supervisor @ git+ssh://git@codeberg.org/magogi-admin/mgf-test-supervisor.git"
You also need pytest-timeout for the per-test timeout that backs the supervisor's outer wall-clock:
uv add --dev "pytest-timeout>=2.3,<3"
Adopting in your project
Four steps:
1. Install the dep (above) + pytest-timeout.
2. Add the pytest config to your pyproject.toml.
[tool.pytest.ini_options]
timeout = 30
timeout_method = "thread"
faulthandler_timeout = 60
markers = [
"smoke: minimal post-install sanity (fast; no fixtures); tier-0",
"e2e: end-to-end test exercising bootstrap + multiple subsystems; tier-3",
"integration: cross-module integration test (slower; may launch real subprocesses)",
"contract: wire-format + SDK-surface pins; tier-1",
"property: invariant-under-all-inputs test (Hypothesis); tier-1",
"regression: pin for a specific past bug; docstring cites the fixing commit; tier-1",
"fuzz: random / generated / hostile input — asserts safety properties; tier-1",
"concurrency: real-threads contention test; tier-2",
"conformance: design-promise tests (G/S guarantees); release-gated; tier-2",
"use_case: reproduces a consumer scenario from FEEDBACK paper or recipe doc; tier-2",
"perf: performance pin — pins a hot path's per-call cost (PF-* + TS-14). tier-3",
"stress: sustained high-load test; tier-3; nightly only",
"slow: legacy marker for tests >1s wall-clock (TS-18). Deprecated. Removed in v2.0.",
]
3. Add .gates/ to .gitignore (the supervisor's run state lives there).
.gates/
4. Verify.
mgf-test-supervisor --self-check # ~35s — canned pass/fail/hang/crash
mgf-test-supervisor --tier 0 # run your tier-0 smoke tests
If --self-check reports PASS, the supervisor + your project's config are wired correctly.
CLI
mgf-test-supervisor # run all tier 0-3 tests
mgf-test-supervisor --tier 0 # smoke only
mgf-test-supervisor --tier 0-2 # tiers 0, 1, 2
mgf-test-supervisor --budget 5m # greedy knapsack under 5min
mgf-test-supervisor --resume # resume the last interrupted run
mgf-test-supervisor --report # show last run's summary.md
mgf-test-supervisor --quarantine # show currently-quarantined tests
mgf-test-supervisor --self-check # canned scenarios — verifies the supervisor itself
mgf-test-supervisor --fail-fast # stop after first failing chunk
mtest is a short alias for the same binary.
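To make the --budget behaviour concrete, here is a rough sketch of a greedy selection under a wall-clock budget, assuming each test has an estimated duration and a "recently failed" flag. The `Candidate` fields, the accepted unit suffixes, and the exact ordering are assumptions; the supervisor's real scoring may differ.

```python
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    test_id: str
    tier: int
    est_seconds: float  # expected duration, e.g. taken from past runs
    recently_failed: bool


def parse_budget(text):
    """Turn '90s', '5m' or '1h' into seconds. Accepted units are an assumption."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised budget: {text!r}")
    return int(match.group(1)) * {"s": 1, "m": 60, "h": 3600}[match.group(2)]


def select_under_budget(candidates, budget_s):
    """Greedy pick: tier-0 first, then recent failures, then cheapest tests."""
    ordered = sorted(
        candidates,
        key=lambda c: (c.tier != 0, not c.recently_failed, c.est_seconds),
    )
    chosen, used = [], 0.0
    for cand in ordered:
        # Skip anything that would overshoot; later, cheaper tests may still fit.
        if used + cand.est_seconds <= budget_s:
            chosen.append(cand.test_id)
            used += cand.est_seconds
    return chosen
```

A greedy pass like this is not an optimal knapsack solution, but it is cheap and predictable, which tends to matter more for a pre-commit or CI time box.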
State files
Everything under .gates/:
.gates/
last-run/
run.jsonl — one TestRecord per test outcome (append-only, JSONL)
summary.md — human-readable summary
failures/
<test_id>.txt — per-failure diagnostic bundle (one file = one failure)
junit-NNNN.xml — raw pytest XML per chunk (debugging aid)
state.json — resume checkpoint (atomic write)
durations.json — per-test duration history + recency score
quarantine.json — quarantine state (3-of-5-in / 3-consec-out)
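The two write patterns named in the layout above, an append-only JSONL log and an atomically written checkpoint, can be sketched with the standard library alone. The function names and the record fields used in the usage note (`test_id`, `outcome`) are hypothetical; the real TestRecord schema lives in mgf.common.testing.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path, payload):
    """Write JSON so readers see either the old file or the new one, never half."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def append_record(path, record):
    """Append one JSON object as a single line (the JSONL convention)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_records(path):
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

For example, `append_record("run.jsonl", {"test_id": "t1", "outcome": "passed"})` followed by `read_records("run.jsonl")` returns a one-element list containing that dict. Writing the checkpoint through a temporary file means an interrupted run leaves the previous checkpoint intact, which is what makes resuming safe.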
Tier system
The federation 13-marker taxonomy maps onto 4 tiers (see TS-25 in mgf-common/docs/standards/TESTING.md §13.11). The --tier 0 selection runs only smoke-marked tests — the canonical "is the build broken?" signal.
| Tier | Wall-clock guideline | Markers |
|---|---|---|
| 0 | <30s total | smoke |
| 1 | <2min total | contract / property / regression / fuzz / integration |
| 2 | <5min total | concurrency / conformance / use_case |
| 3 | nightly / release | perf / stress / e2e / slow |
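The table can be read as a plain lookup from marker to tier. The sketch below copies that mapping and shows one way a tier spec such as 0 or 0-2 could be expanded; the names `MARKER_TO_TIER`, `parse_tier_spec`, and `markers_for` are illustrative and are not the library's own MARKER_TIER or CLI parser.

```python
# Marker-to-tier table copied from the text above; the dict name is illustrative.
MARKER_TO_TIER = {
    "smoke": 0,
    "contract": 1, "property": 1, "regression": 1, "fuzz": 1, "integration": 1,
    "concurrency": 2, "conformance": 2, "use_case": 2,
    "perf": 3, "stress": 3, "e2e": 3, "slow": 3,
}


def parse_tier_spec(spec):
    """Turn '0' or '0-2' into the set of tiers it names."""
    low, _, high = spec.partition("-")
    first = int(low)
    last = int(high) if high else first
    if not 0 <= first <= last <= 3:
        raise ValueError(f"tier spec out of range: {spec!r}")
    return set(range(first, last + 1))


def markers_for(spec):
    """List the markers whose tier falls inside the spec, in sorted order."""
    tiers = parse_tier_spec(spec)
    return sorted(m for m, t in MARKER_TO_TIER.items() if t in tiers)
```

With this mapping, `markers_for("0")` yields only the smoke marker, matching the description of --tier 0 as the "is the build broken?" signal.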
Federation alignment
mgf-test-supervisor is stdlib-only. It does not depend on mgf-common at runtime — per TS-23, the supervisor must run even when the project under test is broken.
The JSONL schema (TestRecord shape, MARKER_TIER mapping, SCHEMA_VERSION) is mirrored in mgf.common.testing for consumer-side parsing. Both surfaces are alignment-pinned by tests in mgf-common's tests/unit/testing/test_supervisor_alignment.py AND in mgf-test-supervisor's tests/unit/test_alignment_with_mgf_common.py. A field rename in one without the other fails CI on both.
License
MIT. See LICENSE.
Related federation libraries
- mgf-common: cornerstone (mgf.common.testing carries the consumer-readable schema)
- mgf-fastapi, mgf-django, mgf-sqlalchemy, mgf-alembic, mgf-http, mgf-apiprobe: the 6 framework-shaped siblings, all adopters