Budget-aware, crash-survivable, resumable test supervisor that drives pytest from the outside. Federation-wide sibling of mgf-common under the mgf.* namespace.

mgf-test-supervisor

A sibling of mgf-common under the mgf.* namespace. Federation-standard test infrastructure (TS-22..TS-27).

Why a supervisor (not a pytest plugin)

A pytest plugin lives inside pytest. If pytest crashes (an interpreter deadlock, a segfault in a C extension, a hook that loops forever), the plugin dies with it. The supervisor instead runs pytest as a subprocess, so a pytest crash becomes just one more outcome to record.
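To make the out-of-process idea concrete, here is a minimal sketch, not the supervisor's actual code. It assumes a POSIX system, and the function name, the grace period, and the outcome labels are hypothetical. It runs a command as a child process, escalates from SIGTERM to SIGKILL when a wall-clock limit expires, and treats death by signal as a crash.

```python
import subprocess
import sys

# Hypothetical base command; a real run would append test paths or node ids.
PYTEST_CMD = [sys.executable, "-m", "pytest"]


def run_supervised(cmd, wall_clock_s, grace_s=2.0):
    """Run cmd as a child process and classify how it ended.

    Returns "passed", "failed", "hung" or "crashed" (labels are assumptions).
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=wall_clock_s)
    except subprocess.TimeoutExpired:
        # Wall-clock limit exceeded: ask politely first (SIGTERM on POSIX).
        proc.terminate()
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: force it (SIGKILL).
            proc.kill()
            proc.wait()
        return "hung"
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from a signal,
        # for example a segfault in a C extension.
        return "crashed"
    return "passed" if proc.returncode == 0 else "failed"
```

A call such as `run_supervised(PYTEST_CMD + ["tests/"], wall_clock_s=300)` would then yield one outcome label per chunk, whatever happened inside the child.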

What it gives you:

  • Hang detection. A per-test pytest-timeout (in-process) backed by an outer wall-clock SIGTERM → SIGKILL (out-of-process). Tests that genuinely hang are reported hung, not "still running."
  • Crash isolation. One chunk's segfault doesn't take the run with it. Subsequent chunks run. The crashed chunk's in-flight test is recorded crashed.
  • Budget-aware selection. Pass --budget 5m and the supervisor greedily packs tests into that wall-clock budget (a greedy knapsack), favoring tier-0 and recently failing tests.
  • Resumability. A run interrupted by SIGINT (or by your laptop sleeping) can be resumed with --resume. Already-completed tests aren't repeated.
  • Quarantine. Flaky tests are quarantined automatically after 3 failures in 5 runs and released after 3 consecutive passes. No more "this test is flaky, ignore it" Slack pings.
  • Per-failure diagnostic bundles. Each failure writes one self-contained .txt under .gates/last-run/failures/ — pytest stdout/stderr, traceback, durations, environment fingerprint, daemon state (if configured). One file = one diagnostic.
  • JSONL outcomes. .gates/last-run/run.jsonl is append-only and parseable. Read via mgf.common.testing.TestRecord.from_jsonl_line(...) or any JSON-aware tool.
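Because run.jsonl holds one JSON object per line, it can be read with the standard library alone. The sketch below tallies records per outcome. The field name "outcome" is an assumption made for illustration; the real TestRecord schema may use different names.

```python
import json
from collections import Counter


def summarize_jsonl(lines):
    """Count records per outcome from an iterable of JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # "outcome" is an assumed field name, not confirmed by the schema.
        counts[record.get("outcome", "unknown")] += 1
    return counts
```

Typical use would be `with open(".gates/last-run/run.jsonl") as fh: print(summarize_jsonl(fh))`.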

Install

uv add --dev mgf-test-supervisor   # once published to PyPI

Until it is published to PyPI, install from Codeberg:

uv add --dev "mgf-test-supervisor @ git+ssh://git@codeberg.org/magogi-admin/mgf-test-supervisor.git"

You also need pytest-timeout. It provides the in-process per-test timeout, which the supervisor's outer wall-clock kill backs up:

uv add --dev "pytest-timeout>=2.3,<3"

Adopting in your project

Four steps:

1. Install the dependency (above) along with pytest-timeout.

2. Add the pytest config to your pyproject.toml.

[tool.pytest.ini_options]
timeout = 30
timeout_method = "thread"
faulthandler_timeout = 60
markers = [
    "smoke: minimal post-install sanity (fast; no fixtures); tier-0",
    "e2e: end-to-end test exercising bootstrap + multiple subsystems; tier-3",
    "integration: cross-module integration test (slower; may launch real subprocesses)",
    "contract: wire-format + SDK-surface pins; tier-1",
    "property: invariant-under-all-inputs test (Hypothesis); tier-1",
    "regression: pin for a specific past bug; docstring cites the fixing commit; tier-1",
    "fuzz: random / generated / hostile input — asserts safety properties; tier-1",
    "concurrency: real-threads contention test; tier-2",
    "conformance: design-promise tests (G/S guarantees); release-gated; tier-2",
    "use_case: reproduces a consumer scenario from FEEDBACK paper or recipe doc; tier-2",
    "perf: performance pin for a hot path's per-call cost (PF-* + TS-14); tier-3",
    "stress: sustained high-load test; tier-3; nightly only",
    "slow: legacy marker for tests >1s wall-clock (TS-18). Deprecated. Removed in v2.0.",
]

3. Add .gates/ to .gitignore (the supervisor's run state lives there).

.gates/

4. Verify.

mgf-test-supervisor --self-check   # ~35s — canned pass/fail/hang/crash
mgf-test-supervisor --tier 0       # run your tier-0 smoke tests

If --self-check reports PASS, the supervisor and your project's config are wired correctly.

CLI

mgf-test-supervisor                       # run all tier 0-3 tests
mgf-test-supervisor --tier 0              # smoke only
mgf-test-supervisor --tier 0-2            # tiers 0, 1, 2
mgf-test-supervisor --budget 5m           # greedy knapsack under 5min
mgf-test-supervisor --resume              # resume the last interrupted run
mgf-test-supervisor --report              # show last run's summary.md
mgf-test-supervisor --quarantine          # show currently-quarantined tests
mgf-test-supervisor --self-check          # canned scenarios — verifies the supervisor itself
mgf-test-supervisor --fail-fast           # stop after first failing chunk

mtest is a short alias for the same binary.
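The --budget option can be pictured as a two-step process: turn the budget string into seconds, then pick tests greedily until the budget is used up. The following is only a sketch of that idea under stated assumptions. Only the "5m" form appears in this README, so the "s" and "h" suffixes are guesses, and the priority number (lower runs first, for example tier-0 or recently failing tests) is a hypothetical stand-in for the supervisor's real bias.

```python
import re


def parse_budget(text):
    """Convert a budget such as "5m" into seconds ("s" and "h" are assumed)."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if not match:
        raise ValueError(f"unrecognised budget: {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * {"s": 1, "m": 60, "h": 3600}[unit]


def pack_greedy(tests, budget_s):
    """Pick tests in priority order while they still fit the budget.

    tests: iterable of (test_id, expected_seconds, priority) tuples,
    where a lower priority number is selected first.
    """
    chosen, remaining = [], budget_s
    ordered = sorted(tests, key=lambda t: (t[2], t[1]))
    for test_id, seconds, _priority in ordered:
        if seconds <= remaining:
            chosen.append(test_id)
            remaining -= seconds
    return chosen
```

A greedy pass like this is not an optimal knapsack solution, but it is cheap and predictable, which matters more when the goal is a fast feedback loop.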

State files

Everything under .gates/:

.gates/
  last-run/
    run.jsonl              — one TestRecord per test outcome (append-only, JSONL)
    summary.md             — human-readable summary
    failures/
      <test_id>.txt        — per-failure diagnostic bundle (one file = one failure)
    junit-NNNN.xml         — raw pytest XML per chunk (debugging aid)
  state.json               — resume checkpoint (atomic write)
  durations.json           — per-test duration history + recency score
  quarantine.json          — quarantine state (3-of-5-in / 3-consec-out)
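The "3-of-5-in / 3-consec-out" rule noted for quarantine.json can be sketched as a small state machine per test. This is an illustration of the rule as described, not the supervisor's implementation. The class name is hypothetical, and clearing the window on release is an assumption.

```python
from collections import deque


class QuarantineTracker:
    """Track one test's recent outcomes and its quarantine status."""

    def __init__(self):
        self.recent = deque(maxlen=5)  # last five outcomes, True means pass
        self.quarantined = False
        self.pass_streak = 0

    def record(self, passed):
        """Record one outcome and return the resulting quarantine status."""
        self.recent.append(passed)
        self.pass_streak = self.pass_streak + 1 if passed else 0
        if not self.quarantined:
            # Enter quarantine once 3 of the last 5 outcomes are failures.
            if list(self.recent).count(False) >= 3:
                self.quarantined = True
        elif self.pass_streak >= 3:
            # Leave quarantine after 3 consecutive passes.
            self.quarantined = False
            self.recent.clear()  # assumption: start again with a clean window
        return self.quarantined
```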

Tier system

The federation 13-marker taxonomy maps onto 4 tiers (see TS-25 in mgf-common/docs/standards/TESTING.md §13.11). The --tier 0 selection runs only smoke-marked tests — the canonical "is the build broken?" signal.

Tier   Wall-clock guideline   Markers
0      <30s total             smoke
1      <2min total            contract / property / regression / fuzz / integration
2      <5min total            concurrency / conformance / use_case
3      nightly / release      perf / stress / e2e / slow
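The table above can be written down as a plain mapping from marker to tier, together with a parser for tier specs such as "0" or "0-2". This is a sketch: the name MARKER_TIER comes from this README, but its shape as a dict and the two helper functions are assumptions.

```python
# Marker-to-tier mapping transcribed from the tier table.
MARKER_TIER = {
    "smoke": 0,
    "contract": 1, "property": 1, "regression": 1, "fuzz": 1, "integration": 1,
    "concurrency": 2, "conformance": 2, "use_case": 2,
    "perf": 3, "stress": 3, "e2e": 3, "slow": 3,
}


def parse_tier_spec(spec):
    """Expand "0" or "0-2" into the set of tier numbers it covers."""
    low, _, high = spec.partition("-")
    first = int(low)
    last = int(high) if high else first
    if not 0 <= first <= last <= 3:
        raise ValueError(f"invalid tier spec: {spec!r}")
    return set(range(first, last + 1))


def markers_for(spec):
    """Return the sorted marker names whose tier falls inside the spec."""
    tiers = parse_tier_spec(spec)
    return sorted(m for m, tier in MARKER_TIER.items() if tier in tiers)
```

For example, `markers_for("0")` selects only the smoke marker, matching the --tier 0 behaviour described above.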

Federation alignment

mgf-test-supervisor is stdlib-only. It does not depend on mgf-common at runtime — per TS-23, the supervisor must run even when the project under test is broken.

The JSONL schema (TestRecord shape, MARKER_TIER mapping, SCHEMA_VERSION) is mirrored in mgf.common.testing for consumer-side parsing. Both surfaces are alignment-pinned by tests in mgf-common's tests/unit/testing/test_supervisor_alignment.py and in mgf-test-supervisor's tests/unit/test_alignment_with_mgf_common.py. Renaming a field in one without the other fails CI in both repositories.
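An alignment pin of this kind can be as simple as comparing field names across the two definitions. The sketch below uses two hypothetical stand-in dataclasses with invented fields; the real tests live in the files named above and may work differently.

```python
from dataclasses import dataclass, fields


@dataclass
class SupervisorRecord:  # hypothetical stand-in for the supervisor's record
    test_id: str
    outcome: str
    duration_s: float


@dataclass
class ConsumerRecord:  # hypothetical stand-in for the mgf-common mirror
    test_id: str
    outcome: str
    duration_s: float


def field_names(cls):
    """Return a dataclass's field names in declaration order."""
    return [f.name for f in fields(cls)]


def test_record_shapes_align():
    # A rename on either side makes the two lists differ and fails the test.
    assert field_names(SupervisorRecord) == field_names(ConsumerRecord)
```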

License

MIT. See LICENSE.
