mgf-test-supervisor

Budget-aware, crash-survivable, resumable test supervisor that drives pytest from the outside.

A sibling of mgf-common under the mgf.* namespace. Federation-standard test infrastructure (TS-22..TS-27).

Conformance: L2 per mgf-common/docs/standards/. Tracker: docs/inprogress/MGF_STANDARDS_CONFORMANCE.md. Runtime LG / CF / OB rules declined per the TS-23 stdlib-only constraint — full rationale in §3 of the tracker.

Why a supervisor (not a pytest plugin)

A pytest plugin lives inside pytest. If pytest crashes — interpreter deadlock, segfault in a C extension, a hook that infinite-loops — the plugin dies with it. The supervisor runs pytest as a subprocess. A pytest crash becomes just one more outcome to record.
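As a rough illustration of the out-of-process idea above, the sketch below runs a command as a child process, enforces an outer wall-clock limit with SIGTERM followed by SIGKILL, and turns every way the child can end into a recorded outcome. The function name `run_chunk`, its parameters, the outcome labels, and the 5-second grace period are assumptions made for this example, not the supervisor's actual API. It also assumes POSIX signal semantics.

```python
import subprocess
import sys


def run_chunk(cmd, wall_clock_s, grace_s=5.0):
    """Run one command as a child process and classify how it ended.

    Hypothetical helper: names and outcome labels are illustrative only.
    """
    proc = subprocess.Popen(cmd)
    try:
        returncode = proc.wait(timeout=wall_clock_s)
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX: ask the child to exit
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX: the child ignored SIGTERM
            proc.wait()
        return "hung"
    if returncode < 0:
        # On POSIX a negative return code means the child died from a signal
        # (for example -11 for SIGSEGV).
        return "crashed"
    # Simplification: any non-zero exit is treated as a failure here.
    return "passed" if returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in for a real `python -m pytest ...` invocation.
    print(run_chunk([sys.executable, "-c", "raise SystemExit(1)"], wall_clock_s=30))
```

Because the parent only observes the child's exit status, a segfault or deadlock in the child cannot take the parent down with it.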

What it gives you:

  • Hang detection. A per-test pytest-timeout (in-process) backed by an outer wall-clock SIGTERM → SIGKILL (out-of-process). Tests that genuinely hang are reported hung, not "still running."
  • Crash isolation. One chunk's segfault doesn't take the run with it. Subsequent chunks run. The crashed chunk's in-flight test is recorded crashed.
  • Budget-aware selection. Pass --budget 5m and the supervisor greedily packs tests into that wall-clock budget, knapsack-style, biased toward tier-0 and recently failing tests.
  • Resumability. A run interrupted by SIGINT (or by your laptop sleeping) can be resumed with --resume. Already-completed tests aren't repeated.
  • Quarantine. Flaky tests auto-quarantine after 3-of-5 fails. They evict after 3 consecutive passes. No more "this test is flaky, ignore it" Slack pings.
  • Per-failure diagnostic bundles. Each failure writes one self-contained .txt under .gates/last-run/failures/ — pytest stdout/stderr, traceback, durations, environment fingerprint, daemon state (if configured). One file = one diagnostic.
  • JSONL outcomes. .gates/last-run/run.jsonl is append-only and parseable. Read via mgf.common.testing.TestRecord.from_jsonl_line(...) or any JSON-aware tool.
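The budget-aware selection described in the list above can be pictured with a small greedy sketch. Everything here is an assumption for illustration: the `Candidate` fields, the `select_under_budget` name, and the exact priority order (tier-0 first, then recently failing, then cheapest) are not taken from the supervisor's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    test_id: str
    est_seconds: float  # hypothetical: expected duration from past runs
    tier: int
    recency: float      # hypothetical: higher means it failed more recently


def select_under_budget(candidates, budget_s):
    """Greedily pick tests whose estimated durations fit within budget_s."""

    def priority(c):
        # Sort key: tier-0 tests first, then most recently failing,
        # then shortest estimated duration.
        return (c.tier != 0, -c.recency, c.est_seconds)

    chosen = []
    remaining = budget_s
    for c in sorted(candidates, key=priority):
        if c.est_seconds <= remaining:
            chosen.append(c.test_id)
            remaining -= c.est_seconds
    return chosen


if __name__ == "__main__":
    pool = [
        Candidate("test_slow", 250.0, 2, 0.0),
        Candidate("test_quick", 50.0, 1, 0.1),
        Candidate("test_flaky", 200.0, 1, 0.9),
        Candidate("test_smoke", 2.0, 0, 0.0),
    ]
    print(select_under_budget(pool, budget_s=300.0))
```

A greedy pass like this is not an optimal knapsack solution. It trades optimality for predictability: high-priority tests are always considered before cheaper, lower-priority ones.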

Install

uv add --dev mgf-test-supervisor   # once published to PyPI

Until PyPI publish, install from Codeberg:

uv add --dev "mgf-test-supervisor @ git+ssh://git@codeberg.org/magogi-admin/mgf-test-supervisor.git"

You also need pytest-timeout, which provides the in-process per-test timeout that the supervisor's outer wall-clock limit backs up:

uv add --dev "pytest-timeout>=2.3,<3"

Adopting in your project

Four steps:

1. Install the dependency (above) along with pytest-timeout.

2. Add the pytest config to your pyproject.toml.

[tool.pytest.ini_options]
timeout = 30
timeout_method = "thread"
faulthandler_timeout = 60
markers = [
    "smoke: minimal post-install sanity (fast; no fixtures); tier-0",
    "e2e: end-to-end test exercising bootstrap + multiple subsystems; tier-3",
    "integration: cross-module integration test (slower; may launch real subprocesses)",
    "contract: wire-format + SDK-surface pins; tier-1",
    "property: invariant-under-all-inputs test (Hypothesis); tier-1",
    "regression: pin for a specific past bug; docstring cites the fixing commit; tier-1",
    "fuzz: random / generated / hostile input — asserts safety properties; tier-1",
    "concurrency: real-threads contention test; tier-2",
    "conformance: design-promise tests (G/S guarantees); release-gated; tier-2",
    "use_case: reproduces a consumer scenario from FEEDBACK paper or recipe doc; tier-2",
    "perf: performance pin — pins a hot path's per-call cost (PF-* + TS-14). tier-3",
    "stress: sustained high-load test; tier-3; nightly only",
    "slow: legacy marker for tests >1s wall-clock (TS-18). Deprecated. Removed in v2.0.",
]

3. Add .gates/ to .gitignore (the supervisor's run state lives there).

.gates/

4. Verify.

mgf-test-supervisor --self-check   # ~35s — canned pass/fail/hang/crash
mgf-test-supervisor --tier 0       # run your tier-0 smoke tests

If --self-check reports PASS, the supervisor + your project's config are wired correctly.

CLI

mgf-test-supervisor                       # run all tier 0-3 tests
mgf-test-supervisor --tier 0              # smoke only
mgf-test-supervisor --tier 0-2            # tiers 0, 1, 2
mgf-test-supervisor --budget 5m           # greedy knapsack under 5min
mgf-test-supervisor --resume              # resume the last interrupted run
mgf-test-supervisor --report              # show last run's summary.md
mgf-test-supervisor --quarantine          # show currently-quarantined tests
mgf-test-supervisor --self-check          # canned scenarios — verifies the supervisor itself
mgf-test-supervisor --fail-fast           # stop after first failing chunk

mtest is a short alias for the same binary.

State files

Everything under .gates/:

.gates/
  last-run/
    run.jsonl              — one TestRecord per test outcome (append-only, JSONL)
    summary.md             — human-readable summary
    failures/
      <test_id>.txt        — per-failure diagnostic bundle (one file = one failure)
    junit-NNNN.xml         — raw pytest XML per chunk (debugging aid)
  state.json               — resume checkpoint (atomic write)
  durations.json           — per-test duration history + recency score
  quarantine.json          — quarantine state (3-of-5-in / 3-consec-out)
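The "3-of-5-in / 3-consec-out" rule noted for quarantine.json can be read as a small state machine. The sketch below is one possible interpretation, assuming a sliding window over the last five outcomes. The class name, method names, and in-memory representation are hypothetical and say nothing about the real file format.

```python
from collections import deque


class QuarantineTracker:
    """Illustrative tracker for a single test (hypothetical, not the real schema)."""

    def __init__(self):
        self.recent = deque(maxlen=5)  # last five outcomes, True = passed
        self.quarantined = False

    def record(self, passed):
        """Record one outcome and return the resulting quarantine state."""
        self.recent.append(passed)
        if not self.quarantined:
            # Enter quarantine once 3 of the last (up to) 5 runs failed.
            if list(self.recent).count(False) >= 3:
                self.quarantined = True
        else:
            # Leave quarantine after 3 consecutive passes.
            last_three = list(self.recent)[-3:]
            if len(last_three) == 3 and all(last_three):
                self.quarantined = False
        return self.quarantined


if __name__ == "__main__":
    tracker = QuarantineTracker()
    for outcome in (False, True, False, False, True, True, True):
        print(outcome, tracker.record(outcome))
```

Under this reading a test can enter quarantine before five runs have accumulated, as soon as three failures are on record. Whether the supervisor behaves the same way is an assumption.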

Tier system

The federation 13-marker taxonomy maps onto 4 tiers (see TS-25 in mgf-common/docs/standards/TESTING.md §13.11). The --tier 0 selection runs only smoke-marked tests — the canonical "is the build broken?" signal.

Tier  Wall-clock guideline  Markers
0     <30s total            smoke
1     <2min total           contract / property / regression / fuzz / integration
2     <5min total           concurrency / conformance / use_case
3     nightly / release     perf / stress / e2e / slow
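To make the marker-to-tier mapping concrete, here is a minimal sketch that encodes the table as a dictionary and expands a tier spec such as 0 or 0-2 into the markers it selects. The dictionary contents follow the table above. The name MARKER_TIER is borrowed from this README, but its real shape is not shown here, and `parse_tier_spec` and `markers_for` are hypothetical helpers.

```python
# Marker -> tier, transcribed from the tier table (the real MARKER_TIER may differ in shape).
MARKER_TIER = {
    "smoke": 0,
    "contract": 1, "property": 1, "regression": 1, "fuzz": 1, "integration": 1,
    "concurrency": 2, "conformance": 2, "use_case": 2,
    "perf": 3, "stress": 3, "e2e": 3, "slow": 3,
}


def parse_tier_spec(spec):
    """Turn '0' or '0-2' into the set of tiers it covers."""
    low_text, _, high_text = spec.partition("-")
    low = int(low_text)
    high = int(high_text) if high_text else low
    if not 0 <= low <= high <= 3:
        raise ValueError(f"invalid tier spec: {spec!r}")
    return set(range(low, high + 1))


def markers_for(spec):
    """List the markers selected by a tier spec, sorted alphabetically."""
    tiers = parse_tier_spec(spec)
    return sorted(marker for marker, tier in MARKER_TIER.items() if tier in tiers)


if __name__ == "__main__":
    print(markers_for("0"))
    print(markers_for("0-2"))
```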

Federation alignment

mgf-test-supervisor is stdlib-only. It does not depend on mgf-common at runtime — per TS-23, the supervisor must run even when the project under test is broken.

The JSONL schema (TestRecord shape, MARKER_TIER mapping, SCHEMA_VERSION) is mirrored in mgf.common.testing for consumer-side parsing. Both surfaces are alignment-pinned by tests in mgf-common's tests/unit/testing/test_supervisor_alignment.py AND in mgf-test-supervisor's tests/unit/test_alignment_with_mgf_common.py. A field rename in one without the other fails CI on both.
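For consumers that prefer not to import mgf.common.testing, run.jsonl can also be read with the standard library alone. The sketch below tallies outcomes per line. The field name "outcome" is an assumption made for this example; check the actual TestRecord schema before relying on it.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_outcomes(path):
    """Count records per outcome in a JSONL file (one JSON object per line)."""
    counts = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # "outcome" is a hypothetical field name used for illustration.
            counts[record.get("outcome", "unknown")] += 1
    return counts


if __name__ == "__main__":
    run_file = Path(".gates/last-run/run.jsonl")
    if run_file.exists():
        print(dict(summarize_outcomes(run_file)))
```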

License

MIT. See LICENSE.

Related federation libraries
