mgf-test-supervisor

Budget-aware, crash-survivable, resumable test supervisor that drives pytest from the outside.

A sibling of mgf-common under the mgf.* namespace. Federation-standard test infrastructure (TS-22..TS-27).

Conformance: L2 per mgf-common/docs/standards/. Tracker: docs/inprogress/MGF_STANDARDS_CONFORMANCE.md. Runtime LG / CF / OB rules declined per the TS-23 stdlib-only constraint — full rationale in §3 of the tracker.

Why a supervisor (not a pytest plugin)

A pytest plugin lives inside pytest. If pytest crashes — interpreter deadlock, segfault in a C extension, a hook that infinite-loops — the plugin dies with it. The supervisor runs pytest as a subprocess. A pytest crash becomes just one more outcome to record.
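A minimal sketch of the out-of-process idea, not the supervisor's actual code. The function name `run_chunk`, the `grace_s` parameter, and the outcome labels are illustrative assumptions. It relies on the POSIX convention that a child killed by a signal reports a negative return code, and it collapses every non-zero exit code into "failed" for brevity.

```python
import subprocess
import sys


def run_chunk(cmd, wall_clock_s, grace_s=2.0):
    """Run one child process and classify how it ended (hypothetical helper)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    try:
        proc.communicate(timeout=wall_clock_s)
    except subprocess.TimeoutExpired:
        # Outer wall-clock expired: ask politely first (SIGTERM on POSIX).
        proc.terminate()
        try:
            proc.communicate(timeout=grace_s)
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: force it (SIGKILL on POSIX).
            proc.kill()
            proc.communicate()
        return "hung"
    if proc.returncode < 0:
        # Negative return code on POSIX means the child died from a signal,
        # for example -11 for SIGSEGV.
        return "crashed"
    # Simplification: any non-zero exit code is treated as a failure.
    return "passed" if proc.returncode == 0 else "failed"
```

Because the parent only inspects the child's exit status, a segfault or deadlock in the child is just another return value: `run_chunk([sys.executable, "-m", "pytest", "tests/"], wall_clock_s=300)` would return a label either way.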

What it gives you:

  • Hang detection. A per-test pytest-timeout (in-process) backed by an outer wall-clock SIGTERM → SIGKILL (out-of-process). Tests that genuinely hang are reported hung, not "still running."
  • Crash isolation. One chunk's segfault doesn't take the run with it. Subsequent chunks run. The crashed chunk's in-flight test is recorded crashed.
  • Budget-aware selection. Pass --budget 5m and the supervisor packs a greedy knapsack of tests under that wall-clock budget, biased toward tier-0 and recently failing tests.
  • Resumability. A run interrupted by SIGINT (or by your laptop sleeping) can be resumed with --resume. Already-completed tests aren't repeated.
  • Quarantine. Flaky tests are auto-quarantined after 3 failures in 5 runs. They are released after 3 consecutive passes. No more "this test is flaky, ignore it" Slack pings.
  • Per-failure diagnostic bundles. Each failure writes one self-contained .txt under .gates/last-run/failures/ — pytest stdout/stderr, traceback, durations, environment fingerprint, daemon state (if configured). One file = one diagnostic.
  • JSONL outcomes. .gates/last-run/run.jsonl is append-only and parseable. Read via mgf.common.testing.TestRecord.from_jsonl_line(...) or any JSON-aware tool.
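The quarantine rule in the list above (3-of-5 in, 3 consecutive passes out) can be sketched as a small state machine. This is an illustration under assumptions, not the supervisor's implementation: the class name `QuarantineTracker` is hypothetical, "3-of-5" is read as "3 failures within the 5 most recent runs", and clearing the history on release is a design choice made here.

```python
from collections import deque


class QuarantineTracker:
    """Tracks one test's recent outcomes (hypothetical sketch)."""

    def __init__(self, window=5, fails_to_enter=3, passes_to_exit=3):
        self.history = deque(maxlen=window)  # True = pass, False = fail
        self.fails_to_enter = fails_to_enter
        self.passes_to_exit = passes_to_exit
        self.quarantined = False
        self.pass_streak = 0

    def record(self, passed):
        """Record one outcome and return the new quarantine status."""
        self.history.append(passed)
        self.pass_streak = self.pass_streak + 1 if passed else 0
        if self.quarantined:
            if self.pass_streak >= self.passes_to_exit:
                # Released: start again from a clean window (assumption).
                self.quarantined = False
                self.history.clear()
        elif list(self.history).count(False) >= self.fails_to_enter:
            self.quarantined = True
        return self.quarantined
```

Feeding it fail, pass, fail, pass, fail puts the test into quarantine on the fifth outcome; three passes in a row afterwards release it.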

Install

uv add --dev mgf-test-supervisor   # once published to PyPI

Until it is published to PyPI, install from Codeberg:

uv add --dev "mgf-test-supervisor @ git+ssh://git@codeberg.org/magogi-admin/mgf-test-supervisor.git"

You also need pytest-timeout, which provides the in-process per-test timeout that the supervisor's outer wall-clock limit backs up:

uv add --dev "pytest-timeout>=2.3,<3"

Adopting in your project

Four steps:

1. Install the dependency (above) and pytest-timeout.

2. Add the pytest config to your pyproject.toml.

[tool.pytest.ini_options]
timeout = 30
timeout_method = "thread"
faulthandler_timeout = 60
markers = [
    "smoke: minimal post-install sanity (fast; no fixtures); tier-0",
    "e2e: end-to-end test exercising bootstrap + multiple subsystems; tier-3",
    "integration: cross-module integration test (slower; may launch real subprocesses)",
    "contract: wire-format + SDK-surface pins; tier-1",
    "property: invariant-under-all-inputs test (Hypothesis); tier-1",
    "regression: pin for a specific past bug; docstring cites the fixing commit; tier-1",
    "fuzz: random / generated / hostile input — asserts safety properties; tier-1",
    "concurrency: real-threads contention test; tier-2",
    "conformance: design-promise tests (G/S guarantees); release-gated; tier-2",
    "use_case: reproduces a consumer scenario from FEEDBACK paper or recipe doc; tier-2",
    "perf: performance pin — pins a hot path's per-call cost (PF-* + TS-14). tier-3",
    "stress: sustained high-load test; tier-3; nightly only",
    "slow: legacy marker for tests >1s wall-clock (TS-18). Deprecated. Removed in v2.0.",
]

3. Add .gates/ to .gitignore (the supervisor's run state lives there).

.gates/

4. Verify.

mgf-test-supervisor --self-check   # ~35s — canned pass/fail/hang/crash
mgf-test-supervisor --tier 0       # run your tier-0 smoke tests

If --self-check reports PASS, the supervisor + your project's config are wired correctly.

CLI

mgf-test-supervisor                       # run all tier 0-3 tests
mgf-test-supervisor --tier 0              # smoke only
mgf-test-supervisor --tier 0-2            # tiers 0, 1, 2
mgf-test-supervisor --budget 5m           # greedy knapsack under 5min
mgf-test-supervisor --resume              # resume the last interrupted run
mgf-test-supervisor --report              # show last run's summary.md
mgf-test-supervisor --quarantine          # show currently-quarantined tests
mgf-test-supervisor --self-check          # canned scenarios — verifies the supervisor itself
mgf-test-supervisor --fail-fast           # stop after first failing chunk

mtest is a short alias for the same binary.
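To illustrate what --budget 5m implies, here is a hedged sketch of greedy packing under a wall-clock budget. The helpers `parse_budget` and `pack`, the tuple layout, and the exact priority order are assumptions made for the example; only the "5m" form appears in this README, so the "s" and "h" suffixes are guesses.

```python
import re


def parse_budget(text):
    """Convert a budget such as '5m' into seconds ('s' and 'h' are assumed)."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unrecognised budget: {text!r}")
    return int(match.group(1)) * {"s": 1, "m": 60, "h": 3600}[match.group(2)]


def pack(tests, budget_s):
    """Greedily pick tests that fit the budget.

    tests: iterable of (test_id, estimated_seconds, tier, recently_failed).
    Priority (assumed): tier 0 first, then recently failing, then shortest.
    """
    ordered = sorted(tests, key=lambda t: (t[2] != 0, not t[3], t[1]))
    chosen, used = [], 0.0
    for test_id, seconds, _tier, _failed in ordered:
        if used + seconds <= budget_s:
            chosen.append(test_id)
            used += seconds
    return chosen, used
```

A greedy pass like this is not an optimal knapsack solution, but it is cheap and predictable. The estimated durations would presumably come from a history such as durations.json.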

State files

Everything under .gates/:

.gates/
  last-run/
    run.jsonl              — one TestRecord per test outcome (append-only, JSONL)
    summary.md             — human-readable summary
    failures/
      <test_id>.txt        — per-failure diagnostic bundle (one file = one failure)
    junit-NNNN.xml         — raw pytest XML per chunk (debugging aid)
  state.json               — resume checkpoint (atomic write)
  durations.json           — per-test duration history + recency score
  quarantine.json          — quarantine state (3-of-5-in / 3-consec-out)
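The "atomic write" and "append-only JSONL" notes above correspond to two common stdlib patterns, sketched below. This is not the supervisor's code: the function names are hypothetical, and the `test_id` key is an assumed field name (the real TestRecord schema is not shown in this README).

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Write JSON atomically: temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target share a filesystem,
        # so readers see either the old file or the new one, never a partial.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def append_record(path, record):
    """Append one outcome as a single JSON line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def completed_ids(path):
    """Return ids already recorded, e.g. to skip them on resume."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        return {json.loads(line)["test_id"] for line in handle if line.strip()}
```

A resume step could then filter the planned tests with `completed_ids(".gates/last-run/run.jsonl")` before launching the next chunk.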

Tier system

The federation 13-marker taxonomy maps onto 4 tiers (see TS-25 in mgf-common/docs/standards/TESTING.md §13.11). The --tier 0 selection runs only smoke-marked tests — the canonical "is the build broken?" signal.

Tier  Wall-clock guideline  Markers
0     <30s total            smoke
1     <2min total           contract / property / regression / fuzz / integration
2     <5min total           concurrency / conformance / use_case
3     nightly / release     perf / stress / e2e / slow
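The tier table can be read as a plain marker-to-tier lookup. The sketch below transcribes it into a dict and parses --tier values such as 0 and 0-2. The name MARKER_TIER appears in this README, but the contents here are copied from the table rather than from the library, and `parse_tiers` and `markers_for` are hypothetical helpers.

```python
# Transcribed from the tier table; the library's real mapping may differ.
MARKER_TIER = {
    "smoke": 0,
    "contract": 1, "property": 1, "regression": 1, "fuzz": 1, "integration": 1,
    "concurrency": 2, "conformance": 2, "use_case": 2,
    "perf": 3, "stress": 3, "e2e": 3, "slow": 3,
}


def parse_tiers(spec):
    """Turn '0' into {0} and '0-2' into {0, 1, 2}."""
    low, _, high = spec.partition("-")
    first = int(low)
    last = int(high) if high else first
    if not 0 <= first <= last <= 3:
        raise ValueError(f"invalid tier spec: {spec!r}")
    return set(range(first, last + 1))


def markers_for(spec):
    """List the markers selected by a tier spec, sorted by name."""
    tiers = parse_tiers(spec)
    return sorted(m for m, tier in MARKER_TIER.items() if tier in tiers)
```

With this mapping, `markers_for("0")` yields only the smoke marker, which matches the description of --tier 0 above.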

Federation alignment

mgf-test-supervisor is stdlib-only. It does not depend on mgf-common at runtime — per TS-23, the supervisor must run even when the project under test is broken.

The JSONL schema (TestRecord shape, MARKER_TIER mapping, SCHEMA_VERSION) is mirrored in mgf.common.testing for consumer-side parsing. Both surfaces are alignment-pinned by tests in mgf-common's tests/unit/testing/test_supervisor_alignment.py AND in mgf-test-supervisor's tests/unit/test_alignment_with_mgf_common.py. A field rename in one without the other fails CI on both.

License

MIT. See LICENSE.
