
A reliability harness for scheduled/autonomous agent systems: persist state across runs, prove liveness not just freshness, fail loud, and scan for the failure class before it ships.


agentliveness

Autonomous AI systems fail silently — fresh output over a dead engine. Predict less, detect more.

A small, dependency-free reliability harness for scheduled / autonomous agent systems. It makes the failure mode that birthed it impossible to ship silently: state that looks alive but resets every run.

The failure it prevents

Under a scheduler (launchd, cron, k8s CronJob) every run is a fresh process. Any counter or accumulator held only in memory — initialized in __init__, never restored from disk — silently resets to its starting value on every run. The code reads correctly. In-process tests pass (one long-lived interpreter hides the bug). In production it is dead, and nothing tells you.

This is real: a network monitor's adaptive-cadence counter was always 0 at decision time because each scheduled process started fresh, so the adaptive behaviour never engaged — invisible for weeks behind green tests and a fresh timestamp.
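The failure can be reproduced in a few lines. The sketch below is illustrative only: the class body, the threshold of three quiet runs, and the scheduled_run helper are assumptions made for this example, not code from the monitor described above.

```python
class AdaptiveMonitor:
    """Hypothetical monitor whose counter lives only in memory."""

    def __init__(self):
        self.consecutive_quiet = 0  # seeded here, never restored from disk

    def observe(self, quiet: bool) -> None:
        # Grow the streak on a quiet observation, reset it otherwise.
        self.consecutive_quiet = self.consecutive_quiet + 1 if quiet else 0

    def should_slow_down(self) -> bool:
        # Adaptive cadence (assumed rule): back off after three quiet runs in a row.
        return self.consecutive_quiet >= 3


def scheduled_run() -> bool:
    """Stand-in for one scheduler invocation: a brand-new object per run."""
    monitor = AdaptiveMonitor()
    decision = monitor.should_slow_down()  # decided before this run's observation
    monitor.observe(quiet=True)
    return decision


# One long-lived object (what an in-process test exercises): the streak builds.
long_lived = AdaptiveMonitor()
for _ in range(5):
    long_lived.observe(quiet=True)
in_process_result = long_lived.should_slow_down()

# Five scheduled runs (what production does): the streak never builds.
scheduled_results = [scheduled_run() for _ in range(5)]
```

The long-lived object reaches the threshold, while every simulated scheduled run decides with a counter of 0.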

Phase 1 — PersistentState

Restart-safe state, durable by construction:

from agentliveness import PersistentState

st = PersistentState("~/.myengine-state.json", default={"runs": 0})
data = st.load()
data["runs"] += 1
st.save(data)        # atomic; survives crash and process death

  • Atomic writes (tmp + os.replace): a crash mid-save never leaves a torn file that loads as garbage but looks fine.
  • Versioned envelope + generated_at: schema drift fails loud; staleness is checkable.
  • Load-or-default: a missing or corrupt file recovers as "first run" instead of crashing the engine.
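The three properties above can be sketched with the standard library alone. This is not the package's implementation: the function names save_atomic and load_or_default, the envelope keys, and SCHEMA_VERSION are assumptions chosen for illustration.

```python
import json
import os
import tempfile
import time

SCHEMA_VERSION = 1  # hypothetical version tag for this sketch


def save_atomic(path: str, data: dict) -> None:
    """Write data inside a versioned envelope, then swap it into place."""
    path = os.path.expanduser(path)
    envelope = {"version": SCHEMA_VERSION, "generated_at": time.time(), "data": data}
    directory = os.path.dirname(path) or "."
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(envelope, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # readers see the old file or the new one, never a partial one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_or_default(path: str, default: dict) -> dict:
    """Return stored data, or a copy of default if the file is missing or corrupt."""
    path = os.path.expanduser(path)
    try:
        with open(path) as f:
            envelope = json.load(f)
    except (FileNotFoundError, ValueError):  # ValueError covers invalid JSON
        return dict(default)
    if not isinstance(envelope, dict) or "data" not in envelope:
        return dict(default)
    if envelope.get("version") != SCHEMA_VERSION:
        # Schema drift fails loud instead of being silently reinterpreted.
        raise ValueError(f"unexpected schema version: {envelope.get('version')!r}")
    return envelope["data"]
```

Writing to a temporary file and renaming it over the target means a crash leaves either the previous state or the new state on disk.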

The test that is the thesis

tests/test_restart.py does not re-instantiate an object in one interpreter — it spawns real subprocesses that increment, save, and exit, proving the counter accumulates 0→1→2→3 across genuine process death. A test that runs in-process would pass even with the bug; this one reproduces the production execution model.

pytest
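A subprocess-based restart test might look roughly like the sketch below. It is not the contents of tests/test_restart.py: it uses a plain JSON file instead of PersistentState so that it runs without the package installed, and the names CHILD, run_once, and counts_across_processes are hypothetical.

```python
import json
import os
import subprocess
import sys
import tempfile

# Child program: load the counter (or start at 0), increment, save, exit.
CHILD = """
import json, os, sys
path = sys.argv[1]
runs = 0
if os.path.exists(path):
    with open(path) as f:
        runs = json.load(f)["runs"]
with open(path, "w") as f:
    json.dump({"runs": runs + 1}, f)
"""


def run_once(state_path: str) -> None:
    """Run one 'scheduled' invocation as a real, separate process."""
    subprocess.run([sys.executable, "-c", CHILD, state_path], check=True)


def counts_across_processes(n: int) -> list:
    """Spawn n child processes in sequence and record the counter after each."""
    with tempfile.TemporaryDirectory() as tmp:
        state_path = os.path.join(tmp, "state.json")
        seen = []
        for _ in range(n):
            run_once(state_path)
            with open(state_path) as f:
                seen.append(json.load(f)["runs"])
        return seen


def test_counter_survives_process_death():
    # Each child starts with empty memory; only the file carries the count forward.
    assert counts_across_processes(3) == [1, 2, 3]
```

If the child kept its counter only in memory, every process would write 1 and the assertion would fail, which is the behaviour an in-process test cannot expose.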

Phase 2 — LivenessContract

Freshness asks "was this written recently?" Liveness asks "is the thing that writes it actually working?" — strictly stronger. A fresh file over a dead producer passes a freshness check and fails a liveness contract.

from agentliveness import LivenessContract

c = LivenessContract(
    path="~/.myengine.json",
    max_age_s=2 * 3600,
    producing=lambda payload: bool(payload.get("norms")),  # the 'actually working' signal
)
v = c.evaluate()
if not v.healthy:
    alert(v.reason)        # e.g. "fresh but EMPTY — producer emits nothing"

The contract bundles four invariants (exists, fresh, non-empty, producing) into one verdict. A freshness-only monitor calls a fresh-but-empty file healthy; the contract catches it. It is also warmup-honest: an unwarmed subsystem reports warming, not degraded, so first boot does not cry wolf. It reads either a PersistentState envelope or bare JSON, so the two primitives compose.
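As a rough illustration of how the four invariants could be checked in order, consider the sketch below. It is an assumption-laden stand-in, not LivenessContract itself: the Verdict dataclass and evaluate function are hypothetical, age is taken from the file's mtime rather than from an envelope timestamp, and the warming state is omitted.

```python
import json
import os
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    healthy: bool
    reason: str


def evaluate(path: str, max_age_s: float, producing: Callable[[dict], bool]) -> Verdict:
    """Check exists, fresh, non-empty, producing, in that order."""
    path = os.path.expanduser(path)
    if not os.path.exists(path):
        return Verdict(False, "missing: output file does not exist")
    # Assumption: freshness is judged from the file's modification time.
    age = time.time() - os.path.getmtime(path)
    if age > max_age_s:
        return Verdict(False, f"stale: last written {age:.0f}s ago")
    try:
        with open(path) as f:
            payload = json.load(f)
    except ValueError:
        return Verdict(False, "unreadable: file is not valid JSON")
    if not payload:
        return Verdict(False, "fresh but empty: producer emits nothing")
    if not isinstance(payload, dict) or not producing(payload):
        return Verdict(False, "fresh but not producing: working signal absent")
    return Verdict(True, "healthy")
```

A freshness-only monitor would stop after the second check; the last two checks are what separate liveness from freshness.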

Phase 3 — LoudFail

Detection without a response channel is an incident no one sees. LoudFail routes a verdict to sinks (log / macOS notification / exit code), but only on a state transition, so a scheduled check that is still down stays silent instead of training you to mute it. An exception raised by a sink (for example, a broken notifier) is swallowed, so a sink can never crash the run it is protecting.

from agentliveness import LoudFail, log_sink, notify_sink

lf = LoudFail("my-agent", "~/.my-agent-loudfail.json",
              sinks=[log_sink(), notify_sink()])
# `contract` is a LivenessContract, as in Phase 2.
lf.report(contract.evaluate())   # fires once on down-edge + once on recovery; never raises

Incident state is persisted (via PersistentState), so "new incident" is judged across scheduled processes — the three primitives compose into one harness.
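The edge-triggered behaviour can be sketched as follows. This is a simplified stand-in rather than LoudFail's code: the report function, its signature, and the one-key state file are assumptions. Unlike the library's report, this sketch lets an error while writing the state file propagate.

```python
import json
import os
from typing import Callable, Iterable

Sink = Callable[[str], None]


def report(state_path: str, healthy: bool, reason: str, sinks: Iterable[Sink]) -> bool:
    """Fire sinks only when health changes; return True if they were fired."""
    state_path = os.path.expanduser(state_path)
    try:
        with open(state_path) as f:
            was_healthy = bool(json.load(f)["healthy"])
    except (OSError, ValueError, KeyError, TypeError):
        was_healthy = True  # no usable history: assume healthy, so a first failure alerts

    if healthy == was_healthy:
        return False  # still up or still down: stay silent

    message = f"RECOVERED: {reason}" if healthy else f"DOWN: {reason}"
    for sink in sinks:
        try:
            sink(message)
        except Exception:
            pass  # a broken sink must not crash the run it protects

    # Persist the new state so the next process can detect the next transition.
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"healthy": healthy}, f)
    os.replace(tmp_path, state_path)
    return True
```

Because the previous state is read from disk, two consecutive failing runs in separate processes produce one alert, and the later recovery produces one more.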

Phase 4 — agentliveness audit

Catch the bug before it ships. A static scan (stdlib ast only) that flags the never-restored-accumulator class in any agent repo: an attribute seeded as an empty counter / list / dict / set in __init__, grown across calls, but never restored from disk — alive in a long-lived process, silently reset every run under a scheduler.

agentliveness audit path/to/agent           # scan a file or directory
agentliveness audit . --json                # machine-readable (for an empirical scan)
agentliveness audit . --exit-zero           # report only; don't fail CI

⚠ engine.py:3  AdaptiveMonitor.consecutive_quiet  (counter)
  self.consecutive_quiet is seeded as an empty counter in __init__ and
  accumulated (line 6), but never restored from persistence. Under a scheduler
  every run is a fresh process, so it resets to its seed value on every run.

The exit code is non-zero when findings exist, so the audit can gate CI; pass --exit-zero to report without failing. Findings are advisory: each one is a candidate to either restore (PersistentState) or confirm as intentionally per-run. The audit is also available as a Python API: from agentliveness import audit_path.
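To give a feel for what an ast-based scan of this kind involves, here is a deliberately crude sketch. It is not the audit's implementation, and find_unrestored is a hypothetical name. It only flags attributes seeded with 0 or an empty container in __init__ and grown with an augmented assignment elsewhere, and it does not attempt the "never restored from disk" check at all.

```python
import ast

EMPTY_CALLS = {"list", "dict", "set"}


def _is_empty_seed(node: ast.expr) -> bool:
    """True for 0, [], {}, or a no-argument call such as set() or list()."""
    if isinstance(node, ast.Constant):
        return not isinstance(node.value, bool) and node.value == 0
    if isinstance(node, ast.List):
        return not node.elts
    if isinstance(node, ast.Dict):
        return not node.keys
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return node.func.id in EMPTY_CALLS and not node.args and not node.keywords
    return False


def _self_attr(node: ast.AST):
    """Return 'x' for an expression of the form self.x, else None."""
    if (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
            and node.value.id == "self"):
        return node.attr
    return None


def find_unrestored(source: str) -> list:
    """Return (class, attribute, seed_line) for seeded-then-grown attributes."""
    findings = []
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        seeded, grown = {}, set()
        for method in cls.body:
            if not isinstance(method, ast.FunctionDef):
                continue
            for node in ast.walk(method):
                if method.name == "__init__" and isinstance(node, ast.Assign):
                    for target in node.targets:
                        name = _self_attr(target)
                        if name and _is_empty_seed(node.value):
                            seeded[name] = node.lineno
                elif method.name != "__init__" and isinstance(node, ast.AugAssign):
                    name = _self_attr(node.target)
                    if name:
                        grown.add(name)
        findings += [(cls.name, name, line) for name, line in seeded.items() if name in grown]
    return findings


SAMPLE = '''
class AdaptiveMonitor:
    def __init__(self):
        self.consecutive_quiet = 0
        self.name = "m"

    def observe(self):
        self.consecutive_quiet += 1
'''

sample_findings = find_unrestored(SAMPLE)
```

A real scanner would also need to recognise growth through method calls such as append or update, and to suppress findings where the attribute is reloaded from persistence.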

Roadmap

  • Phase 1: PersistentState + the subprocess restart test. ✅
  • Phase 2: LivenessContract — producing, not just fresh. ✅
  • Phase 3: LoudFail — fire once per incident, never crash the run. ✅
  • Phase 4: agentliveness audit — static scan for the failure class in any repo. ✅

MIT licensed.


Source Distribution

agentliveness-0.4.0.tar.gz (19.8 kB view details)

Uploaded Source

Built Distribution


agentliveness-0.4.0-py3-none-any.whl (16.0 kB view details)

Uploaded Python 3

File details

Details for the file agentliveness-0.4.0.tar.gz.

File metadata

  • Download URL: agentliveness-0.4.0.tar.gz
  • Upload date:
  • Size: 19.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentliveness-0.4.0.tar.gz
Algorithm Hash digest
SHA256 e449f60df3c22ddd7fc7b612b9b2bdf5535c9178a70ad092c0704dd9dd53cda1
MD5 60ac465cfd5e62b94375b4d731315e41
BLAKE2b-256 8910117422e6ae1caa70bef9593618eabc5105ae225f23ab0f8e34947ba995da


Provenance

The following attestation bundles were made for agentliveness-0.4.0.tar.gz:

Publisher: release.yml on anandsureshworks/agentliveness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file agentliveness-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: agentliveness-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 16.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentliveness-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2849c2937c384c8e1b36386bf20123a6ed0d796d7a822e8dc3b66ac3a5a0f524
MD5 10c60f0fe4b47682eaf51720baed4b8e
BLAKE2b-256 c738af182513374d84e780e2c7bb4be3c9087cb1603d1fbbc38fcec7969b9a92


Provenance

The following attestation bundles were made for agentliveness-0.4.0-py3-none-any.whl:

Publisher: release.yml on anandsureshworks/agentliveness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
