A reliability harness for scheduled/autonomous agent systems: persist state across runs, prove liveness not just freshness, fail loud, and scan for the failure class before it ships.
agentliveness
Autonomous AI systems fail silently — fresh output over a dead engine. Predict less, detect more.
A small, dependency-free reliability harness for scheduled / autonomous agent systems. It makes the failure mode that birthed it impossible to ship silently: state that looks alive but resets every run.
The failure it prevents
Under a scheduler (launchd, cron, k8s CronJob) every run is a fresh process.
Any counter or accumulator held only in memory — initialized in __init__, never
restored from disk — silently resets to its starting value on every run. The code
reads correctly. In-process tests pass (one long-lived interpreter hides the bug).
In production it is dead, and nothing tells you.
This is real: a network monitor's adaptive-cadence counter was always 0 at
decision time because each scheduled process started fresh, so the adaptive
behaviour never engaged — invisible for weeks behind green tests and a fresh
timestamp.
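The reset can be reproduced without a scheduler. This is a minimal sketch, not the monitor from the incident: the class body, the `run_once` method, and the threshold of 3 are hypothetical, and a fresh object per run stands in for a fresh process per run.

```python
class AdaptiveMonitor:
    """Hypothetical monitor whose counter lives only in memory."""

    def __init__(self):
        self.consecutive_quiet = 0  # seeded here, never restored from disk

    def run_once(self, quiet: bool) -> bool:
        # Decide using the counter's value as it stands at decision time.
        back_off = self.consecutive_quiet >= 3
        self.consecutive_quiet = self.consecutive_quiet + 1 if quiet else 0
        return back_off


# Long-lived process: one object survives across runs, so the counter grows.
monitor = AdaptiveMonitor()
long_lived = [monitor.run_once(quiet=True) for _ in range(5)]

# Scheduler model: every run builds a fresh object, as a new process would.
scheduled = [AdaptiveMonitor().run_once(quiet=True) for _ in range(5)]

print(long_lived)  # adaptive behaviour engages from the fourth run
print(scheduled)   # never engages: the counter is 0 at every decision
```

An in-process test exercises only the first pattern, which is why it stays green.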
Phase 1 — PersistentState
Restart-safe state, durable by construction:
from agentliveness import PersistentState
st = PersistentState("~/.myengine-state.json", default={"runs": 0})
data = st.load()
data["runs"] += 1
st.save(data) # atomic; survives crash and process death
- Atomic writes (tmp + os.replace): a crash mid-save never leaves a torn file that loads as garbage but looks fine.
- Versioned envelope + generated_at: schema drift fails loud; staleness is checkable.
- Load-or-default: a missing or corrupt file recovers as "first run" instead of crashing the engine.
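The three properties can be sketched with the standard library alone. This is an illustration under assumptions, not the code inside PersistentState: the function names, the envelope keys `version` and `data`, and the `SCHEMA_VERSION` value are hypothetical.

```python
import json
import os
import tempfile
import time
from pathlib import Path

SCHEMA_VERSION = 1  # hypothetical version number for this sketch


def save_state(path, data):
    """Write an envelope to a temp file, then atomically swap it into place."""
    target = Path(path).expanduser()
    envelope = {"version": SCHEMA_VERSION, "generated_at": time.time(), "data": data}
    # The temp file sits in the target's directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(envelope, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, target)  # readers see the old file or the new one, never half
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_state(path, default):
    """Return stored data, or a copy of the default if the file is missing or corrupt."""
    target = Path(path).expanduser()
    try:
        envelope = json.loads(target.read_text())
    except (FileNotFoundError, ValueError):
        return dict(default)  # missing or corrupt: behave like a first run
    if not isinstance(envelope, dict):
        return dict(default)
    if envelope.get("version") != SCHEMA_VERSION:
        # Schema drift is not silently coerced; it fails loud.
        raise ValueError(f"unsupported state version: {envelope.get('version')!r}")
    return envelope.get("data", dict(default))
```

Treating an unknown version as an error while treating an unreadable file as a first run is a design choice of this sketch; the library may draw that line differently.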
The test that is the thesis
tests/test_restart.py does not re-instantiate an object in one interpreter — it
spawns real subprocesses that increment, save, and exit, proving the counter
accumulates 0→1→2→3 across genuine process death. A test that runs in-process
would pass even with the bug; this one reproduces the production execution model.
pytest
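A restart test in this style might look like the following. It is a sketch of the technique, not the contents of tests/test_restart.py: the child script, the `run_restarts` helper, and the plain-JSON state file are assumptions, and the child uses bare `json` rather than PersistentState so the example stays self-contained.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Script each child process runs: load the counter, increment it, save, exit.
CHILD = """
import json, sys
from pathlib import Path
path = Path(sys.argv[1])
runs = json.loads(path.read_text())["runs"] if path.exists() else 0
path.write_text(json.dumps({"runs": runs + 1}))
"""


def run_restarts(state_path, n):
    """Spawn n separate interpreters and return the counter seen after each one."""
    seen = []
    for _ in range(n):
        # Each iteration is a genuinely new process, as under cron or launchd.
        subprocess.run([sys.executable, "-c", CHILD, str(state_path)], check=True)
        seen.append(json.loads(Path(state_path).read_text())["runs"])
    return seen


def test_counter_survives_process_death():
    with tempfile.TemporaryDirectory() as tmp:
        assert run_restarts(Path(tmp) / "state.json", 3) == [1, 2, 3]
```

If the child kept its counter only in memory, every run would write 1 and the assertion would fail, which is the point of leaving the interpreter between increments.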
Phase 2 — LivenessContract
Freshness asks "was this written recently?" Liveness asks "is the thing that writes it actually working?" — strictly stronger. A fresh file over a dead producer passes a freshness check and fails a liveness contract.
from agentliveness import LivenessContract

c = LivenessContract(
    path="~/.myengine.json",
    max_age_s=2 * 3600,
    producing=lambda payload: bool(payload.get("norms")),  # the 'actually working' signal
)
v = c.evaluate()
if not v.healthy:
    alert(v.reason)  # alert() is your own notifier; e.g. "fresh but EMPTY — producer emits nothing"
Bundles four invariants — exists · fresh · non-empty · producing — into one
verdict. A freshness-only monitor calls a fresh-but-empty file healthy; this
catches it. Warmup-honest: an unwarmed subsystem reports warming, not
degraded, so first boot does not cry wolf. Reads a PersistentState envelope
or bare JSON — the two primitives compose.
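The four invariants can be sketched as an ordered chain of checks. This is an assumption-laden illustration, not LivenessContract itself: the `Verdict` fields beyond `healthy` and `reason`, the reason strings, and the use of file mtime for freshness are hypothetical, and warmup handling and envelope parsing are left out.

```python
import json
import tempfile
import time
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Verdict:
    healthy: bool
    reason: str


def evaluate(path, max_age_s, producing, now=None):
    """Check exists, fresh, non-empty, producing, in that order."""
    target = Path(path).expanduser()
    if not target.exists():
        return Verdict(False, "missing")
    age = (time.time() if now is None else now) - target.stat().st_mtime
    if age > max_age_s:
        return Verdict(False, f"stale: {age:.0f}s old")
    try:
        payload = json.loads(target.read_text())
    except ValueError:
        return Verdict(False, "unreadable")
    if not payload:
        return Verdict(False, "fresh but empty")
    if not producing(payload):
        return Verdict(False, "fresh but not producing")
    return Verdict(True, "ok")


# Demo: a file that is fresh on disk while its producer emits nothing.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "engine.json"
    is_producing = lambda payload: bool(payload.get("norms"))
    out.write_text(json.dumps({"norms": []}))
    dead = evaluate(out, 2 * 3600, is_producing)
    out.write_text(json.dumps({"norms": [0.4]}))
    alive = evaluate(out, 2 * 3600, is_producing)
    stale = evaluate(out, 60, is_producing, now=time.time() + 10_000)
```

A freshness-only monitor would stop after the second check and report `dead` as healthy.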
Phase 3 — LoudFail
Detection without a response channel is an incident no one sees. LoudFail
routes a verdict to sinks (log / macOS notification / exit code) — but only on a
state transition, so a scheduled check that is still down stays silent
instead of training you to mute it. An exception raised by a sink (a broken
notifier, say) is swallowed, so it can never crash the run it is protecting.
from agentliveness import LoudFail, log_sink, notify_sink

lf = LoudFail("my-agent", "~/.my-agent-loudfail.json",
              sinks=[log_sink(), notify_sink()])
# contract is a LivenessContract like the one built in Phase 2
lf.report(contract.evaluate())  # fires once on down-edge + once on recovery; never raises
Incident state is persisted (via PersistentState), so "new incident" is judged
across scheduled processes — the three primitives compose into one harness.
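Edge-triggered reporting with isolated sinks can be sketched as below. This is not LoudFail's implementation: the `report` signature, the state-file layout, and the message prefixes are hypothetical, and the sketch only guards the sinks, not the state write.

```python
import json
import tempfile
from pathlib import Path


def report(state_path, healthy, reason, sinks):
    """Fire sinks only when health flips; return True if a transition fired."""
    path = Path(state_path).expanduser()
    try:
        was_healthy = json.loads(path.read_text())["healthy"]
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        was_healthy = True  # no history: assume healthy, so a first failure fires
    if healthy == was_healthy:
        return False  # still up or still down: stay silent
    # Persist the new state so the next scheduled process sees this incident.
    path.write_text(json.dumps({"healthy": healthy}))
    message = f"RECOVERED: {reason}" if healthy else f"DOWN: {reason}"
    for sink in sinks:
        try:
            sink(message)
        except Exception:
            pass  # a broken notifier must not crash the run it protects
    return True


def broken_sink(message):
    raise RuntimeError("notifier is broken")


# Demo: a check that goes down, stays down, then recovers.
received = []
with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "incident.json"
    sinks = [broken_sink, received.append]
    fired = [
        report(state, False, "producer emits nothing", sinks),  # down-edge
        report(state, False, "producer emits nothing", sinks),  # still down
        report(state, True, "producing again", sinks),          # recovery
    ]
```

Because the previous state is read from disk on every call, the "still down" case stays quiet even when each call happens in a different process.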
Phase 4 — agentliveness audit
Catch the bug before it ships. A static scan (stdlib ast only) that flags
the never-restored-accumulator class in any agent repo: an attribute seeded as an
empty counter / list / dict / set in __init__, grown across calls, but never
restored from disk — alive in a long-lived process, silently reset every run
under a scheduler.
agentliveness audit path/to/agent # scan a file or directory
agentliveness audit . --json # machine-readable (for an empirical scan)
agentliveness audit . --exit-zero # report only; don't fail CI
⚠ engine.py:3 AdaptiveMonitor.consecutive_quiet (counter)
self.consecutive_quiet is seeded as an empty counter in __init__ and
accumulated (line 6), but never restored from persistence. Under a scheduler
every run is a fresh process, so it resets to its seed value on every run.
Exit code is non-zero when findings exist, so it gates CI. It's advisory —
each finding is a candidate to either restore (PersistentState) or confirm
intentionally per-run. It is also a Python API: from agentliveness import audit_path.
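The core of such a scan fits in a few lines of `ast`. The sketch below is a deliberate simplification and not the audit's actual rules: `find_seeded_and_grown` is a hypothetical name, it only recognises growth through augmented assignment (`+=` and friends), and it makes no attempt to detect whether the attribute is restored from disk, so it over-reports compared with the behaviour described above.

```python
import ast

SAMPLE = '''
class AdaptiveMonitor:
    def __init__(self):
        self.consecutive_quiet = 0
        self.name = "monitor"

    def tick(self):
        self.consecutive_quiet += 1
'''


def _self_attr(node):
    """Return the attribute name if node is `self.<name>`, else None."""
    if (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
            and node.value.id == "self"):
        return node.attr
    return None


def find_seeded_and_grown(source):
    """Flag self.X assigned in __init__ and augmented in another method."""
    findings = []
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        seeded, grown = {}, {}
        for func in cls.body:
            if not isinstance(func, ast.FunctionDef):
                continue
            for node in ast.walk(func):
                if func.name == "__init__" and isinstance(node, ast.Assign):
                    for target in node.targets:
                        name = _self_attr(target)
                        if name:
                            seeded[name] = node.lineno
                elif func.name != "__init__" and isinstance(node, ast.AugAssign):
                    name = _self_attr(node.target)
                    if name:
                        grown[name] = node.lineno
        for name, seed_line in seeded.items():
            if name in grown:
                findings.append((cls.name, name, seed_line, grown[name]))
    return findings


findings = find_seeded_and_grown(SAMPLE)
```

Each tuple holds the class, the attribute, the line where it is seeded, and the line where it grows; `self.name` is not flagged because nothing accumulates into it.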
Roadmap
- Phase 1: PersistentState + the subprocess restart test. ✅
- Phase 2: LivenessContract (producing, not just fresh). ✅
- Phase 3: LoudFail (fire once per incident, never crash the run). ✅
- Phase 4: agentliveness audit (static scan for the failure class in any repo). ✅
MIT licensed.