morphsat

Constraint-control testbed for local LLM agents

Evidence-based commit control, novelty response, and auditable execution traces.

What is this?

MorphSAT is a testbed for studying when a local LLM agent should stop gathering evidence and commit to an action.

An LLM agent with tool access can loop indefinitely — calling tools, reading results, calling more tools — without ever deciding. MorphSAT layers constraint-control mechanisms around the agent's loop and measures their effect on accuracy, tool usage, and commit timing.

The current checkpoint (v7) compares an evidence-pressure controller against an anticipatory posture controller that treats novelty as an orienting state rather than a scalar penalty.

For cognitive architecture researchers: See docs/COGNITIVE_ARCHITECTURE_TRANSLATION.md for a term mapping between MorphSAT internal names and concepts from Soar, ACT-R, and active inference.

The proof chain

Seven versions tested on a 20-scenario security alert triage benchmark:

Version	Mechanism	Accuracy	Key finding
v1	Static FSA constraints	55%	0 useful interventions — too weak
v2	Fixed tool-call counter	67.5%	Any pressure helps
v3	Adaptive budget (2/3/5)	55–67.5%	Ceiling irrelevant, floor matters
v4	Evidence-pressure gate	65%	Best escalation (77.8%), best pre-v7
v5	+ pattern memory	62.5%	Mechanics work, accuracy fails — learned threats without tolerance
v6	+ bidirectional pressure	55%	Novelty-as-penalty is the wrong abstraction
v7	Anticipatory posture control	70%	Best accuracy, benign recovery 78.6% (was 35.7%)
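As a rough illustration of the simplest mechanism in the table (the v2 fixed tool-call counter), the sketch below forces a decision once a call budget is used up. The class name, method name, and budget value are hypothetical and are not taken from MorphSAT's source.

```python
from dataclasses import dataclass


@dataclass
class ToolCallBudget:
    """Hypothetical fixed-budget controller: force a decision after N tool calls."""

    max_calls: int = 3  # assumed value for illustration only
    calls: int = 0

    def record_call(self) -> str:
        """Count one tool call and return "CONTINUE" or "COMMIT"."""
        self.calls += 1
        return "COMMIT" if self.calls >= self.max_calls else "CONTINUE"


budget = ToolCallBudget(max_calls=3)
decisions = [budget.record_call() for _ in range(3)]
print(decisions)  # ['CONTINUE', 'CONTINUE', 'COMMIT']
```

A counter like this knows nothing about evidence quality, which is the gap the later versions in the table try to close.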

v7 result

Metric	Evidence-pressure (v4)	Anticipatory posture (v7)
Overall accuracy	62.5%	70.0%
Tool-loop rate	35.0%	25.0%
Avg turns to decision	5.4	4.8
Benign accuracy	35.7%	78.6%
Suspicious accuracy	75.0%	62.5%
Escalation accuracy	77.8%	66.7%

v7 fixes the tolerance problem (benign +42.9pp) at the cost of regressions in suspicious accuracy (-12.5pp) and escalation accuracy (-11.1pp). The tradeoff is real and not yet resolved.
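The percentage-point changes can be recomputed directly from the v7 result table. The snippet below copies the accuracy rows and subtracts them. The dictionary names are just for illustration.

```python
# Accuracy figures copied from the v7 result table (percent).
v4 = {"overall": 62.5, "benign": 35.7, "suspicious": 75.0, "escalation": 77.8}
v7 = {"overall": 70.0, "benign": 78.6, "suspicious": 62.5, "escalation": 66.7}

# Percentage-point change from v4 to v7, rounded to one decimal place.
deltas = {name: round(v7[name] - v4[name], 1) for name in v4}

for name, delta in deltas.items():
    print(f"{name:<11}{delta:+.1f}pp")
```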

Key insight: Novelty handling is a posture problem, not a threshold problem. When novelty was treated as a penalty (raise the commit threshold), the agent over-investigated benign scenarios and never learned tolerance. When novelty was treated as an orienting state (enter protective posture, gather bounded evidence, relax on safe evidence), benign recovery improved dramatically.
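The contrast can be made concrete with a toy model. This is a minimal sketch under stated assumptions, not MorphSAT's controller: the function, the class, the 0.3 relaxation cutoff, the decay factor, and the choice to escalate when the investigation budget runs out are all hypothetical.

```python
from dataclasses import dataclass


def penalty_threshold(base: float, novelty: float, weight: float = 0.5) -> float:
    """Novelty-as-penalty: a novel alert simply raises the commit threshold."""
    return base + weight * novelty


@dataclass
class PostureSketch:
    """Novelty-as-orienting-state: a bounded protective posture that can relax."""

    level: float = 0.0  # 0 = relaxed, 1 = fully protective
    budget: int = 0     # remaining investigation steps

    def orient(self, novelty: float, max_steps: int = 3) -> None:
        """Enter a protective posture sized by novelty, with a bounded budget."""
        self.level = novelty
        self.budget = max_steps

    def observe(self, safe: bool, decay: float = 0.5) -> str:
        """Consume one step; safe evidence decays the posture (tolerance)."""
        self.budget -= 1
        if safe:
            self.level *= decay
        if self.level < 0.3:   # assumed relaxation cutoff
            return "COMMIT_BENIGN"
        if self.budget <= 0:   # assumed fallback when the budget is spent
            return "ESCALATE"
        return "CONTINUE"


# Penalty view: novelty 0.8 pushes a 0.6 threshold to about 1.0, so the agent keeps digging.
raised = penalty_threshold(0.6, 0.8)

# Posture view: two pieces of safe evidence are enough to relax and commit.
posture = PostureSketch()
posture.orient(novelty=0.8)
steps = [posture.observe(safe=True), posture.observe(safe=True)]
print(round(raised, 2), steps)
```

In the penalty version nothing ever lowers the threshold again, while the posture version has an explicit path back to a relaxed state.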

Architecture

Layer 1: FSA lifecycle gate
         Legal task-state transitions. Blocks impossible sequences.

Layer 2: Evidence-pressure gate (v4)
         Sensor-driven commit timing. Se complexity threshold,
         evidence quality, sidecar confidence, urgency decay.
         Fires irreversibly when pressure crosses threshold.

Layer 3: Anticipatory posture controller (v7)
         Hidden state machine wrapping the agent's loop.
         Novelty → ORIENTING → bounded investigation → decide.
         Safe evidence decays protective posture (tolerance).
         Multi-axis pressure → escalation signal.

Layer 4: Dual-store memory
         Threat patterns and tolerance patterns stored separately.
         Familiarity with known configurations speeds future decisions.

Layer 5: Episodic traces
         Turn-by-turn audit records of state, evidence, posture,
         and outcomes. Every decision is reproducible.
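To illustrate the "fires irreversibly" behaviour described for Layer 2, here is a minimal latch-style sketch. How MorphSAT actually combines its sensors is not stated in this README, so the pressure formula, the per-turn urgency term, and every name below are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PressureGateSketch:
    """Hypothetical pressure accumulator with an irreversible commit latch."""

    threshold: float = 1.0
    urgency_per_turn: float = 0.1  # assumed: pressure added just for taking a turn
    pressure: float = 0.0
    fired: bool = False

    def update(self, evidence_quality: float, sidecar_confidence: float) -> bool:
        """Add one turn of pressure; return True once the gate has fired."""
        if self.fired:
            return True  # latched: later evidence cannot un-fire the gate
        self.pressure += evidence_quality * sidecar_confidence + self.urgency_per_turn
        if self.pressure >= self.threshold:
            self.fired = True
        return self.fired


gate_sketch = PressureGateSketch()
history = [
    gate_sketch.update(evidence_quality=0.5, sidecar_confidence=0.8),  # weak evidence
    gate_sketch.update(evidence_quality=0.9, sidecar_confidence=0.9),  # strong evidence
    gate_sketch.update(evidence_quality=0.0, sidecar_confidence=0.0),  # stays fired
]
print(history)  # [False, True, True]
```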

Shadow monitor states

NORMAL ──→ ORIENTING ──→ SAFE_DISTANCE ──→ NORMAL (safe recovery)
              │
              └──→ INVESTIGATING ──→ COMMIT_READY (clear evidence)
                        │            ESCALATE_READY (high threat)
                        │            ABSTAIN_READY (contradictory)
                        └──→ SWARM_CALL (multi-axis pressure)
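One way to read the diagram is as an adjacency table. The sketch below encodes the arrows as drawn above. The enum and helper are illustrative stand-ins, not the library's `ShadowState`, and the real monitor may allow transitions the diagram does not show.

```python
from enum import Enum, auto


class StateSketch(Enum):
    """State names taken from the diagram; the enum itself is hypothetical."""

    NORMAL = auto()
    ORIENTING = auto()
    SAFE_DISTANCE = auto()
    INVESTIGATING = auto()
    COMMIT_READY = auto()
    ESCALATE_READY = auto()
    ABSTAIN_READY = auto()
    SWARM_CALL = auto()


S = StateSketch

# Arrows as drawn in the diagram: source state -> set of reachable states.
ALLOWED = {
    S.NORMAL: {S.ORIENTING},
    S.ORIENTING: {S.SAFE_DISTANCE, S.INVESTIGATING},
    S.SAFE_DISTANCE: {S.NORMAL},
    S.INVESTIGATING: {S.COMMIT_READY, S.ESCALATE_READY, S.ABSTAIN_READY, S.SWARM_CALL},
}


def can_transition(src: StateSketch, dst: StateSketch) -> bool:
    """Return True if the diagram draws an arrow from src to dst."""
    return dst in ALLOWED.get(src, set())


print(can_transition(S.NORMAL, S.ORIENTING))     # True
print(can_transition(S.NORMAL, S.COMMIT_READY))  # False
```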

Install

pip install morphsat

Or from source:

git clone https://github.com/echo313unfolding/MorphSAT.git
cd MorphSAT
pip install -e ".[dev]"

Quick start

FSA lifecycle gate

from morphsat import MorphSATGate, TaskState, TaskEvent

gate = MorphSATGate()
state, legal, action = gate.step(TaskEvent.NEW_TASK)
assert state == TaskState.PLANNING
assert legal is True

Evidence-pressure gate

from morphsat import CommitGate, SplitMemoryStore

memory = SplitMemoryStore("/tmp/memory.json")
gate = CommitGate(memory=memory)
gate.initialize(alert_text="Suspicious process spawned by cron")

# Feed tool results
action = gate.process_evidence("check_process", "PID 1234: /usr/bin/curl ...")
# action.action is "CONTINUE", "COMMIT", or "ABSTAIN"
# action.direction is "escalate", "benign", or None
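In an agent loop, the gate is typically called once per tool result until it stops returning "CONTINUE". The driver below sketches that pattern against any object exposing `process_evidence(tool, output)`. The `run_until_decision` helper and the `StubGate` stand-in are hypothetical. The stub only mimics the `action` and `direction` fields described above so the example runs without MorphSAT installed.

```python
from types import SimpleNamespace


def run_until_decision(gate, tool_results, max_turns=10):
    """Feed (tool, output) pairs to a gate until it stops returning CONTINUE."""
    action = None
    for turn, (tool, output) in enumerate(tool_results, start=1):
        action = gate.process_evidence(tool, output)
        if action.action != "CONTINUE" or turn >= max_turns:
            break
    return action


class StubGate:
    """Hypothetical stand-in for a gate: commits on the second piece of evidence."""

    def __init__(self):
        self.seen = 0

    def process_evidence(self, tool, output):
        self.seen += 1
        if self.seen >= 2:
            return SimpleNamespace(action="COMMIT", direction="benign")
        return SimpleNamespace(action="CONTINUE", direction=None)


stub = StubGate()
results = [
    ("check_process", "PID 1234: /usr/bin/curl ..."),
    ("check_parent", "Parent: systemd"),
    ("check_hash", "never reached"),
]
final = run_until_decision(stub, results)
print(final.action, final.direction, stub.seen)  # COMMIT benign 2
```

The `max_turns` cap is a safety net on the caller's side, so the loop still terminates if a gate keeps answering "CONTINUE".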

Shadow monitor (v7)

from morphsat import ShadowMonitor, SplitMemoryStore

memory = SplitMemoryStore("/tmp/memory.json")
monitor = ShadowMonitor(memory=memory)
monitor.initialize(alert_text="Unknown binary in /tmp")

# Monitor enters ORIENTING if the alert is novel
print(monitor.state)  # ShadowState.ORIENTING

# Feed evidence — monitor transitions through states
action = monitor.process_evidence("check_hash", "Hash not in VirusTotal")
print(monitor.state)       # ShadowState.INVESTIGATING
print(action.action)       # "CONTINUE"

action = monitor.process_evidence("check_parent", "Parent: systemd")
print(monitor.state)       # ShadowState.COMMIT_READY
print(action.action)       # "COMMIT"
print(action.direction)    # "benign"

# Close episode — updates memory for next run
monitor.close_episode("benign", confidence=0.8)
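The idea of keeping threat and tolerance patterns in separate stores, persisted between runs, can be sketched as follows. This is not `SplitMemoryStore`: the class, its JSON layout, and the rule that a "benign" outcome counts toward tolerance while anything else counts toward threat are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


class DualStoreSketch:
    """Hypothetical two-store memory persisted as a single JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.data = {"threat": {}, "tolerance": {}}
        if self.path.exists():
            self.data = json.loads(self.path.read_text())

    def record(self, pattern: str, outcome: str) -> None:
        """Count a closed episode in the store that matches its outcome."""
        store = "tolerance" if outcome == "benign" else "threat"
        self.data[store][pattern] = self.data[store].get(pattern, 0) + 1
        self.path.write_text(json.dumps(self.data))

    def lookup(self, pattern: str) -> dict:
        """Return how often a pattern was seen in each store."""
        return {name: store.get(pattern, 0) for name, store in self.data.items()}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "memory.json"
    first_run = DualStoreSketch(path)
    first_run.record("binary in /tmp, parent systemd", "benign")
    first_run.record("binary in /tmp, parent systemd", "benign")

    # A later run reloads the file and sees the accumulated tolerance.
    second_run = DualStoreSketch(path)
    counts = second_run.lookup("binary in /tmp, parent systemd")

print(counts)  # {'threat': 0, 'tolerance': 2}
```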

Project structure

morphsat/
├── morphsat/
│   ├── __init__.py           # Public API
│   ├── core.py               # FSA gate, TaskState/TaskEvent, classify_event
│   ├── token.py              # Token adjacency scoring (4-lane structure)
│   ├── pressure_gate.py      # v4 evidence-pressure gate
│   ├── commit_gate.py        # v6 bidirectional commit gate + split memory
│   ├── shadow_monitor.py     # v7 anticipatory posture controller
│   └── receipt.py            # Receipt wrapping with SHA256 content hash
├── tests/
│   ├── test_core.py          # 31 tests: FSA structure, transitions, receipts
│   ├── test_token.py         # 22 tests: lane scoring, temperature, masking
│   └── test_shadow_monitor.py # 22 tests: v7 posture predictions
├── docs/
│   ├── PRESSURE_GATE_SPEC.md
│   └── COGNITIVE_ARCHITECTURE_TRANSLATION.md
├── receipts/
│   └── v7_shadow_monitor/    # Benchmark receipts (single-seed + 3-seed)
└── pyproject.toml

109/109 tests passing (Python 3.10).
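The tree above describes `receipt.py` as wrapping receipts with a SHA256 content hash. A minimal sketch of that general technique, using only the standard library, is shown below. The function names and receipt layout are hypothetical and may differ from what `receipt.py` does.

```python
import hashlib
import json


def wrap_receipt(payload: dict) -> dict:
    """Attach a SHA256 digest computed over a canonical JSON encoding."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {"payload": payload, "sha256": digest}


def verify_receipt(receipt: dict) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return wrap_receipt(receipt["payload"])["sha256"] == receipt["sha256"]


receipt = wrap_receipt({"scenario": 1, "decision": "benign", "turns": 4})
print(verify_receipt(receipt))  # True

# Changing the payload after the fact invalidates the stored hash.
receipt["payload"]["decision"] = "escalate"
print(verify_receipt(receipt))  # False
```

Sorting keys and fixing the separators keeps the hash stable across runs that build the same payload in a different key order.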

Caveats

N=20 scenario benchmark with simulated tool responses
Temperature=0 (deterministic) — no stochastic variance across seeds
Qwen2.5-Coder-3B doing security triage — small model, not its strongest domain
The shadow monitor is tested on one task type (alert triage)
This is a research testbed, not a production system

Companion projects

Project	Description
helix-substrate	Calibration-free neural network compression (HXQ).
sentinel-hybrid-stack	Hybrid SSM-Transformer security monitoring pipeline.
helix-codec	Standalone C99 tensor codec library.

License

MIT — see LICENSE.

Release history

Version	Released
0.4.0 (this version)	May 8, 2026
0.3.1	May 7, 2026
0.3.0	May 7, 2026

