An instrument for quantifying structural self-determination across systems.
autonometrics
Shadows of Self-Determination
We see shadows of autonomy, calibrated and reproducible. Whether the cave behind them holds one object or many, this tool does not decide.
Status: alpha — work in progress, API unstable.
What it is
autonometrics is a Python instrument that takes a discrete
trajectory — a cellular automaton, a Boolean network, an agent
log, a simulation, and as of v0.8.0a0 also a system that
publishes its own intended trajectory alongside its realised
one — and returns up to five normalised readings of how
self-determined its structure looks. Each reading comes from
a different scientific tradition; together they form a small
atlas of autonomy: a few charts that cover the same
territory from different operational angles.
It is a measurement tool, not a new theory of autonomy. The
package collects existing measures, normalises them to a
shared [0, 1] scale, and lets you compare points from very
different substrates in the same space.
The five axes
| Axis | Question it answers | Tradition | Reference |
|---|---|---|---|
| closure | How much of the system's information is generated from inside? | Information theory | Shannon (1948); Bertschinger et al. (2008); Albantakis (2021) |
| memory | How much of the system's predictability is carried by its own past? | Computational mechanics | Crutchfield & Young (1989); Feldman & Crutchfield (2002) |
| constraint | How tightly do the system's constraints enable each other? | Theoretical biology | Montévil & Mossio (2015) |
| persistence | How well does the system's structure resist a small perturbation? | Operational goal-directedness | Lee & McShea (2020) |
| coherence | How well does the system's executed trajectory follow its declared one? | Akrasia → cognitive dissonance → AI alignment | Festinger (1957); Sheeran (2002); Lanham (2023); Turpin (2023) |
All five readings live in [0, 1] and can be plotted,
correlated and compared across substrates that expose the
relevant capability. The first four require only a state /
environment trajectory; the fifth additionally requires a
parallel (declared, executed) pair, which only adapters
with an explicit declarative layer expose. Adapters that do
not implement that layer report coherence = None honestly,
in line with the same dropout policy already used for
constraint_closure (graph-only) and persistence
(replay-only).
What the project does not claim
The strongest possible reading — that the five autonomy
measures are the same quantity in different notations, the way
H_Shannon = S_stat / ln 2 collapses information-theoretic and
statistical entropy into one number — was falsified by the
second benchmark. The pairwise correlations observed across
v0.5.x — v0.7.x sit at +0.32 (closure-memory), -0.04
(closure-constraint), -0.57 (memory-constraint), -0.44
(closure-persistence), -0.38 (memory-persistence) and +0.05
(constraint-persistence): six pairs below saturation, six pieces
of evidence that we are not looking at one quantity. That is the
Level 1 reading and it is already dead.
The intermediate hypothesis — that there is one
multidimensional object and each axis is a different coordinate
of it, the way RGB and HSV are views of the same perceptual
colour space, or the Big Five personality dimensions are
factor-analytic coordinates of a non-scalar object — is what the
package implements and tests. The prediction is sharp: the
correlations should sit in a sweet spot, non-zero (they share an
object) and sub-saturating (they are not redundant). That is the
Level 2 reading, the one the v0.7.2a0 atlas-geometry
pre-registration put on trial.
The v0.8.0a0 cycle pulled the verdict downward. The fifth axis
(coherence, CBA / Theil's U) was added in the expectation that it
would triangulate the same underlying object from a fifth angle;
instead the data showed three things. First, the pairwise
correlations among the four prior axes remain inside |r| < 0.7
(the axes still carry distinct information). Second, the fifth
axis is empirically independent under causal control:
r(closure, coherence) falls from +0.97 to +0.48 when
PromisedCycle is driven by two independent sources of
variability rather than one, and r(coherence, p_env) ≈ 0
confirms the formula's predicted invariance to declarative-side
noise. Third, the full five-dimensional cloud does not exist on
the current zoo: n_valid_full = 0/645, because no adapter
exposes get_causal_graph and get_declared_executed
simultaneously.
The atlas therefore reads as a mosaic / archipelago of
overlapping four-axis sub-charts rather than a single
five-dimensional cloud. The underlying question — one
multidimensional object or many? — is pulled toward Level 3
by this cycle but not yet decided: structural geometry alone
cannot arbitrate it. Only the validation against behavioural
data, external to this repository, can — and that is deferred to
studies built on top of the LLM adapter shipped in v0.9.0a0.
So the package ships a measurement framework, a benchmark,
the dropouts, and a falsifiable working hypothesis — not a
definitive theory of autonomy. The full conceptual statement
and the falsification criteria live in
docs/PBA.md (English) and
docs/PBA.es.md (Spanish); per-axis design
notes in
docs/CONSTRAINT_CLOSURE.md,
docs/RAI.md and
docs/CBA.md; the release log in
CHANGELOG.md.
Swiss army knife or fruit salad? — Self-evaluation
A reasonable critic could ask: are these five axes one coherent instrument or a collection of unrelated metrics with a shared logo? This section answers honestly.
Why we believe it's a coherent instrument
- Unified mathematical shape: every axis lives in [0, 1] with the same "internal / total" form.
- Single protocol (AutonomySystem); any substrate enters through one door.
- Pre-registered falsification thresholds for every axis; the project is committed to being able to fail.
- Each axis is anchored in a published research tradition with at least one explicit reference. None is invented from scratch.
  - closure from Albantakis & Bertschinger (with Tononi's IIT lineage);
  - memory from Crutchfield's excess-entropy programme;
  - constraint from Montévil & Mossio;
  - persistence from Lee & McShea (with Deci & Ryan / SDT as the deferred behavioural reference);
  - coherence from Festinger's cognitive-dissonance tradition (with the Chain-of-Thought faithfulness literature in AI alignment as the contemporary application).
Where the critique has merit
- The five traditions behind the axes are genuinely different fields. Combining them is a methodological bet, not a settled fact.
- The v0.8.0a0 atlas-geometry verdict was "mosaic, not manifold" (n_valid_full = 0/645). The five axes do not jointly span a clean 5D cloud on the current zoo.
- rai_proxy_persistence is a structural proxy whose strong validation against behavioural RAI is deferred to external studies. Until that happens, the "RAI" label is provisional.
- Vocabulary like "PBA", "mosaic atlas" or "five-axis hole" is internal to this project and requires reading the docs to make sense.
- We do not yet have a peer-reviewed paper describing the package as a whole.
What this implies for the prospective user
If you work in:
- IIT / consciousness / structural autonomy → the axes you'll recognise (closure, constraint_closure) are well-implemented and well-grounded.
- AI alignment / agentic LLMs → coherence (CBA / Theil's U on declared vs executed trajectories) maps directly onto Chain-of-Thought faithfulness questions.
- Pure SDT motivational research → treat rai_proxy_persistence as a structural candidate pending validation; do not equate it with C-RAI yet.
- Cross-tradition synthesis → this is exactly the bet the package makes; you'll find the ingredients pre-assembled.
If none of the above fits your work, this package is probably not your tool — and we'd rather you know that in 90 seconds than after installing.
Installation
From PyPI (recommended)
pip install --pre autonometrics
For the optional plotting / benchmark dependencies:
pip install --pre "autonometrics[viz]"
The --pre flag is required while the package is in alpha. Once a non-alpha release is published, plain pip install autonometrics will be enough.
From source (for development)
git clone https://github.com/bugerchip/Autonometrics.git
cd Autonometrics
pip install -e ".[dev,viz]"
Requires Python 3.10 or later. The core package depends only on
numpy. The optional viz extra adds matplotlib, used by the
benchmark plotting scripts.
Quickstart
One-line measurement (recommended)
Since v0.8.1a0 the package exposes the canonical axis names
(closure, memory, constraint, persistence, coherence) and a
top-level measure() helper. The shortest possible end-to-end
measurement reads:
import autonometrics as anm
system = anm.PromisedCycle(length=600, period=4, alphabet=4, p_noise=0.1)
profile = anm.measure(system)
print(profile)
print(profile.to_dict())
print(profile.defined_axes())
anm.measure(system) defaults to all five canonical axes. Axes the
adapter does not support are reported as None (mosaic-dropout
policy) instead of aborting the measurement.
Asking for a subset of axes
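import autonometrics as anm
# Same system as in the one-line example above.
system = anm.PromisedCycle(length=600, period=4, alphabet=4, p_noise=0.1)
# Request only two of the five canonical axes.
profile = anm.measure(system, axes=["closure", "coherence"])
print(profile["closure"], profile["coherence"])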
The full list of canonical axis names lives in anm.AXES:
>>> anm.AXES
('closure', 'memory', 'constraint', 'persistence', 'coherence')
Measuring a synthetic automaton
import autonometrics as anm
system_a = anm.SimpleAutomaton.demo(mode="self_generated")
system_b = anm.SimpleAutomaton.demo(mode="external")
profile_a = anm.measure(system_a, axes=["closure", "memory"])
profile_b = anm.measure(system_b, axes=["closure", "memory"])
print(profile_a.closure, profile_a.memory)
print(profile_b.closure, profile_b.memory)
The verbose constructors SimpleAutomaton(...),
SimpleAutomaton.from_self_generated_rules(...) and
SimpleAutomaton.from_external_rules(...) are still available when
you need to supply your own environment array or non-default noise.
Measuring a CSV trajectory you already have
import autonometrics as anm
trajectory = anm.CSVTrajectory.from_file("my_log.csv") # header: state,env
profile = anm.measure(trajectory, axes=["closure", "memory"])
CSVTrajectory.from_path(...) remains supported as a legacy alias.
my_log.csv is a two-column file with discrete integer labels:
state,env
0,1
2,1
2,0
1,0
Measuring an LLM transcript
import autonometrics as anm
adapter = anm.LLMTranscriptAdapter.from_jsonl("session.jsonl")
profile = anm.measure(adapter)
session.jsonl follows the standard OpenAI / Anthropic
Messages format every chat-style provider already emits.
One JSON object per line, no proprietary schema:
{"role": "user", "content": "Read the file then summarize."}
{"role": "assistant", "reasoning": "I will read the file first.", "tool_calls": [{"type": "function", "function": {"name": "read_file"}}]}
{"role": "tool", "content": "<contents of file>"}
{"role": "assistant", "reasoning": "Now I summarize.", "content": "Summary: ..."}
The adapter is off-line: it consumes a recorded transcript.
It therefore enables three of the five axes and reports None
for the other two under the package's mosaic-dropout policy:
| Axis | Status | Reason |
|---|---|---|
| closure | ✓ | Reads state / env from the assistant turn stream. |
| memory | ✓ | Same; needs corpus length ≥ 500 turns. |
| coherence | ✓ | Reads (declared, executed) from reasoning and tool_calls. |
| constraint | None | A transcript does not expose the model's internal causal graph. |
| persistence | None | Off-line cannot replay the model from a perturbed state. |
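To make the reasoning / tool_calls reading concrete, here is a minimal standard-library sketch that pulls one (declared, executed) pair out of each assistant turn of the sample transcript. The helper name declared_executed_pairs is hypothetical and the extraction rule is only an illustration; the adapter's actual field-to-axis mapping and discretisation are specified in docs/LLM_TRANSCRIPT.md and may differ.

```python
import json

def declared_executed_pairs(lines):
    """Return (reasoning text, [tool names]) for every assistant turn.

    Hypothetical helper: illustrates the JSONL layout only, not the
    package's own parsing or discretisation.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if msg.get("role") != "assistant":
            continue
        # A turn with no tool call yields an empty "executed" list.
        tools = [call["function"]["name"] for call in msg.get("tool_calls", [])]
        pairs.append((msg.get("reasoning", ""), tools))
    return pairs

sample = [
    '{"role": "user", "content": "Read the file then summarize."}',
    '{"role": "assistant", "reasoning": "I will read the file first.", '
    '"tool_calls": [{"type": "function", "function": {"name": "read_file"}}]}',
    '{"role": "tool", "content": "<contents of file>"}',
    '{"role": "assistant", "reasoning": "Now I summarize.", "content": "Summary: ..."}',
]
pairs = declared_executed_pairs(sample)
```

On the four sample lines this yields two pairs, one per assistant turn; user and tool turns are skipped.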
For a live-API counterpart that additionally enables
persistence (single-token / single-message perturbations
replayed through the live endpoint), see the planned
LLMLiveAdapter in a later alpha. The full design contract
of the off-line adapter — input schema, field-to-axis
mapping, discretisation policy, multi-session handling and
validation boundary — lives in
docs/LLM_TRANSCRIPT.md.
Running the bundled demos
python examples/automaton_demo.py # clockwork vs mixed vs noise-driven automata
python examples/csv_demo.py # round-trip through a CSV file
Minimum trajectory length per axis
Each axis ships with a hard floor below which the underlying estimator
refuses to run, and a soft floor below which the estimator is
mathematically valid but statistically noisy. Use these as a sizing
guide when generating synthetic trajectories or recording experimental
ones; the convenience factories (PromisedCycle.simple(),
SimpleAutomaton.demo()) already pick defaults that clear every floor.
| Axis | Hard floor | Soft floor (recommended) | Notes |
|---|---|---|---|
| closure | 2 | ~200 | Estimator works on any 2-step transition; ~200 stabilises the conditional MI. |
| memory | 500 | 1000+ | Hard limit; raises ValueError below 500. Crutchfield excess-entropy needs the block-saturation regime. |
| constraint | n/a | n/a | Reads the causal graph, not the trajectory. No timestep requirement. |
| persistence | horizon + 2 (66 with defaults) | 200+ | Soft floor scales with the chosen replay horizon; defaults assume horizon = 64, n_perturbations = 32. |
| coherence | 2 | ~100 | Below 100 the Theil-U estimator emits a low-sample-size warning but still returns a value. |
When in doubt, generate at least 600 timesteps: that clears every
hard floor and matches the length=600 default of
PromisedCycle.simple().
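As a sizing aid, the hard floors in the table can be turned into a small pre-flight check. This is a sketch that only encodes the table; runnable_axes is a hypothetical helper, not part of the package, and it checks trajectory length alone (persistence still needs a replayable adapter and coherence a declared layer).

```python
def runnable_axes(n_steps, horizon=64, has_graph=False):
    """List the axes whose hard floor a trajectory of n_steps clears.

    Hypothetical helper mirroring the hard-floor column above; it says
    nothing about adapter capabilities other than the causal graph.
    """
    hard_floor = {
        "closure": 2,
        "memory": 500,
        "persistence": horizon + 2,   # 66 with the default horizon of 64
        "coherence": 2,
    }
    axes = [axis for axis, floor in hard_floor.items() if n_steps >= floor]
    if has_graph:
        # constraint reads the causal graph, so it has no timestep floor.
        axes.append("constraint")
    return axes

print(runnable_axes(600))   # clears every hard floor
print(runnable_axes(100))   # memory drops out below 500 steps
```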
Metrics
Three metrics ship in the current alpha. All three follow the PBA
internal over total shape, all three live in [0.0, 1.0], and all
three are exposed as pure numpy functions wired into Autonometer:
ratio_endo_total — Albantakis / Bertschinger closure
Normalised conditional mutual information of the system's next state on its own past, controlling for the environment:
$$A \;=\; \frac{I(S_{t+1};\,S_t \mid E_t)}{H(S_{t+1} \mid E_t)}$$
- A = 0: the next state, given the environment, is independent of the system's own previous state (no closure, pure drift).
- A = 1: the next state, given the environment, is fully determined by the system's own previous state (closed dynamics).
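The ratio can be sketched with a plug-in (frequency-count) estimator, using the identity I(S_{t+1}; S_t | E_t) = H(S_{t+1} | E_t) − H(S_{t+1} | S_t, E_t). The function names below are hypothetical and the package's ratio_endo_total may use a different estimator or bias correction; treat this as an illustration of the formula only.

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def cond_entropy(xs, ys):
    """Plug-in H(X | Y) = H(X, Y) - H(Y) from paired samples."""
    return entropy(Counter(zip(xs, ys))) - entropy(Counter(ys))

def closure_ratio(states, env):
    """A = I(S_{t+1}; S_t | E_t) / H(S_{t+1} | E_t), or None if undefined."""
    nxt, cur, e = states[1:], states[:-1], env[:-1]
    h_given_e = cond_entropy(nxt, e)
    if h_given_e <= 1e-12:
        return None  # denominator is zero: the ratio is undefined
    h_given_se = cond_entropy(nxt, list(zip(cur, e)))
    return (h_given_e - h_given_se) / h_given_e

# A period-4 cycle under a constant environment is fully self-determined.
cycle = [t % 4 for t in range(400)]
a_cycle = closure_ratio(cycle, [0] * 400)

# A system that copies the environment has H(S_{t+1} | E_t) = 0.
env = [t % 2 for t in range(400)]
a_copy = closure_ratio([0] + env[:-1], env)
```

The first call sits at the A = 1 boundary; the second returns None because the denominator vanishes.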
memory_endo_ratio — distributed structural memory
Fraction of the structural memory present in the joint
(system, environment) trajectory that is carried by the system
itself, computed via Crutchfield's excess entropy on each component
and then normalised:
$$M \;=\; \frac{E(\text{states})}{E(\text{states}) + E(\text{env})}$$
with E(.) estimated via block-entropy saturation
E = H(L) - L · h_μ, where h_μ = H(L) - H(L-1). The working block
length is capped by a Grassberger-style rule so every possible block
gets about ten samples on average.
- M = 0: the joint memory lives entirely in the environment (or, by convention, neither sequence carries memory at all).
- M = 1: the joint memory lives entirely in the system.
- M ≈ 0.5: memory is shared roughly equally between system and environment.
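A minimal sketch of the ratio, using the block-entropy formulas quoted above (E = H(L) − L · h_μ with h_μ = H(L) − H(L−1)). Two things here are assumptions rather than the package's behaviour: the block length L is passed in as a fixed parameter instead of being chosen by the Grassberger-style cap, and negative finite-sample estimates of E are clamped to zero.

```python
from collections import Counter
from math import log2

def block_entropy(seq, L):
    """Plug-in entropy (bits) of the length-L blocks of seq; H(0) = 0."""
    if L == 0:
        return 0.0
    blocks = Counter(tuple(seq[i:i + L]) for i in range(len(seq) - L + 1))
    total = sum(blocks.values())
    return -sum(c / total * log2(c / total) for c in blocks.values())

def excess_entropy(seq, L):
    """E = H(L) - L * h_mu with h_mu = H(L) - H(L - 1), clamped at 0."""
    h_L = block_entropy(seq, L)
    h_mu = h_L - block_entropy(seq, L - 1)
    return max(0.0, h_L - L * h_mu)

def memory_ratio(states, env, L=3):
    """M = E(states) / (E(states) + E(env)); 0.0 when both are memoryless."""
    e_s, e_e = excess_entropy(states, L), excess_entropy(env, L)
    total = e_s + e_e
    return 0.0 if total == 0 else e_s / total

periodic = [0, 1] * 300      # one bit of phase memory
constant = [0] * 600         # no memory at all

m_system_only = memory_ratio(periodic, constant)
m_shared = memory_ratio(periodic, periodic)
m_neither = memory_ratio(constant, constant)
```

The three calls land on the three reference points in the list above: memory entirely in the system, memory shared equally, and the by-convention zero when neither sequence carries memory.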
This replaces the absolute-bit structural_memory shipped in
v0.3.x. Returning a magnitude in bits broke the unifying ratio
shape of the package; memory_endo_ratio recovers PBA coherence by
applying the same excess-entropy estimator to both components and
returning the fraction carried by the system.
constraint_closure — Montévil & Mossio-style organisational closure
Fraction of the system's update functions (constraints) that lie on at least one simple directed cycle of length 2 or 3 in the causal-dependency graph. The metric reads only the topology of the graph: it is deliberately information-theory-free, so any empirical correlation with the two axes above is structural rather than algebraic.
$$C \;=\; \frac{|\{i : \exists \text{ simple cycle of length } 2 \text{ or } 3 \text{ through } i\}|}{n}$$
with n the number of constraints in the system and the dependency
matrix exposed by each adapter via get_causal_graph().
- C = 0: no constraint is sustained by another distinct constraint of the same system through a short feedback loop. Single-node systems (a periodic cycle, a SimpleAutomaton) and pure feed-forward chains land here.
- C = 1: every constraint is on at least one such loop. Periodic-ring cellular automata land here because each cell is read by both of its neighbours, which read it back.
- Length-1 cycles (self-loops) and length ≥ 4 cycles do not count: the metric targets the local "membrane ↔ metabolism" shape Montévil & Mossio describe, and the short-cycle restriction prevents systems that close only after a long detour from getting free credit.
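The counting rule can be sketched directly on a boolean dependency matrix. This is an illustration of the formula, not the package's implementation; the function name is reused here for readability, and the orientation of adj (who reads whom) is left open because membership in a directed cycle does not change when every edge is reversed.

```python
def constraint_closure(adj):
    """Fraction of nodes lying on a simple directed cycle of length 2 or 3.

    adj[i][j] is truthy when there is a dependency edge between i and j
    in the chosen orientation. Self-loops and longer cycles are ignored.
    """
    n = len(adj)
    if n == 0:
        raise ValueError("graph must have at least one node")
    on_cycle = set()
    for i in range(n):
        for j in range(n):
            if j == i or not adj[i][j]:
                continue
            if adj[j][i]:                      # length-2 cycle: i -> j -> i
                on_cycle.update((i, j))
            for k in range(n):
                if k != i and k != j and adj[j][k] and adj[k][i]:
                    on_cycle.update((i, j, k))  # length-3 cycle: i -> j -> k -> i
    return len(on_cycle) / n

def ring(n):
    """Periodic ring in which every cell is linked to both neighbours."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] = adj[i][(i - 1) % n] = 1
    return adj

c_ring = constraint_closure(ring(5))                          # every node on a 2-cycle
c_single = constraint_closure([[1]])                          # self-loop only
c_chain = constraint_closure([[0, 1, 0], [0, 0, 1], [0, 0, 0]])  # feed-forward
c_four = constraint_closure(
    [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]  # one directed 4-cycle
)
```

The ring reaches the C = 1 boundary, while the single node, the feed-forward chain and the lone 4-cycle all score 0, matching the three cases listed above.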
Operationalisation choices, falsification predictions and the
domain-of-applicability discussion live in
docs/CONSTRAINT_CLOSURE.md.
All three scores are returned in a single AutonomyProfile with
Optional[float] fields, so unrequested metrics stay None.
Adapters that cannot expose a causal graph (e.g.
CSVTrajectory, where only trajectories are available) make the
orchestrator record None for constraint_closure rather than
abort the whole measurement.
The autonomy plane
Thinking of the two metrics together, rather than reducing autonomy
to a single number, gives a richer picture. Both axes share the PBA
ratio shape and live in [0, 1], so (closure, memory) defines a
canonical autonomy plane [0, 1] × [0, 1]. The four quadrants
fall out of a single 0.5 threshold on each axis:
| memory ↓ / closure → | low closure (< 0.5) | high closure (≥ 0.5) |
|---|---|---|
| low memory (< 0.5) | drift (noise-driven) | clockwork regularity |
| high memory (≥ 0.5) | turbulence / chaos | candidate autopoietic region |
- Drift (low closure, low memory): the system tracks the environment and keeps nothing.
- Clockwork (high closure, low memory): determined by its own past, but the environment also carries comparable memory; the system's contribution to joint memory is modest.
- Turbulence (low closure, high memory): the environment shapes the system, and the bulk of the joint memory still ends up associated with the system's trajectory rather than the environment's — long-range but non-self-generated structure.
- Autopoietic region (high closure, high memory): closed dynamics and the joint memory is dominated by the system itself — the empirically interesting corner for systems with non-trivial self-organisation.
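The quadrant rule is simple enough to write down as code. The helper below is hypothetical (the package is not stated to ship one) and just applies the single 0.5 threshold from the table to a pair of readings.

```python
def quadrant(closure, memory, threshold=0.5):
    """Name the autonomy-plane quadrant for a (closure, memory) reading.

    Hypothetical helper; returns None when either axis dropped out.
    """
    if closure is None or memory is None:
        return None
    high_closure = closure >= threshold
    high_memory = memory >= threshold
    if high_closure and high_memory:
        return "candidate autopoietic region"
    if high_closure:
        return "clockwork"
    if high_memory:
        return "turbulence"
    return "drift"

print(quadrant(1.0, 0.57))   # high closure, high memory
print(quadrant(0.06, 0.12))  # low closure, low memory
```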
The package does not claim to prove autopoiesis. It gives a two-coordinate reading on a homogeneous plane and lets the interpreter argue.
Benchmark
v0.5.0a0 shipped the first reference benchmark on two axes;
v0.6.0a0 extended it to the third axis (constraint_closure);
v0.7.0a0 added the fourth axis (rai_proxy_persistence)
without changing the system zoo. v0.7.2a0 (this release)
re-runs the four-axis benchmark with n_seeds raised from 5 to
30 to clear the 200-valid-point floor pre-registered in
docs/ATLAS_GEOMETRY.md for the
atlas-geometry analysis. Adapter classes and parameter values
are unchanged, so all results are directly comparable with the
two prior baselines. The intent is not to score one system as
"more autonomous" than another. It is to check whether the four
axes carry distinct information for the systems we can generate
today, before adding a fifth axis to PBA.
Reproducing the run:
pip install -e ".[dev]"
python examples/benchmark_demo.py # writes docs/benchmarks/v0.7.2a0.csv
pip install -e ".[dev,viz]"
python examples/benchmark_plot.py # writes docs/benchmarks/v0.7.2a0.png
Headline numbers from the snapshot shipped here
(docs/benchmarks/v0.7.2a0.csv):
| Quantity | Value |
|---|---|
| Configurations swept | 405 |
| Fully-valid points | 247 |
| Configurations dropped (n/a) | 158 (39%) |
| Pearson r(closure, memory) | +0.27 |
| Pearson r(closure, constraint) | +0.04 |
| Pearson r(closure, persistence) | -0.61 |
| Pearson r(memory, constraint) | -0.52 |
| Pearson r(memory, persistence) | -0.33 |
| Pearson r(constraint, persistence) | -0.07 |
| Spearman r(closure, memory) | +0.47 |
| Spearman r(closure, constraint) | -0.20 |
| Spearman r(closure, persistence) | -0.47 |
| Spearman r(memory, constraint) | -0.34 |
| Spearman r(memory, persistence) | -0.33 |
| Spearman r(constraint, persistence) | -0.01 |
| Falsification threshold | \|r\| < 0.7 |
| Aggregate diagnosis | OK |
The 158 dropped configurations correspond to systems whose focal
trajectory collapses to a constant or to a value fully determined
by the environment, in which case H(S_{t+1} | E_t) = 0 and the
closure ratio is undefined by construction. They concentrate
on ECASystem (55% adapter-internal dropout) and
KauffmanNetwork (51%); PeriodicCycle and SimpleAutomaton
have zero dropouts. The pattern is itself a structural finding —
the metric set has a joint blind spot selective for the cellular
and network adapters — and is documented in
docs/ATLAS_GEOMETRY.md. Dropouts are
kept in the CSV with empty metric columns so the dropout is
visible rather than hidden.
All six pairwise Pearson correlations stay below the
|r| < 0.7 falsification threshold documented in
docs/PBA.md, now on the extended sample of 247
valid points. The aggregate flag is the worst of the six
pairwise flags so a single overlap is enough to raise it.
The three pairs involving the persistence axis sit inside the
|r| < 0.7 band on the extended sample as well:
closure-persistence at −0.61, memory-persistence at
−0.33, and constraint-persistence at −0.07. This is the
empirical correlation gate pre-registered in
docs/RAI.md ("Empirical correlation |r| < 0.7
on the benchmark zoo"). Together with the static no-cross-import
audit baked into compute_rai_proxy_persistence, it is the
falsification criterion the fourth axis had to clear before
being considered a fourth dimension of the autonomy atlas rather
than a re-skin of an existing axis. The closure–persistence
correlation has tightened from −0.44 (v0.7.0a0) to −0.61
(v0.7.2a0) on the larger sample, but still sits below the
falsification threshold; the same closure–persistence pair
also flips to −0.07 within KauffmanNetwork and to −1.00
within SimpleAutomaton, raising the Simpson's-paradox health
flag analysed in
docs/ATLAS_GEOMETRY.md.
The third axis cleanly breaks the closure-saturation wall
identified in v0.5.0a0: single-node periodic cycles and
self-generated SimpleAutomaton systems, which previously sat
indistinguishably from ECA rings on the vertical line
closure = 1.0, now drop to constraint = 0.0 while ECA rings
stay at constraint = 1.0. The wall is therefore not resolved
(closure still saturates by construction in fully-observed
deterministic systems) but it is no longer the only readable
signal.
A scatter rendering of the same CSV — points placed on the
(closure, memory) plane, with marker size proportional to the
constraint axis — is shipped at
docs/benchmarks/v0.7.2a0.png,
and the captured stdout of the reference run lives at
docs/benchmarks/v0.7.2a0.log.txt
for traceability. The persistence axis is reported in the CSV's
fourth metric column and is rendered separately as a domain-of-
applicability curve under
docs/benchmarks/persistence_v0.7.0.png.
The two-axis baseline from v0.5.0a0
(csv /
png), the three-axis snapshot
from v0.6.0a0 (csv /
png), and the four-axis
short-sample snapshot from v0.7.0a0
(csv /
png) are kept under
docs/benchmarks/ for traceability.
Atlas geometry analysis (v0.7.2a0)
v0.7.2a0 ships a pre-registered geometric audit of the
four-axis cloud, designed before any extended-sweep data was
seen. The full pre-registration, threshold table, implementation
report, and verdict live in
docs/ATLAS_GEOMETRY.md. Headline
indicators on the 247 valid points:
| Indicator | Value | Pre-registered band |
|---|---|---|
| λ_1 | 0.469 | [0.40, 0.70) (inconclusive) |
| λ_1 + λ_2 | 0.809 | [0.65, 0.85) (partial low-D) |
| s(k* = 5) (silhouette, k-means) | 0.642 | ≥ 0.50 (strong cluster) |
| Adapter-class alignment | 4 of 5 | clusters dominated by one class |
The combination is not a clean fit to any of the three
pre-registered outcomes (Level 2 reinforced, inconclusive,
Level 3 suspected); per the resolution rule pre-registered in
docs/ATLAS_GEOMETRY.md, the verdict
is
Inconclusive on the level question (PCA reading), with a Level-3-suggestive overlay (clustering reading).
A Simpson's-paradox health flag is also raised: several global
pairwise correlations are partly artefacts of the substrate
composition of the zoo (the most extreme case is
closure–persistence, global −0.61 vs −1.00 within
SimpleAutomaton alone). The level question — Level 2 (one
multidimensional object) vs Level 3 (several objects sharing a
label) — is therefore genuinely under-determined on the
structural domain and is deferred to external studies built
on the v0.9.0 LLM adapter for a clean arbitration. The
package ships the instrument; the arbitration runs in studies
that import it.
Reproducing the analysis:
pip install -e ".[dev]"
python examples/atlas_geometry.py # writes docs/benchmarks/atlas_geometry_v0.7.2a0.json
pip install -e ".[dev,viz]"
python examples/atlas_geometry_plot.py # writes docs/benchmarks/atlas_geometry_v0.7.2a0.png
The biplot
(docs/benchmarks/atlas_geometry_v0.7.2a0.png)
renders the PCA scree (with the pre-registered λ_1 ≥ 0.70
reference line) and the PC1/PC2 projection of the standardised
4-D cloud with axis loadings drawn as labelled arrows. t-SNE /
UMAP panels are deliberately omitted; the pre-registration
flagged them as illustrative-only because they can manufacture
visual clusters from isotropic noise.
Saturation diagnostic (v0.5.1a0)
The most visible feature of the scatter above is the vertical wall
of points at closure = 1.0. The v0.5.1a0 diagnostic shows that
this wall is a theorem about the metric, not a flaw: any
deterministic system whose observed (S, E) pair already covers
every variable the transition rule depends on satisfies
H(S_{t+1} | S_t, E_t) = 0, which forces
I(S_{t+1}; S_t | E_t) = H(S_{t+1} | E_t), which forces
closure = 1.0. Three of the four adapter classes in the benchmark
(ECA, PeriodicCycle, self-generated SimpleAutomaton) satisfy
those preconditions; the fourth (KauffmanNetwork) breaks them on
purpose, which is why it is the only adapter whose closure values
vary continuously across the unit interval.
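Spelled out with the same symbols, the chain of implications is the standard decomposition of conditional mutual information into two conditional entropies:

$$I(S_{t+1};\,S_t \mid E_t) \;=\; H(S_{t+1} \mid E_t) \;-\; H(S_{t+1} \mid S_t, E_t)$$

$$H(S_{t+1} \mid S_t, E_t) = 0 \;\Longrightarrow\; A \;=\; \frac{H(S_{t+1} \mid E_t) - 0}{H(S_{t+1} \mid E_t)} \;=\; 1 \qquad \text{whenever } H(S_{t+1} \mid E_t) > 0$$

When H(S_{t+1} | E_t) = 0 as well, the ratio is 0/0 and the closure reading is undefined, which is the dropout case described in the benchmark section.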
To verify the theorem empirically, the diagnostic injects
Bernoulli bit-flip noise into the focal trajectory of a saturating
ECA (rule 110) at probabilities p ∈ {0, 0.01, …, 0.50} and
re-measures closure. The expected behaviour is a smooth, monotonic
fall off the wall.
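The same experiment can be imitated on a much smaller toy system. The sketch below is not the shipped diagnostic: it replaces ECA rule 110 with a binary alternating sequence under a constant environment, uses a plain plug-in estimator, and the names closure and noisy_closure are hypothetical. It only illustrates the expected shape (1.0 at p = 0, falling toward 0 as p grows).

```python
import random
from collections import Counter
from math import log2

def _entropy(counter):
    n = sum(counter.values())
    return -sum(c / n * log2(c / n) for c in counter.values())

def closure(states, env):
    """Plug-in I(S_{t+1}; S_t | E_t) / H(S_{t+1} | E_t); None if undefined."""
    nxt, cur, e = states[1:], states[:-1], env[:-1]
    h_e = _entropy(Counter(zip(nxt, e))) - _entropy(Counter(e))
    if h_e <= 1e-12:
        return None
    h_se = _entropy(Counter(zip(nxt, cur, e))) - _entropy(Counter(zip(cur, e)))
    return (h_e - h_se) / h_e

def noisy_closure(p, n=4000, seed=0):
    """Closure of a 0,1,0,1,... sequence after Bernoulli(p) bit flips."""
    rng = random.Random(seed)
    clean = [t % 2 for t in range(n)]
    noisy = [s ^ (rng.random() < p) for s in clean]  # flip each bit with prob. p
    return closure(noisy, [0] * n)                   # constant toy environment

for p in (0.0, 0.05, 0.5):
    print(p, round(noisy_closure(p), 3))
```

Even on this toy system a small flip rate pulls the reading well below 1.0, and at p = 0.5 the observed sequence is indistinguishable from coin flips, so closure is close to zero.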
pip install -e ".[dev]"
python examples/saturation_diagnostic.py # writes docs/benchmarks/saturation_v0.5.1.csv
pip install -e ".[dev,viz]"
python examples/saturation_plot.py # writes docs/benchmarks/saturation_v0.5.1.png
Headline numbers from the snapshot shipped here
(docs/benchmarks/saturation_v0.5.1.csv, 10 noise levels × 5 seeds
= 50 valid points):
| Noise probability p | closure (mean ± std) | memory (mean ± std) |
|---|---|---|
| 0.00 | 1.000 ± 0.000 | 0.569 ± 0.033 |
| 0.01 | 0.810 ± 0.051 | 0.536 ± 0.021 |
| 0.05 | 0.434 ± 0.055 | 0.405 ± 0.016 |
| 0.10 | 0.234 ± 0.032 | 0.284 ± 0.045 |
| 0.20 | 0.059 ± 0.010 | 0.121 ± 0.030 |
| 0.50 | 0.001 ± 0.001 | 0.070 ± 0.010 |
The full curve is rendered at
docs/benchmarks/saturation_v0.5.1.png.
Two practical reads:
- The wall at closure = 1.0 is fragile, not robust. A 1 % per-step bit-flip rate already drops closure to 0.81, and closure converges to zero by p ≈ 0.3.
- A closure value strictly below 1.0 is therefore informative. In practice it signals partial observation, stochastic dynamics or measurement noise: exactly the three failure modes the metric is designed to detect.
The formal statement of the theorem and its consequences for PBA
live in docs/PBA.md § "Domain of applicability"
(Spanish: docs/PBA.es.md).
Constraint-closure density diagnostic (v0.6.1a0)
The constraint-closure axis introduced in v0.6.0a0 carries
its own pair of boundary regions, formalised as theorems and
verified with the same diagnostic-grade rigour the closure axis
got in v0.5.1a0:
- Theorem A (single-constraint trivial-zero). Any system with n = 1 update function returns constraint = 0.0 by construction (a simple cycle of length 2 or 3 requires at least two distinct nodes). Covers PeriodicCycle and SimpleAutomaton.
- Theorem B (symmetric-neighbour saturation). Any graph in which every node reads at least one node that reads it back returns constraint = 1.0 by construction (every node sits on a length-2 cycle). Covers ECASystem on any non-trivial periodic ring.
To verify both theorems jointly, the diagnostic sweeps the
connection density of a controllable system. For each
K ∈ {1, …, n − 1} it generates several Kauffman networks of
size n and computes constraint_closure directly from the
causal graph (no trajectory needed; the metric is purely
topological).
pip install -e ".[dev]"
python examples/constraint_density_diagnostic.py # writes docs/benchmarks/constraint_density_v0.6.1.csv
pip install -e ".[dev,viz]"
python examples/constraint_density_plot.py # writes docs/benchmarks/constraint_density_v0.6.1.png
Headline numbers from the snapshot shipped here
(docs/benchmarks/constraint_density_v0.6.1.csv, 9 K values × 10
seeds = 90 measurements at n = 10):
| Input degree K | constraint (mean ± std) |
|---|---|
| 1 | 0.140 ± 0.120 |
| 2 | 0.520 ± 0.236 |
| 3 | 0.790 ± 0.138 |
| 4 | 0.950 ± 0.067 |
| 5 | 0.980 ± 0.060 |
| 6 | 1.000 ± 0.000 |
| 7 | 1.000 ± 0.000 |
| 8 | 1.000 ± 0.000 |
| 9 | 1.000 ± 0.000 |
The full curve is rendered at
docs/benchmarks/constraint_density_v0.6.1.png.
Two practical reads:
- The metric is a monotone sigmoid in connection density. At K = 1 it sits near Theorem A's lower boundary; at K ≥ 6 (with n = 10) every seed identically saturates at 1.0 (std = 0), reaching Theorem B's upper boundary.
- A constraint-closure value of 0.0 does not mean "no autonomy". On a single-constraint adapter the metric is silent by Theorem A; the system simply sits outside its discriminative domain. Symmetrically, 1.0 does not mean "fully autonomous": on a dense periodic ring it is forced by Theorem B regardless of dynamical content.
Both theorems and the diagnostic are documented in
docs/CONSTRAINT_CLOSURE.md § "Domain of applicability"
and reflected in
docs/PBA.md § "Domain of applicability"
(Spanish: docs/PBA.es.md).
Persistence-vs-coupling diagnostic (v0.7.0a0)
The persistence axis added in v0.7.0a0 ships with a first
domain-of-applicability run that sweeps the focal coupling of
a KauffmanNetwork. The naïve expectation was a monotone rise
("low coupling → focal flip propagates → low persistence; high
coupling → focal flip invisible → high persistence"). The
diagnostic falsified that expectation and revealed a
U-shape with two trivial-absorption boundary regimes flanking
a non-trivial middle.
pip install -e ".[dev]"
python examples/persistence_diagnostic.py # writes docs/benchmarks/persistence_v0.7.0.csv
pip install -e ".[dev,viz]"
python examples/persistence_plot.py # writes docs/benchmarks/persistence_v0.7.0.png
Headline numbers from the snapshot shipped here
(docs/benchmarks/persistence_v0.7.0.csv, 11 coupling levels × 10
seeds at n = 10, k = 3):
| Focal coupling | persistence (mean ± std) | n_valid / n_total |
|---|---|---|
| 0.00 | 1.000 ± 0.000 | 6 / 10 |
| 0.10 | 1.000 ± 0.000 | 6 / 10 |
| 0.20 – 0.50 | 0.556 ± 0.453 | 8 / 10 |
| 0.60 – 0.80 | 0.415 ± 0.465 | 9 / 10 |
| 0.90 | 0.665 ± 0.406 | 9 / 10 |
| 1.00 | 0.665 ± 0.406 | 9 / 10 |
The full curve is rendered at
docs/benchmarks/persistence_v0.7.0.png.
Two practical reads:
- Low-coupling boundary (coupling ≈ 0). Most seeds drop because the focal trajectory collapses to a constant (a 1-bit rule generally has a fixed point). The seeds that survive score persistence ≈ 1 not because the system defends its trajectory, but because the perturbation is absorbed by the fixed point in one step. This is trivial absorption by collapse, not autonomy.
- High-coupling boundary (coupling ≈ 1). The focal node ignores its own previous value, so the focal flip never enters the rule that computes the focal at t_star + 1. The metric returns persistence ≈ 1 by construction. This is trivial absorption by invisibility, again not autonomy.
The non-trivial useful range of the metric on Kauffman networks
sits in the intermediate couplings, where actual perturbation
propagation is observed. The U-shape is the persistence analogue
of the closure-saturation theorem (v0.5.1a0) and the
symmetric-neighbour saturation theorem for constraint-closure
(v0.6.1a0): the metric has structurally trivial regions at the
edges of its parameter space and a non-trivial useful range in
the middle. Formalising the two boundary theorems for persistence
(jointly with the deferred perturbation-magnitude sweep) is the
planned content of the v0.7.1 maintenance cycle.
Adapters
- SimpleAutomaton: two factory constructors (from_self_generated_rules, from_external_rules) for synthetic toy systems.
- CSVTrajectory: loads a user-supplied two-column CSV with discrete integer state and env columns.
- LLMTranscriptAdapter: off-line transcripts in OpenAI / Anthropic Messages format (JSONL via from_jsonl, in-memory via from_messages). Enables closure, memory and coherence; reports None for constraint (no causal graph in transcripts) and persistence (off-line cannot replay) under mosaic dropout. Contract: docs/LLM_TRANSCRIPT.md. The on-line counterpart enabling persistence on LLMs (LLMLiveAdapter) is planned for a later alpha.
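Mosaic dropout, as described for the transcript adapter, means a reading is reported as None when the adapter cannot supply it rather than being imputed. A minimal sketch of that convention follows. `ToyProfile` is a hypothetical stand-in, not the package's AutonomyProfile, whose real field names are not assumed here, and the numeric values are made up.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyProfile:
    """Hypothetical five-reading record; None marks an axis that is unavailable."""

    closure: Optional[float] = None
    memory: Optional[float] = None
    constraint: Optional[float] = None
    persistence: Optional[float] = None
    coherence: Optional[float] = None

    def available(self) -> Tuple[str, ...]:
        """Names of the axes that carry a reading, in declaration order."""
        return tuple(f.name for f in fields(self) if getattr(self, f.name) is not None)

    def is_full(self) -> bool:
        """True only when all five axes are present."""
        return len(self.available()) == len(fields(self))


# An off-line transcript can feed three axes; the other two stay None.
transcript_like = ToyProfile(closure=0.4, memory=0.6, coherence=0.8)
print(transcript_like.available(), transcript_like.is_full())
```

Downstream analyses then have to choose which sub-chart of axes to compare on, instead of assuming a complete five-dimensional point.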
Any object implementing get_state_history() and get_env_history()
(both returning 1D integer np.ndarray) satisfies the
AutonomySystem protocol and can be passed to Autonometer.measure.
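As a rough illustration of that contract, the sketch below defines a local structural type with the two methods named above and a small class that satisfies it from CSV text. `AutonomySystemLike` and `CsvBackedSystem` are hypothetical names introduced here; the package's own AutonomySystem protocol and CSVTrajectory adapter may differ in detail, and the call to Autonometer.measure is deliberately left out because its signature is not shown in this document.

```python
import csv
import io
from typing import Protocol, runtime_checkable

import numpy as np


@runtime_checkable
class AutonomySystemLike(Protocol):
    """Local stand-in for the two-method contract described above (assumed shape)."""

    def get_state_history(self) -> np.ndarray: ...

    def get_env_history(self) -> np.ndarray: ...


class CsvBackedSystem:
    """Hypothetical adapter: reads integer `state` and `env` columns from CSV text."""

    def __init__(self, text: str) -> None:
        rows = list(csv.DictReader(io.StringIO(text)))
        self._state = np.array([int(r["state"]) for r in rows], dtype=np.int64)
        self._env = np.array([int(r["env"]) for r in rows], dtype=np.int64)

    def get_state_history(self) -> np.ndarray:
        return self._state

    def get_env_history(self) -> np.ndarray:
        return self._env


system = CsvBackedSystem("state,env\n0,1\n1,1\n1,0\n0,0\n")
print(isinstance(system, AutonomySystemLike), system.get_state_history())
```

Because the check is structural, the class does not need to inherit from anything; providing the two methods with 1D integer arrays is the whole requirement stated above.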
Theoretical grounding
autonometrics does not introduce a new theory of autonomy. It
operationalises a recurring ratio of internal over total shape
that already runs through several traditions in the structural
self-determination literature. The intended scope and the
falsification criteria for that unifying claim are stated in
docs/PBA.md; the references the current axes
build on are:
- Bertschinger, N., Olbrich, E., Ay, N., & Jost, J. (2008). Autonomy: An Information-Theoretic Perspective. BioSystems — introduces the conditional-information core measured by the closure axis.
- Albantakis, L. (2021). Quantifying the Autonomy of Structurally Diverse Automata: A Comparison of Candidate Measures. Entropy — comparative review and the normalisation form used here.
- Crutchfield, J. P., & Young, K. (1989). Inferring statistical complexity. Physical Review Letters — introduces excess entropy, the engine behind the memory axis.
- Feldman, D. P., & Crutchfield, J. P. (2002). Measures of Statistical Complexity: Why?. Physics Letters A. Formal critique of LMC-style "balance" measures that drove the migration done in v0.3.0-alpha.
- Farnsworth, K. D. (2018). How Organisms Gained Causal Independence and How It Might Be Quantified. Biology. Argues that autonomous agency requires two jointly necessary features (organisational closure and an internalised objective-function providing a 'goal'), and proposes Integrated Information Theory as a possible quantification. The two-axis shape of the plane here is inspired by this dual-feature thesis, not a literal implementation of it: the memory ratio is a structural proxy for ongoing activity, not an objective-function measurement.
- Montévil, M., & Mossio, M. (2015). Biological organisation as closure of constraints. Journal of Theoretical Biology. Formal framework where biological organisation is read as mutual dependence among constraints (each both produced by and producing the others). Reference for the constraint_closure axis shipped in v0.6.0a0; the operational mapping from the paper's primitives to the package's discrete-graph implementation is documented in docs/CONSTRAINT_CLOSURE.md.
Related work
autonometrics overlaps with several adjacent toolkits. The
relevant difference is framing, not the underlying mathematics:
none of the tools below collects information-theoretic measures
under a single internal-over-total ratio convention, and most are
either deeper (and substrate-bound) or more general (and unframed).
- Albantakis/autonomy: toolbox for comparing autonomy measures on small simulated agents represented as transition probability matrices. Authored by the same researcher whose 2021 review article this package builds on. Requires PyPhi, runs on Linux and macOS only, and outputs a multi-column DataFrame intended for cross-measure comparison rather than a single normalised reading. autonometrics does not depend on it: the closure-axis formula is reimplemented in pure numpy so the package runs on Linux, macOS and Windows alike, and so it accepts arbitrary discrete trajectories instead of pre-built transition probability matrices.
- JIDT: Java toolkit for information-theoretic measures (transfer entropy, active information storage, predictive information / excess entropy, mutual information). The numerical engine many adjacent papers build on; cross-language but not a unified autonomy index.
- PyInform: Python wrapper on the Inform C library. Provides active_info, block_entropy, entropy_rate, transfer_entropy and related building blocks.
- PyPhi: reference IIT implementation (Φ, MIP) on transition probability matrices. A different formalism from the closure axis used here; a possible optional dependency for an IIT-based axis later in the roadmap.
- dit — general discrete information theory toolkit (PID, divergences, multivariate measures). A candidate numerical dependency for future axes that need partial information decomposition.
The cross-platform, dependency-light stance is deliberate. The
package targets researchers, students and applied users who want a
working measurement on whatever machine they have, including
Windows. Pure-numpy implementations are preferred over
heavier dependencies until a future axis genuinely needs them; if
that ever happens, the heavy dependency will be opt-in via
extras_require rather than mandatory.
Roadmap
Each future alpha adds one more [0, 1]-valued ratio drawn from
the structural self-determination literature, keeping the same
PBA convention so all axes remain comparable. The order below
prioritises enabling empirical validation of the existing
axes (the validation itself runs in external studies) before
broadening them. The releases so far followed that logic: the
first benchmark run, shipped in v0.5.0a0, established that
closure and memory carry distinct information on the current
adapter zoo; the v0.5.1a0 diagnostic mapped the closure = 1.0
saturation wall; v0.6.0a0 added the third axis to break that
wall while preserving pairwise independence; v0.6.1a0 mapped
the two saturating regions of constraint_closure; v0.7.0a0
added the fourth axis (rai_proxy_persistence) and revealed its
U-shaped domain of applicability; and v0.7.2a0 ran a
pre-registered geometric audit of the four-axis cloud whose
verdict pushed the Level 2 vs Level 3 question to studies built
on v0.9.0's LLM adapter. Finally, v0.8.0a0 added the fifth axis
(coherence / Theil's U) together with its reference adapter
PromisedCycle, ran the Session B diagnostic block (independence
audit + causal experiment with p_env) that overrode the
pre-registered hard gate on causal grounds, and shipped the
mosaic-atlas verdict (n_valid_full = 0/645: no system in the
current zoo has all five axes simultaneously, so the atlas is
best read as overlapping four-axis sub-charts rather than a
single five-dimensional cloud).
- v0.5.0-alpha: benchmark suite + scatter plot. Reference systems (ECASystem, KauffmanNetwork, PeriodicCycle) wired into a sweep that measures (closure, memory) over multiple seeds and reports correlation against the PBA falsification threshold.
- v0.5.1-alpha: saturation diagnostic. Bernoulli bit-flip noise sweep on a saturating ECA, formal statement of the closure-saturation theorem, and the "domain of applicability" section in docs/PBA.md.
- v0.6.0-alpha: third axis, constraint_closure (Montévil & Mossio-style). Per-adapter causal-graph implementations, three-axis benchmark snapshot with three pairwise correlations, and an independence-by-design audit.
- v0.6.1-alpha: domain-of-applicability diagnostic for constraint_closure. Formal statement of the single-constraint trivial-zero theorem and the symmetric-neighbour saturation theorem; Kauffman density sweep snapshot under docs/benchmarks/constraint_density_v0.6.1.*.
- v0.7.0-alpha: fourth axis, rai_proxy_persistence (Lee & McShea-style perturbation persistence, RAI-style structural proxy). Adapter-side replay_from_perturbation protocol, four-axis benchmark snapshot with six pairwise correlations (all |r| < 0.7), and a first persistence-vs-coupling diagnostic that revealed the U-shape boundary regimes. Pre-registered in docs/RAI.md.
- v0.7.1-alpha: domain-of-applicability cycle for rai_proxy_persistence. Formal statement of the two boundary theorems (low-coupling collapse and high-coupling invisibility), perturbation-magnitude sweep already deferred there in docs/RAI.md, and the same diagnostic-grade rigour the prior axes already enjoy. Intentionally skipped in the chronological release order; the boundary theorems and the magnitude sweep land here.
- v0.7.2-alpha: pre-registered atlas-geometry analysis. PCA + k-means + silhouette + conditional correlations on the extended four-axis benchmark (n_valid = 247). Pre-registered in docs/ATLAS_GEOMETRY.md. Verdict: inconclusive on the level question (PCA reading), with a Level-3-suggestive overlay (clustering reading); the level question is deferred to external behavioural-validation studies built on v0.9.0's LLM adapter.
- v0.8.0-alpha (current): fifth axis, coherence-based alignment (cba_theil_u, Theil's U on declared vs executed trajectories with Miller-Madow bias correction). New reference adapter PromisedCycle with optional independent declared-channel noise (p_env); new optional protocol method get_declared_executed; new AutonomyProfile.cba_theil_u field. Pre-registered in docs/CBA.md. Five-axis benchmark snapshot at docs/benchmarks/v0.8.0a0.{csv,log.txt}. Session B ships three diagnostic snapshots: cba_independence_v0.8.0a0.{json,png,log.txt} (stratified audit, Simpson's-paradox visualisation), cba_env_decouple_v0.8.0a0.{json,png,log.txt} (causal experiment with p_env: r(closure, coherence) falls from +0.97 to +0.48, and r(coherence, p_env) = +0.0007 confirms Theil's U invariance), and the v0.8.0a0 follow-up of ATLAS_GEOMETRY.md (Step 7 verdict: n_valid_full = 0/645, atlas is a mosaic of overlapping four-axis sub-charts). Pre-registered hard gate (|r| ≥ 0.9) was triggered by the headline +0.96 and then overridden on causal grounds, with the override documented in the post-mortem section of docs/CBA.md. Level question pulled toward Level 3, decided (or kept open) by external studies built on v0.9.0's LLM adapter.
- v0.9.0-alpha (current cycle): LLM transcript adapter (off-line). LLMTranscriptAdapter translates standard OpenAI / Anthropic Messages format into the AutonomySystem protocol. Enables closure, memory and coherence on real conversational data; reports None for constraint (no causal graph in transcripts) and persistence (off-line cannot replay) under the mosaic-dropout policy. This release ships the instrument, not the validation: the behavioural validation against C-RAI, goal-directedness scoring on transcripts and CoT-faithfulness, and the empirical arbitration of the Level 2 vs Level 3 question, are deferred to external studies that import autonometrics as a dependency. The package's job here is to make those studies possible. Contract: docs/LLM_TRANSCRIPT.md. An on-line counterpart (LLMLiveAdapter) enabling persistence on live API endpoints is planned for a later cycle.
- v1.0.0 (without alpha marker): PyPI publication once five ratios, three adapters, and the full benchmark battery are stable.
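The coherence axis in the v0.8.0-alpha entry is described as Theil's U on declared vs executed trajectories with Miller-Madow bias correction. A textbook-style sketch of that combination is given below. It is not the package's cba_theil_u implementation: the direction of the ratio (mutual information divided by the entropy of the executed sequence), the choice to correct each entropy term separately, the clipping to [0, 1] and the NaN for a constant sequence are all assumptions made here for illustration.

```python
import math
from collections import Counter


def mm_entropy(symbols):
    """Plug-in Shannon entropy in nats plus the Miller-Madow term (m - 1) / (2N)."""
    n = len(symbols)
    counts = Counter(symbols)
    plugin = -sum((c / n) * math.log(c / n) for c in counts.values())
    return plugin + (len(counts) - 1) / (2 * n)


def theil_u(executed, declared):
    """Assumed form: U = I(executed; declared) / H(executed), clipped to [0, 1]."""
    if len(executed) != len(declared) or len(executed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    h_exec = mm_entropy(list(executed))
    if h_exec <= 0.0:
        return float("nan")  # a constant executed sequence leaves U undefined
    h_decl = mm_entropy(list(declared))
    h_joint = mm_entropy(list(zip(executed, declared)))
    mutual_info = h_exec + h_decl - h_joint
    return min(1.0, max(0.0, mutual_info / h_exec))


# Executed trajectory follows the declared one exactly.
u_same = theil_u([0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2])
# Executed and declared are empirically independent.
u_indep = theil_u([0, 0, 1, 1], [0, 1, 0, 1])
print(u_same, u_indep)
```

The clipping step is needed in this sketch because the bias correction can push the estimated mutual information slightly below zero on short, independent sequences.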
License
MIT. See LICENSE.