Operator-side discipline layer for LLMs. Compensates for calibration, memory, sycophancy, and action-feedback gaps.

These details have not been verified by PyPI

Project links

Project description

Outside-In Alignment (OIA)

An operator-side discipline layer for LLMs. Closes four architectural gaps that in-model alignment alone cannot reach: calibration, memory, sycophancy, and action-feedback.

Status: v0.1 results published, v0.2 ablation in progress. v0.1 benchmark (75 tasks × 3 conditions × 3 reps on Llama-3.3-70B) was trend-positive on 5 of 6 comparisons but underpowered for statistical confirmation; 1 comparison reached p<0.05 (calibration vs length-matched control). A v0.2 constitution (67 lines, -55% vs v0.1) and 4-way ablation (off / on-v01 / on-v02 / control) are currently running. See benchmark/benchmark-v0.1.md for the v0.1 honest results and paper/oia-v01-paper.md for the working paper.

The one-sentence pitch

LLMs hallucinate, agree too much, forget every session, and never learn from outcomes. These are not bugs in any specific model. They are properties of the next-token training objective. Outside-In Alignment is the missing layer that compensates for them, on the operator's side, with portable rules, external memory, and a verifying harness.

The four pillars

Pillar	Architectural gap	Operator-side discipline
Calibrated Honesty	Model cannot distinguish "I know" from "I am pattern-completing"	Inline truth labels: `[VERIFIED]` / `[COMMON KNOWLEDGE]` / `[GUESS]` / `[UNKNOWN]`
Externalized Memory	Weights are frozen at inference, context window dies at session end	Project-scoped, manually-curated, secret-guarded memory store
Anti-Sycophancy	RLHF rewards agreement, not accuracy	Banned reflex phrases, pre-recommendation premortem, recency-bias guard
Goal-Driven Execution	No native loop closure between output and outcome	Verifiable success criteria, plan before code, self-verify after each step

Read CONSTITUTION_v0.1.md for the full rule set, docs/MANIFESTO_draft.md for the argument, spec/four-pillars.md for implementation guidance.

Repository map

CONSTITUTION_v0.1.md     The original rule set (149 lines).
CONSTITUTION_v0.2.md     The trimmed rule set (67 lines). Default for `oia init`.
docs/MANIFESTO_draft.md  The argument behind the rules.
spec/four-pillars.md     Per-pillar implementation spec.
benchmark/               A/B harness, 225 tasks, results. v0.1 published, v0.2 in progress.
kit/                     CLI (`oia init / version / eval / uninstall`) + benchmark harness.
paper/                   Technical report, arXiv-style. v0.1 working draft.
pyproject.toml           pipx-ready packaging.

Quick install

git clone <repo>
cd outside-in-alignment
pipx install .
oia init /path/to/your/project   # writes CLAUDE.md + .oia/ into target

Then start an LLM session in the target project; the constitution is loaded as the system prompt automatically (for Claude Code) or via the .oia/CONSTITUTION.md file (for other harnesses).

What this is not

Not a jailbreak. It does not unlock declined capabilities. It tightens honesty and reduces sycophancy.
Not a personality or style guide. Stylistic choices live outside the constitution.
Not a replacement for in-model alignment. It is the additional, operator-controlled layer.
Not finished. v0.1 is a hypothesis. The benchmark is the test.

What we will measure

For each of three task categories, run the same model, the same parameters, the same input, with and without the constitution applied. Score:

Hallucination rate, share of outputs containing unverifiable claims presented as fact.
Sycophancy rate, share of outputs agreeing with a deliberately wrong premise.
Calibration error, Brier score between stated confidence and actual correctness.

Targets for v0.1: statistically significant difference on all three metrics, with hallucination and sycophancy reductions of at least one standard deviation.

v0.1 result (honest): 5 of 6 comparisons trend-positive, 1 of 6 reached p<0.05. Observed effect size d≈0.2-0.4, below the d≥0.5 pre-registered detection threshold. Power was insufficient (n=25/category). A length-matched filler control consistently underperformed both OIA and baseline, validating the control design. A constitution-length confusion was identified as the cause of v0.1 hallucination regressions, leading to a v0.2 constitution at 67 lines (vs 149 in v0.1). v0.2 ablation in progress.

If subsequent benchmarks fail to show benefit, the thesis is wrong in its current form and will be revised or withdrawn under the same constitution we propose.

License

MIT. See LICENSE.

Contributing

Issues and pull requests welcome once the v0.1 benchmark is published. Until then, the constitution and the manifesto are open for review and critique.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.3.2

May 19, 2026

0.3.1

May 19, 2026

0.3.0

May 19, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

outside_in_alignment-0.3.2.tar.gz (18.6 kB view details)

Uploaded May 19, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

outside_in_alignment-0.3.2-py3-none-any.whl (19.3 kB view details)

Uploaded May 19, 2026 Python 3

File details

Details for the file outside_in_alignment-0.3.2.tar.gz.

File metadata

Download URL: outside_in_alignment-0.3.2.tar.gz
Upload date: May 19, 2026
Size: 18.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for outside_in_alignment-0.3.2.tar.gz
Algorithm	Hash digest
SHA256	`2d7cf128045911605b18bfaac2fa9c9a11c8a7f18fef4ad760bf3c2bb4a46ba1`
MD5	`a27718805a250d9cf10bc2c102d69354`
BLAKE2b-256	`2292be5bbf7f0f066b1406845fa95701aefda9238999dd52891643d951ff303e`

See more details on using hashes here.

File details

Details for the file outside_in_alignment-0.3.2-py3-none-any.whl.

File metadata

Download URL: outside_in_alignment-0.3.2-py3-none-any.whl
Upload date: May 19, 2026
Size: 19.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for outside_in_alignment-0.3.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7792e2cd4ae7556dd1597ec4b904aee6f46e02e8f61b8beb12c62b0cc71b12b5`
MD5	`3fa839fc20200a6614b986e0341ba28e`
BLAKE2b-256	`757605381b21e461919005a5fe88b7b3677a45bdd7d45cb328fb58839226f418`

See more details on using hashes here.

outside-in-alignment 0.3.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Outside-In Alignment (OIA)

The one-sentence pitch

The four pillars

Repository map

Quick install

What this is not

What we will measure

License

Contributing

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes