CARL (Coherence-Aware Reinforcement Learning) — information-theoretic reward signals for LLM training via token-level probability distributions
CARL
Coherence-Aware Reinforcement Learning
Why
A model becomes an agent when it stops pattern-matching and starts knowing. That transition isn't gradual — it's a phase transition, like water becoming ice. One moment the model is guessing. The next, it's coherent.
Standard training can't see this happening. You watch a loss curve and hope.
CARL measures the moment of crystallization — and rewards it.
              Phi (order parameter)
                        │
        guessing        │        knowing
░░░░░░░░░░░░░░░░░░░░░░░░│████████████████████████
                        │
                 crystallization
The order parameter Phi measures how coherent a model's probability field is at every token. When Phi crystallizes, the model has found its internal anchor — a fixed point it can navigate from to any concept space without losing itself.
This is alignment you can measure, not just evaluate.
Quick start
Measure coherence on any logits distribution — no training, no GPU, no API key.
Pure numpy:
from carl_core import CoherenceProbe, KAPPA, SIGMA
import numpy as np
vocab_size = 32_000
probe = CoherenceProbe(vocab_size=vocab_size)
# Any [T, V] logits + [T] chosen tokens. Here: 16 tokens from a 32k vocab.
logits = np.random.randn(16, vocab_size)
token_ids = np.argmax(logits, axis=-1)
snap = probe.measure(logits, token_ids)
print(f"phi_mean = {snap.phi_mean:.3f} (crystallization target: ≥ {SIGMA})")
print(f"horizon = KAPPA·d ≈ {int(KAPPA * vocab_size):,} tokens")
Install (just the observables layer):
pip install carl-studio
That gives you carl-core + the base CLI + one-shot observe. For training + HF + Claude observability:
pip install 'carl-studio[quickstart]'
Full extras matrix, reproducible installs via uv.lock, and conflict rules (e.g.
wallet vs x402) live in docs/INSTALL.md.
CLI quickstart
carl init # one-shot setup: account, provider, extras, project, consent
carl chat # agent — interactive loop
carl ask "train a small model on gsm8k" # agent — one-shot prompt
carl research search "coherence-aware reinforcement learning"
carl init is idempotent: re-running it after setup does nothing unless you pass --force. A first-run marker lives at ~/.carl/.initialized.
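The marker-file behaviour described above can be sketched in a few lines. This is an illustration of the idempotency pattern only, not the code behind carl init: the function name run_setup_once and its force parameter are hypothetical, and the sketch writes under a temporary directory rather than your real home.

```python
from pathlib import Path
import tempfile


def run_setup_once(home: Path, force: bool = False) -> bool:
    """Return True if setup ran, False if it was skipped (hypothetical helper)."""
    marker = home / ".carl" / ".initialized"
    if marker.exists() and not force:
        return False  # already initialized: do nothing
    marker.parent.mkdir(parents=True, exist_ok=True)
    # ... the real setup steps would run here ...
    marker.touch()  # record that the first run completed
    return True


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    first = run_setup_once(home)               # no marker yet, so setup runs
    second = run_setup_once(home)              # marker present, so it is skipped
    forced = run_setup_once(home, force=True)  # force overrides the marker
    print(first, second, forced)
```

The marker is written last, so an interrupted setup is retried on the next run instead of being recorded as complete.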
Bare carl is an entry surface, not a documented top-level workflow by itself:
- on a TTY, first run can route into carl init, and a configured project can route into chat
- on non-TTY input, bare carl prints help plus a nudge toward carl chat and carl ask
Auth
CARL Studio does not require a .env, and it does not auto-load one.
- Hugging Face workflows work with either HF_TOKEN or a prior hf auth login / huggingface-cli login
- Claude-powered features use ANTHROPIC_API_KEY or --api-key
- RunPod uses RUNPOD_API_KEY
- public Trackio observe works without credentials
If you want a template, copy .env.example and load it into your shell before running carl:
cp .env.example .env
set -a
source .env
set +a
Quick setup:
hf auth login
export ANTHROPIC_API_KEY=sk-ant-xxx # only for --diagnose / chat
carl start
Full auth details: docs/auth.md
Primary commands
| Command | What it does |
|---|---|
| carl init | One-shot setup: account, provider, extras, project, consent. |
| carl chat | Interactive agent loop with tools, sessions, cost tracking. |
| carl ask "<prompt>" | One-shot agent invocation. |
| carl research search "<query>" | Search and retrieve research papers (carl-studio[research]). |
| carl flow "/a /b /c" | Chain named operations, emit a shared interaction trace. |
| carl doctor | Readiness audit. Prints blocking issues and freshness findings. |
| carl train | Local training with coherence rewards (carl-studio[training]). |
Run carl start --inventory for the full installed command map, or carl flow --list for every chainable op.
Architecture
- carl-core: primitive layer. Typed errors, retry/backoff, safepath sandboxing, content hashing, tier gating, coherence math, interaction chains. Zero training deps.
- carl-studio: the CLI, agent loop, training pipeline, MCP server, camp client, eval sandbox. Everything above builds on carl-core.
carl-core is installed alongside carl-studio; public callers import from carl_core.* directly. The legacy carl_studio.primitives shim was removed after v0.5.0.
Error contract
Fatal paths raise carl_core.errors.CARLError subclasses with stable codes you can match programmatically. Top codes:
| Code | Meaning |
|---|---|
| carl.error | Base class. Generic failure. |
| carl.config | Invalid or missing configuration. |
| carl.validation | Input failed schema / value validation. |
| carl.credential | Missing or expired credential. |
| carl.network | Transient or persistent network failure. |
| carl.budget | Spend cap exceeded. |
| carl.permission | Permission / consent gate failed. |
| carl.timeout | Operation exceeded its deadline. |
| carl.freshness.stale_pkg | Installed package older than recommended floor. |
| carl.freshness.camp_session_expired | carl.camp session needs carl camp login. |
| carl.eml.depth_exceeded | EML tree exceeded depth bound. |
| carl.eml.domain_error | EML operator applied outside its valid domain. |
| carl.eml.decode_error | EML canonical-encoding decode failed. |
| carl.eml.signature_mismatch | Signed EML head failed HMAC verification. |
CARLError.to_dict() produces a secrets-redacted, telemetry-safe payload. See packages/carl-core/src/carl_core/errors.py for the full hierarchy.
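To show what matching on stable codes can look like in calling code, here is a self-contained sketch. It does not import carl_core: the DemoError classes are stand-ins written for this example, and the code attribute, the redaction rule, and the retry decision are all assumptions, not the library's documented behaviour. Check carl_core/errors.py for the real attribute names.

```python
class DemoError(Exception):
    """Stand-in for a CARLError-like base class (hypothetical, not the real one)."""

    code = "carl.error"

    def __init__(self, message, context=None):
        super().__init__(message)
        self.context = dict(context or {})

    def to_dict(self):
        # Assumed redaction rule: mask any value whose key looks secret-bearing.
        markers = ("key", "token", "secret", "password")
        safe = {
            k: "***" if any(m in k.lower() for m in markers) else v
            for k, v in self.context.items()
        }
        return {"code": self.code, "message": str(self), "context": safe}


class DemoNetworkError(DemoError):
    code = "carl.network"


class DemoBudgetError(DemoError):
    code = "carl.budget"


def should_retry(err):
    # Assumption: only network and timeout codes are worth retrying.
    return err.code in {"carl.network", "carl.timeout"}


try:
    raise DemoNetworkError("connection reset", {"api_key": "sk-demo", "host": "example.org"})
except DemoError as err:
    payload = err.to_dict()
    retry = should_retry(err)

print(payload, retry)
```

Branching on the code string rather than the exception class keeps callers working if the class hierarchy is reorganized while the codes stay stable.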
Use
See inside a Trackio run (no GPU required, base install):
carl observe --url https://your-trackio-space.hf.space/ --run your-run
If the dashboard contains multiple projects, add --project your-project.
Train with coherence rewards (carl-studio[training]):
carl project init
carl train --config carl.yaml
carl run list
Or run directly from the CLI:
carl train --model your-org/your-base-model --method grpo --dataset your-org/your-dataset --output-repo your-org/your-model --compute a100-large
Gate a checkpoint (carl-studio[training]):
carl eval --adapter your-username/your-model
How It Works
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────┐ ┌──────┐
│ Observe │ ──> │ Measure │ ──> │ Train │ ──> │ Gate │ ──> │ Ship │
│ │ │ Phi │ │ CARL │ │ │ │ │
└─────────┘ └─────────┘ └─────────┘ └──────┘ └──────┘
point at entropy + task rewards cascade push to
any run order param + coherence auto-fires hub
Observe — Point CARL at a Trackio dashboard or log file. Instantly see Phi trajectory, entropy, phase state, health.
Measure — Phi = 1 - H(P)/log|V|. Zero means maximum uncertainty. One means complete coherence. Computed per token, every step.
Train — Five reward functions in a cascade. Task rewards teach what. CARL rewards teach how coherently.
Gate — The cascade auto-calibrates from the training signal. No hardcoded thresholds. CARL activates only when the model demonstrates sustained capability.
Ship — Eval gate passes → checkpoint pushed to Hub.
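The Measure formula, Phi = 1 - H(P)/log|V|, can be checked by hand for a single token. The sketch below uses only the standard library and is not the CoherenceProbe implementation; the function name phi is chosen for this example, and it assumes natural-log entropy over a softmax of the logits with at least two vocabulary entries.

```python
import math


def phi(logits):
    """Phi = 1 - H(P) / log|V| for one token's logits (a sequence of floats)."""
    m = max(logits)
    # Stable log-softmax: subtract the max before exponentiating.
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_p = [x - log_z for x in logits]
    # Shannon entropy in nats; exp(lp) is the probability of each entry.
    entropy = -sum(math.exp(lp) * lp for lp in log_p)
    return 1.0 - entropy / math.log(len(logits))


phi_uniform = phi([0.0] * 8)                     # all 8 entries equally likely
phi_peaked = phi([50.0] + [0.0] * 7)             # nearly all mass on one entry
phi_two_way = phi([0.0, 0.0, -1e9, -1e9])        # a coin flip inside a 4-entry vocab

print(phi_uniform, phi_peaked, phi_two_way)
```

A uniform distribution gives Phi near 0, a sharply peaked one gives Phi near 1, and an even split between two of four entries gives 0.5, since H = log 2 and log|V| = log 4.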
CLI Install Matrix
| Workflow | Command | Install |
|---|---|---|
| One-shot observe | carl observe --url ... --run ... | pip install carl-studio |
| Live observe | carl observe --live ... | pip install 'carl-studio[tui]' |
| Claude diagnosis | carl observe --diagnose ... | pip install 'carl-studio[observe]' |
| Local train/eval | carl train, carl eval | pip install 'carl-studio[training]' |
| HF job management / publish | carl run status, carl run logs, carl run stop, carl push | pip install 'carl-studio[hf]' |
| Camp account + marketplace | carl camp account, carl camp login, carl camp logout, carl camp credits, carl camp marketplace | platform features (optional) |
| Privacy consent | carl camp consent show, carl camp consent update | included |
| x402 payment rail | carl camp x402 configure, carl camp x402 status | included |
| Contract witnessing | carl camp contract sign, carl camp contract verify | included |
| Constitutional ledger | carl contract constitution genesis\|verify\|evaluate\|status | pip install 'carl-studio[constitutional]' |
| Carlito management | carl carlito list, carl carlito spawn, carl carlito show | included |
Managed tiers build on top of these open workflows; extras control local capabilities, not research access.
Provider credentials unlock provider workflows, not CARL Paid platform access. Use carl camp account to inspect managed account state, credits, and enabled wallet/x402 capabilities. Privacy consent is managed locally with carl camp consent — all flags default off.
Credential Matrix
| Workflow | Auth |
|---|---|
| Local file observe | none |
| Public Trackio observe | none |
| Claude diagnosis / chat | ANTHROPIC_API_KEY or --api-key |
| Hub jobs / push / gated model access | HF_TOKEN or prior HF login |
| RunPod backend | RUNPOD_API_KEY |
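As a pre-flight check, the matrix above can be turned into a small lookup. This is a sketch, not part of the carl CLI: the names REQUIRED_ENV and missing_credentials are made up for this example, and it only inspects environment variables, so it will report HF_TOKEN as missing even when a prior HF login or an --api-key flag would satisfy the workflow.

```python
import os

# Workflow -> environment variables from the Credential Matrix.
# An empty tuple means the workflow needs no credential.
REQUIRED_ENV = {
    "Local file observe": (),
    "Public Trackio observe": (),
    "Claude diagnosis / chat": ("ANTHROPIC_API_KEY",),
    "Hub jobs / push / gated model access": ("HF_TOKEN",),
    "RunPod backend": ("RUNPOD_API_KEY",),
}


def missing_credentials(env):
    """Map each workflow to the env vars that are unset or empty in `env`."""
    return {
        workflow: [name for name in names if not env.get(name)]
        for workflow, names in REQUIRED_ENV.items()
    }


# Demo with an explicit mapping; pass os.environ to check the current shell.
report = missing_credentials({"HF_TOKEN": "hf_demo"})
for workflow, missing in report.items():
    print(f"{workflow}: {'ready' if not missing else 'missing ' + ', '.join(missing)}")
```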
Results
Trained with CARL on OmniCoder-9B:
| Metric | Value |
|---|---|
| Task completion | 92% |
| Tool format compliance | 99% |
| Mean tool calls per task | 11.09 |
| Phase 2' eval gate | PASS |
80 GRPO steps. Five reward functions. Self-calibrating cascade gate.
What's new (v0.18.1 · 2026-04-24)
Unified entry-point router + sessions + trust + journey coverage matrix.
- One carl binary, four entry modes. carl (REPL), carl "<prompt>" (REPL with first turn), carl -p "<q>" (one-shot, trust-bypass), carl <verb> (Typer dispatch). Router at src/carl_studio/cli/entry.py; contract docs at docs/v18_journey_coverage.md.
- carl trust: bare-entry trust pre-check. trust status/acknowledge/enable/disable/reset with prior-root eviction notice; persisted at ~/.carl/trust.yaml.
- carl session list/show/delete: project-aware. Walks up via project_context.current so you can invoke from any subdir of a project.
- carl init --json probe-only fast-path. Seven stable probe keys (first_run_complete, camp_session, llm_provider_detected, training_extras_healthy, project_config_present, consent_set, context_present). No prompts on piped stdin; contract locked by tests/journeys/test_journeys_v18.py.
- Journey matrix. 12 journeys × 4 transitions = 48 transitions, covered by 172 passing tests (164 pre-existing + 8 new journey tests). Batch spec for parallel UAT execution at tests/journeys/BATCHES.md.
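A script that consumes carl init --json may want to confirm the seven probe keys are present before relying on them. The sketch below checks key names only. It assumes the output is a flat JSON object with those keys at the top level, and the boolean values in the sample payload are a guess, since this README does not specify value types.

```python
import json

# The seven probe keys listed in the v0.18.1 notes.
PROBE_KEYS = frozenset({
    "first_run_complete",
    "camp_session",
    "llm_provider_detected",
    "training_extras_healthy",
    "project_config_present",
    "consent_set",
    "context_present",
})


def check_probe(raw):
    """Return (missing, unexpected) key sets for a JSON object string."""
    keys = set(json.loads(raw))
    return PROBE_KEYS - keys, keys - PROBE_KEYS


# Hypothetical payload with one key deliberately left out.
sample = json.dumps({key: False for key in PROBE_KEYS - {"consent_set"}})
missing, unexpected = check_probe(sample)
print(sorted(missing), sorted(unexpected))
```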
What's new (v0.9.0) — still applies
EML symbolic witness — third realizability primitive alongside BITC and DMC.
- New reward option: reward_class="eml". Depth-3 learnable tree, 7 parameters, +0.972 correlation with PhaseAdaptive: a nearly indistinguishable signal at ~10x parameter efficiency. Benchmarks in scripts/eml_reward_benchmark.md.
- Resonants: a new entity class. carl_core.resonant.Resonant + compose_resonants enable typed, depth-bounded (MAX_DEPTH=4) composition of reward / policy primitives without ad-hoc schema drift.
- Constitutional ledger. New subcommand carl contract constitution (genesis | verify | evaluate | status): hash-chained append-only ledger over action features (25-dim encoding). Install via:
pip install 'carl-studio[constitutional]' # pulls pynacl>=1.5
- Public EML paper: see the upstream Observable Computation bundle for eml-symbolic-witness.md (numerical verification: ln identity max absolute error 4.44e-16 over 990 sample points on x ∈ [0.1, 10) at 0.01 step).
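For readers new to hash-chained append-only ledgers, the idea behind the constitutional ledger can be sketched generically. This is not CARL's ledger format: the entry layout, the SHA-256 choice, the all-zero genesis value, and the function names are assumptions for illustration, and the sketch omits signing entirely.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry (assumption)


def entry_hash(prev_hash, features):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "features": features}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger, features):
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev, "features": features, "hash": entry_hash(prev, features)})


def verify(ledger):
    """Recompute every hash in order; any edit to history breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["features"]):
            return False
        prev = entry["hash"]
    return True


ledger = []
append(ledger, [0.0] * 25)           # a 25-dim feature vector, as in the text
append(ledger, [1.0] + [0.0] * 24)
ok_before = verify(ledger)
ledger[0]["features"][0] = 9.0       # tamper with the first entry
ok_after = verify(ledger)
print(ok_before, ok_after)
```

Because each hash covers the previous hash, changing an old entry invalidates that entry and every one after it, which is what makes an append-only log tamper-evident.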
Papers
The math is published and independently reproducible. CARL ships a
four-paper in-repo series under paper/ and cites the
upstream Zenodo work for the conservation law and identity proof.
CARL Methods Series (in-repo, drafts):
- paper/01-main-carl.md: Coherence-Aware Reinforcement Learning (main paper)
- paper/02-phase-adaptive-methods.md: Phase-Adaptive Coherence Rewards
- paper/03-coherence-trap-technical-note.md: The Coherence Trap (technical note)
- paper/04-interaction-chains-witness-logs.md: Interaction Chains as Witness Logs
Index and cross-reference table: docs/paper_series.md.
Upstream foundations (Zenodo):
- Bounded Informational Time Crystals — derives the conservation law
- Material Reality — validates across 6,244 trials
- Semantic Realizability — formal proof
Reference
Architecture, API, CLI commands, environments, compute backends → docs/reference.md
Credential setup and provider auth → docs/auth.md
Changelog
Full history lives in CHANGELOG.md; the most recent entries:
v0.18.1 (2026-04-24) — unified entry-point + journey matrix
- Unified router (cli/entry.py) picks between REPL / bare-prompt / one-shot (-p) / subcommand.
- carl trust: bare-entry trust pre-check registry at ~/.carl/trust.yaml.
- carl session: project-aware, walks up via project_context.current.
- carl init --json: probe-only fast-path with 7 stable keys; never prompts on piped stdin.
- Journey matrix + BATCHES spec at tests/journeys/; 172 tests green on v0.18 surface.
- Fixture discipline: HOME-pinned tests place the project at tmp_path/"proj" (home-guard invariant).
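The four-way choice the unified router makes can be pictured as a small classifier over the command-line arguments. This is a guess at the shape of the decision, not the logic in cli/entry.py: the function name entry_mode, the mode labels, and the order of the checks are assumptions, and the verb set is only a subset of commands named in this README.

```python
def entry_mode(argv, verbs):
    """Classify CLI arguments (without the program name) into an entry mode."""
    if not argv:
        return "repl"                   # bare `carl`
    if argv[0] == "-p":
        return "one-shot"               # `carl -p "<q>"`
    if argv[0] in verbs:
        return "subcommand"             # `carl <verb> ...`
    return "repl-with-first-turn"       # `carl "<prompt>"`


VERBS = {"init", "chat", "ask", "train", "doctor"}  # illustrative subset only

examples = [
    [],
    ["summarize my last run"],
    ["-p", "what is phi?"],
    ["train", "--config", "carl.yaml"],
]
modes = [entry_mode(args, VERBS) for args in examples]
print(modes)
```

One consequence of this ordering is that a free-form prompt whose first word is a known verb would be treated as a subcommand, so a real router needs an explicit rule for that collision.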
v0.7.1 (2026-04-19) — Phase-2b close-out
- x402 spend caps (daily + session) + confirm_payment hook.
- MCP per-request session state: _session global replaced with MCPServerConnection.session; FastMCP Context DI on authenticated tools.
- carl metrics serve: Prometheus text-format scrape endpoint (metrics extra); heartbeat auto-hosts when CARL_METRICS_PORT is set.
- carl run diff <a> <b>: trajectory delta (phi, q_hat, crystallizations) with optional --steps alignment.
- Shared GatingPredicate Protocol + carl.gate.* error namespace across consent_gate and tier_gate.
- Heartbeat maintenance wrapped in RetryPolicy(max_attempts=3) for transient sqlite/IO.
- CARL_HOME env now honored uniformly (db.py, settings.py, wallet_store.py, llm.py).
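The bounded-retry pattern behind a three-attempt policy for transient IO can be sketched as a plain function. This is not carl_core's RetryPolicy, whose signature is not shown in this README: retry_call, base_delay, and retry_on are names invented for the example, and exponential backoff is an assumed strategy.

```python
import time


def retry_call(fn, max_attempts=3, base_delay=0.1, retry_on=(OSError,)):
    """Call fn(), retrying on the given exception types up to max_attempts tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


calls = {"n": 0}


def flaky():
    """Fail twice with a transient error, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("transient IO failure")
    return "ok"


result = retry_call(flaky, base_delay=0.0)  # zero delay keeps the demo instant
print(result, calls["n"])
```

To cover sqlite errors as well, one could pass retry_on=(sqlite3.OperationalError, OSError) after importing sqlite3.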
terminals.tech · PyPI · Paper · Docs
MIT — Intuition Labs LLC