CARL (Coherence-Aware Reinforcement Learning) — information-theoretic reward signals for LLM training via token-level probability distributions

CARL

Coherence-Aware Reinforcement Learning


Why

A model becomes an agent when it stops pattern-matching and starts knowing. That transition isn't gradual — it's a phase transition, like water becoming ice. One moment the model is guessing. The next, it's coherent.

Standard training can't see this happening. You watch a loss curve and hope.

CARL measures the moment of crystallization — and rewards it.

                         Phi (order parameter)
                              │
          guessing            │         knowing
     ░░░░░░░░░░░░░░░░░░░░░░░░│████████████████████████
                              │
                        crystallization

The order parameter Phi measures how coherent a model's probability field is at every token. When Phi crystallizes, the model has found its internal anchor — a fixed point it can navigate from to any concept space without losing itself.

This is alignment you can measure, not just evaluate.


Quick start

Measure coherence on any array of logits, with no training, no GPU, and no API key. Pure numpy:

from carl_core import CoherenceProbe, KAPPA, SIGMA
import numpy as np

vocab_size = 32_000
probe = CoherenceProbe(vocab_size=vocab_size)

# Any [T, V] logits + [T] chosen tokens. Here: 16 tokens from a 32k vocab.
logits = np.random.randn(16, vocab_size)
token_ids = np.argmax(logits, axis=-1)

snap = probe.measure(logits, token_ids)
print(f"phi_mean = {snap.phi_mean:.3f}   (crystallization target: ≥ {SIGMA})")
print(f"horizon  = KAPPA·d ≈ {int(KAPPA * vocab_size):,} tokens")

Install (just the observables layer):

pip install carl-studio

That gives you carl-core + the base CLI + one-shot observe. For training + HF + Claude observability:

pip install 'carl-studio[quickstart]'

Full extras matrix, reproducible installs via uv.lock, and conflict rules (e.g. wallet vs x402) live in docs/INSTALL.md.

CLI quickstart

carl init                  # one-shot setup: account, provider, extras, project, consent
carl chat                  # agent — interactive loop
carl ask "train a small model on gsm8k"   # agent — one-shot prompt
carl research search "coherence-aware reinforcement learning"

carl init is idempotent: re-running it after setup does nothing unless you pass --force. A first-run marker lives at ~/.carl/.initialized.

Bare carl is an entry surface, not a documented top-level workflow by itself:

  • on a TTY, first run can route into carl init, and a configured project can route into chat
  • on non-TTY input, bare carl prints help plus a nudge toward carl chat and carl ask

Auth

CARL Studio does not require a .env, and it does not auto-load one.

  • Hugging Face workflows work with either HF_TOKEN or a prior hf auth login / huggingface-cli login
  • Claude-powered features use ANTHROPIC_API_KEY or --api-key
  • RunPod uses RUNPOD_API_KEY
  • public Trackio observe works without credentials

If you want a template, copy .env.example and load it into your shell before running carl:

cp .env.example .env
set -a
source .env
set +a

Quick setup:

hf auth login
export ANTHROPIC_API_KEY=sk-ant-xxx   # only for --diagnose / chat
carl start

Full auth details: docs/auth.md

Primary commands

| Command | What it does |
| --- | --- |
| carl init | One-shot setup: account, provider, extras, project, consent. |
| carl chat | Interactive agent loop with tools, sessions, cost tracking. |
| carl ask "<prompt>" | One-shot agent invocation. |
| carl research search "<query>" | Search and retrieve research papers (carl-studio[research]). |
| carl flow "/a /b /c" | Chain named operations, emit a shared interaction trace. |
| carl doctor | Readiness audit. Prints blocking issues and freshness findings. |
| carl train | Local training with coherence rewards (carl-studio[training]). |

Run carl start --inventory for the full installed command map, or carl flow --list for every chainable op.

Architecture

  • carl-core — primitive layer. Typed errors, retry/backoff, safepath sandboxing, content hashing, tier gating, coherence math, interaction chains. Zero training deps.
  • carl-studio — the CLI, agent loop, training pipeline, MCP server, camp client, eval sandbox. Everything above builds on carl-core.

carl-core is installed alongside carl-studio; public callers import from carl_core.* directly. The legacy carl_studio.primitives shim was removed after v0.5.0.

Error contract

Fatal paths raise carl_core.errors.CARLError subclasses with stable codes you can match programmatically. Top codes:

| Code | Meaning |
| --- | --- |
| carl.error | Base class. Generic failure. |
| carl.config | Invalid or missing configuration. |
| carl.validation | Input failed schema / value validation. |
| carl.credential | Missing or expired credential. |
| carl.network | Transient or persistent network failure. |
| carl.budget | Spend cap exceeded. |
| carl.permission | Permission / consent gate failed. |
| carl.timeout | Operation exceeded its deadline. |
| carl.freshness.stale_pkg | Installed package older than recommended floor. |
| carl.freshness.camp_session_expired | carl.camp session needs carl camp login. |
| carl.eml.depth_exceeded | EML tree exceeded depth bound. |
| carl.eml.domain_error | EML operator applied outside its valid domain. |
| carl.eml.decode_error | EML canonical-encoding decode failed. |
| carl.eml.signature_mismatch | Signed EML head failed HMAC verification. |

CARLError.to_dict() produces a secrets-redacted, telemetry-safe payload. See packages/carl-core/src/carl_core/errors.py for the full hierarchy.
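
Matching on stable codes can be sketched as follows. This is a minimal, self-contained illustration, not the library's API: DemoCARLError is a hypothetical stand-in for carl_core.errors.CARLError, the `code` attribute name is an assumption (the text above only says that errors carry stable codes), and the handling strategies are invented for the example.

```python
from typing import Optional


class DemoCARLError(Exception):
    """Hypothetical stand-in for carl_core.errors.CARLError (not the real class)."""

    code = "carl.error"  # assumed attribute name for the stable code

    def __init__(self, message: str, code: Optional[str] = None):
        super().__init__(message)
        if code is not None:
            self.code = code


def classify(err: DemoCARLError) -> str:
    """Map a dotted error code to a coarse handling strategy (illustrative only)."""
    code = err.code
    if code in ("carl.network", "carl.timeout"):
        return "retry"
    if code in ("carl.credential", "carl.freshness.camp_session_expired"):
        return "reauthenticate"
    if code.startswith("carl.eml."):
        # Any code in the carl.eml.* namespace is treated as bad input here.
        return "reject-input"
    return "abort"
```

Dispatching on the dotted prefix rather than on exception classes keeps a handler working if new codes are added under an existing namespace such as carl.eml.*.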

Use

See inside a Trackio run (no GPU required, base install):

carl observe --url https://your-trackio-space.hf.space/ --run your-run

If the dashboard contains multiple projects, add --project your-project.

Train with coherence rewards (carl-studio[training]):

carl project init
carl train --config carl.yaml
carl run list

Or run directly from the CLI:

carl train --model your-org/your-base-model --method grpo --dataset your-org/your-dataset --output-repo your-org/your-model --compute a100-large

Gate a checkpoint (carl-studio[training]):

carl eval --adapter your-username/your-model

How It Works

 ┌─────────┐     ┌─────────┐     ┌─────────┐     ┌──────┐     ┌──────┐
 │ Observe │ ──> │ Measure │ ──> │  Train  │ ──> │ Gate │ ──> │ Ship │
 │         │     │   Phi   │     │  CARL   │     │      │     │      │
 └─────────┘     └─────────┘     └─────────┘     └──────┘     └──────┘
  point at        entropy +       task rewards     cascade      push to
  any run         order param     + coherence      auto-fires   hub

Observe — Point CARL at a Trackio dashboard or log file. Instantly see Phi trajectory, entropy, phase state, health.

Measure — Phi = 1 - H(P)/log|V|. Zero means maximum uncertainty. One means complete coherence. Computed per token, every step.
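
The formula above can be reproduced in a few lines of numpy. In this sketch H(P) is taken to be the Shannon entropy of the softmax distribution at each position and |V| the vocabulary size; the function name phi_from_logits is illustrative, and whether CoherenceProbe computes or aggregates Phi in exactly this way is an assumption.

```python
import numpy as np


def phi_from_logits(logits: np.ndarray) -> np.ndarray:
    """Per-token Phi = 1 - H(P) / log|V| for a [T, V] array of finite logits."""
    # Stable log-softmax: subtract the row maximum before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_z = np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    log_p = shifted - log_z
    p = np.exp(log_p)
    # Shannon entropy per row, in nats.
    entropy = -(p * log_p).sum(axis=-1)
    # Dividing by log|V| normalizes entropy to [0, 1], so Phi is base-independent.
    return 1.0 - entropy / np.log(logits.shape[-1])


uniform = np.zeros((2, 8))          # equal logits: maximum uncertainty
peaked = np.full((1, 8), -1e9)      # almost all mass on a single token
peaked[0, 3] = 0.0
print(phi_from_logits(uniform), phi_from_logits(peaked))
```

A uniform distribution gives Phi of 0 and a distribution concentrated on one token gives Phi of 1, matching the two endpoints described above.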

Train — Five reward functions in a cascade. Task rewards teach what. CARL rewards teach how coherently.

Gate — The cascade auto-calibrates from the training signal. No hardcoded thresholds. CARL activates only when the model demonstrates sustained capability.
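
One way a gate could calibrate itself from the training signal is sketched below. This is not CARL's actual cascade: the SustainedGate class, the median baseline, and the `window` parameter are all assumptions made for illustration. The threshold is derived from earlier task rewards rather than fixed in advance, and the gate latches open only after a full window of recent rewards stays above that baseline.

```python
from statistics import median


class SustainedGate:
    """Latching gate that opens once recent rewards clear a self-derived baseline."""

    def __init__(self, window=5):
        self.window = window   # illustrative knob, not a CARL setting
        self.history = []
        self.is_open = False

    def update(self, task_reward):
        """Record one step's task reward and return whether the gate is open."""
        self.history.append(float(task_reward))
        if not self.is_open and len(self.history) >= 2 * self.window:
            # Baseline comes from the signal itself: the median of older rewards.
            baseline = median(self.history[:-self.window])
            recent = self.history[-self.window:]
            # "Sustained" here means every recent reward beats the baseline.
            self.is_open = min(recent) > baseline
        return self.is_open


gate = SustainedGate(window=5)
states = [gate.update(r) for r in [0.1] * 5 + [0.9] * 5]
print(states)
```

In a training loop, a coherence reward term would be added to the task reward only on steps where update() returns True.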

Ship — Eval gate passes → checkpoint pushed to Hub.


CLI Install Matrix

| Workflow | Command | Install |
| --- | --- | --- |
| One-shot observe | carl observe --url ... --run ... | pip install carl-studio |
| Live observe | carl observe --live ... | pip install 'carl-studio[tui]' |
| Claude diagnosis | carl observe --diagnose ... | pip install 'carl-studio[observe]' |
| Local train/eval | carl train, carl eval | pip install 'carl-studio[training]' |
| HF job management / publish | carl run status, carl run logs, carl run stop, carl push | pip install 'carl-studio[hf]' |
| Camp account + marketplace | carl camp account, carl camp login, carl camp logout, carl camp credits, carl camp marketplace | platform features (optional) |
| Privacy consent | carl camp consent show, carl camp consent update | included |
| x402 payment rail | carl camp x402 configure, carl camp x402 status | included |
| Contract witnessing | carl camp contract sign, carl camp contract verify | included |
| Constitutional ledger | carl contract constitution genesis\|verify\|evaluate\|status | pip install 'carl-studio[constitutional]' |
| Carlito management | carl carlito list, carl carlito spawn, carl carlito show | included |

Managed tiers build on top of these open workflows; extras control local capabilities, not research access.

Provider credentials unlock provider workflows, not CARL Paid platform access. Use carl camp account to inspect managed account state, credits, and enabled wallet/x402 capabilities. Privacy consent is managed locally with carl camp consent — all flags default off.

Credential Matrix

| Workflow | Auth |
| --- | --- |
| Local file observe | none |
| Public Trackio observe | none |
| Claude diagnosis / chat | ANTHROPIC_API_KEY or --api-key |
| Hub jobs / push / gated model access | HF_TOKEN or prior HF login |
| RunPod backend | RUNPOD_API_KEY |
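
A quick pre-flight check of the environment variables in the matrix above might look like the sketch below. The helper name missing_credentials is hypothetical and not part of carl. It only inspects environment variables, so it cannot see a prior Hugging Face login or a key passed via --api-key, and a workflow it reports as missing may still work.

```python
import os

# Environment variables named in the credential matrix above.
WORKFLOW_ENV = {
    "Claude diagnosis / chat": "ANTHROPIC_API_KEY",
    "Hub jobs / push / gated model access": "HF_TOKEN",
    "RunPod backend": "RUNPOD_API_KEY",
}


def missing_credentials(env=None):
    """Return {workflow: variable} for every variable that is unset or empty."""
    env = os.environ if env is None else env
    return {wf: var for wf, var in WORKFLOW_ENV.items() if not env.get(var)}


for workflow, var in missing_credentials().items():
    print(f"{workflow}: {var} is not set")
```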

Results

Trained with CARL on OmniCoder-9B:

| Metric | Value |
| --- | --- |
| Task completion | 92% |
| Tool format compliance | 99% |
| Mean tool calls per task | 11.09 |
| Phase 2' eval gate | PASS |

80 GRPO steps. Five reward functions. Self-calibrating cascade gate.


What's new (v0.18.1 · 2026-04-24)

Unified entry-point router + sessions + trust + journey coverage matrix.

  • One carl binary, four entry modes. carl (REPL), carl "<prompt>" (REPL with first turn), carl -p "<q>" (one-shot, trust-bypass), carl <verb> (Typer dispatch). Router at src/carl_studio/cli/entry.py; contract docs at docs/v18_journey_coverage.md.
  • carl trust — bare-entry trust pre-check. trust status/acknowledge/enable/disable/reset with prior-root eviction notice; persisted at ~/.carl/trust.yaml.
  • carl session list/show/delete — project-aware. Walks up via project_context.current so you can invoke from any subdir of a project.
  • carl init --json probe-only fast-path. Seven stable probe keys (first_run_complete, camp_session, llm_provider_detected, training_extras_healthy, project_config_present, consent_set, context_present). No prompts on piped stdin; contract locked by tests/journeys/test_journeys_v18.py.
  • Journey matrix. 12 journeys × 4 transitions = 48 transitions, covered by 172 passing tests (164 pre-existing + 8 new journey tests). Batch spec for parallel UAT execution at tests/journeys/BATCHES.md.
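
A script consuming the carl init --json probe could validate the seven keys listed above before relying on them. The sketch below assumes the command prints a single JSON object with those keys at the top level; the exact output shape and the value types are not stated here, so treat both as assumptions. The helper name check_probe is hypothetical.

```python
import json

# The seven stable probe keys listed above.
PROBE_KEYS = (
    "first_run_complete",
    "camp_session",
    "llm_provider_detected",
    "training_extras_healthy",
    "project_config_present",
    "consent_set",
    "context_present",
)


def check_probe(raw: str) -> dict:
    """Parse probe output and fail loudly if any expected key is absent."""
    payload = json.loads(raw)
    missing = [key for key in PROBE_KEYS if key not in payload]
    if missing:
        raise KeyError(f"probe output missing keys: {missing}")
    # Keep only the documented keys so extra fields do not leak downstream.
    return {key: payload[key] for key in PROBE_KEYS}
```

The raw string would typically come from something like subprocess.run(["carl", "init", "--json"], capture_output=True, text=True).stdout.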

What's new (v0.9.0) — still applies

EML symbolic witness — third realizability primitive alongside BITC and DMC.

  • New reward option: reward_class="eml". Depth-3 learnable tree, 7 parameters, +0.972 correlation with PhaseAdaptive — a nearly-indistinguishable signal at ~10x parameter efficiency. Benchmarks in scripts/eml_reward_benchmark.md.
  • Resonants — a new entity class. carl_core.resonant.Resonant + compose_resonants enables typed, depth-bounded (MAX_DEPTH=4) composition of reward / policy primitives without ad-hoc schema drift.
  • Constitutional ledger. New subcommand carl contract constitution (genesis | verify | evaluate | status) — hash-chained append-only ledger over action features (25-dim encoding). Install via:
pip install 'carl-studio[constitutional]'   # pulls pynacl>=1.5
  • Public EML paper — see the upstream Observable Computation bundle for eml-symbolic-witness.md (numerical verification: ln identity max absolute error 4.44e-16 over 990 sample points on x ∈ [0.1, 10) at 0.01 step).
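
The idea behind a hash-chained append-only ledger, as mentioned for the constitutional ledger above, can be shown with the standard library alone. This is a generic sketch of the technique and not CARL's ledger format: the entry layout, the all-zero genesis value, and the use of SHA-256 are assumptions, and the signing step that pynacl would provide and the 25-dim feature encoding are left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry (assumed)


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(ledger: list, payload: dict) -> None:
    """Append a new entry that commits to everything before it."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(ledger: list) -> bool:
    """Recompute every link; an edit to any earlier entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, rewriting an old entry requires recomputing every later hash, which is what makes tampering detectable.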

Papers

The math is published and independently reproducible. CARL ships a four-paper in-repo series under paper/ and cites the upstream Zenodo work for the conservation law and identity proof.

CARL Methods Series (in-repo, drafts):

Index and cross-reference table: docs/paper_series.md.

Upstream foundations (Zenodo):


Reference

Architecture, API, CLI commands, environments, compute backends → docs/reference.md

Credential setup and provider auth → docs/auth.md


Changelog

Full history lives in CHANGELOG.md; the most recent entries:

v0.18.1 (2026-04-24) — unified entry-point + journey matrix

  • Unified router (cli/entry.py) picks between REPL / bare-prompt / one-shot (-p) / subcommand.
  • carl trust — bare-entry trust pre-check registry at ~/.carl/trust.yaml.
  • carl session — project-aware, walks up via project_context.current.
  • carl init --json — probe-only fast-path with 7 stable keys; never prompts on piped stdin.
  • Journey matrix + BATCHES spec at tests/journeys/; 172 tests green on v0.18 surface.
  • Fixture discipline: HOME-pinned tests place the project at tmp_path/"proj" (home-guard invariant).

v0.7.1 (2026-04-19) — Phase-2b close-out

  • x402 spend caps (daily + session) + confirm_payment hook.
  • MCP per-request session state — _session global replaced with MCPServerConnection.session; FastMCP Context DI on authenticated tools.
  • carl metrics serve — Prometheus text-format scrape endpoint (metrics extra); heartbeat auto-hosts when CARL_METRICS_PORT is set.
  • carl run diff <a> <b> — trajectory delta (phi, q_hat, crystallizations) with optional --steps alignment.
  • Shared GatingPredicate Protocol + carl.gate.* error namespace across consent_gate and tier_gate.
  • Heartbeat maintenance wrapped in RetryPolicy(max_attempts=3) for transient sqlite/IO.
  • CARL_HOME env now honored uniformly (db.py, settings.py, wallet_store.py, llm.py).

terminals.tech · PyPI · Paper · Docs

MIT — Intuition Labs LLC
