
Meta-Reasoning

Cognitive Heteronomy for LLMs

Reasoning is not a property of the model — it is an emergent dynamic of external control.

An SDK that rejects the illusion of autonomous LLM reasoning. Instead of treating language models as cognitive agents, Meta-Reasoning introduces cognitive heteronomy: reasoning is governed, observed, and mutated from the outside.

The model doesn't think. It executes. The thinking happens in the architecture around it.

🌐 Meta-Reasoning Website: https://tictacguy.github.io/Meta-Reasoning/

🔌 Native Integrations

  • 🟠 Claude Code: Native tool definitions with strict JSON schemas. Claude plans multi-step cognitive executions autonomously. (integrations/claude/)
  • 🦞 OpenClaw: Declarative plugin with capability discovery, cost/risk metadata, and autonomous chaining. (integrations/openclaw/)
  • 🤖 Codex: Typed Pydantic API. Codex generates correct calls by reading the type contracts. (integrations/codex/)

Core Thesis

LLMs are generative substrates, not minds. What is commonly called "reasoning" is pattern replay — not deliberation. This SDK externalizes all meta-cognitive functions into a Cognitive Controller that:

  • Observes the form of reasoning (not its content)
  • Measures trajectory, redundancy, stall, and premature convergence
  • Mutates the reasoning process through formal constraint operators
  • Records cognitive trajectories in an Epistemic Ledger

No self-reflection. No "think step by step". No autonomous agents.

Architecture

Level 1 — Generative Substrate (LLM)

Produces text and structures. Decides nothing. Stateless by design.

Level 2 — Cognitive Controller

The heart. Semantically blind — it doesn't evaluate truth, it evaluates cognitive form:

  • Entropy of reasoning moves
  • Strategy repetition index
  • Depth without novelty
  • Constraint violation rate
  • Premature closure score
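Several of these signals fall out of the move sequence alone, with no access to content. A minimal sketch of two of them (illustrative only; `move_entropy` and `repetition_index` are hypothetical names, not the SDK's `metrics.py` implementation):

```python
import math
from collections import Counter

def move_entropy(moves: list[str]) -> float:
    """Shannon entropy (bits) of the move distribution.
    Low entropy signals strategy collapse: the model replays one pattern."""
    total = len(moves)
    return -sum((n / total) * math.log2(n / total) for n in Counter(moves).values())

def repetition_index(moves: list[str]) -> float:
    """Share of the trace occupied by the single most frequent move."""
    return Counter(moves).most_common(1)[0][1] / len(moves)

mixed = ["assumption", "deduction", "analogy", "contradiction"]
rigid = ["deduction", "deduction", "deduction", "deduction"]
print(move_entropy(mixed))       # → 2.0 (maximal diversity over four moves)
print(repetition_index(rigid))   # → 1.0 (total strategy collapse)
```

Because the controller only ever sees distributions like these, it stays semantically blind by construction.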

Level 3 — Epistemic Ledger

Not RAG. Not content memory. A structural trace of:

  • Cognitive transformations attempted
  • Strategies that produced stall
  • Failure maps that prevent regression
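In spirit, each ledger record captures cognitive form, never answer text. A hypothetical sketch of the idea (the real schema in `ledger.py` may differ; `LedgerEntry`, `Ledger`, and `stall_map` are illustrative names):

```python
from dataclasses import dataclass, field

@dataclass
class LedgerEntry:
    """One structural record: no answer text, only cognitive form."""
    cycle: int
    moves: list[str]
    mutations: list[str]   # e.g. "BAN:analogy"
    outcome: str           # "progress" | "stall" | "collapse"

@dataclass
class Ledger:
    entries: list[LedgerEntry] = field(default_factory=list)

    def stall_map(self) -> set[str]:
        """Mutations that preceded a stall: the failure map the controller
        consults to avoid regressing into dead cognitive space."""
        return {m for e in self.entries if e.outcome == "stall" for m in e.mutations}

ledger = Ledger([
    LedgerEntry(1, ["deduction"], ["BAN:analogy"], "stall"),
    LedgerEntry(2, ["analogy"], ["REQUIRE:analogy"], "progress"),
])
print(ledger.stall_map())   # → {'BAN:analogy'}
```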

Key Concepts

Structured Output Protocol

Every LLM generation must include a formal reasoning trace:

{
  "content": "...",
  "reasoning_trace": {
    "moves": ["assumption", "deduction", "analogy"],
    "depth": 4,
    "confidence_markers": 2,
    "abstraction_level": "medium"
  }
}

Cognitive Move Taxonomy

A finite, observable alphabet: assumption · deduction · induction · abduction · analogy · contradiction · enumeration · compression · narrative_simulation
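A finite alphabet maps naturally onto a closed enum. This sketch mirrors the names above (the SDK's actual `CognitiveMove` type lives in `types.py` and may differ in detail):

```python
from enum import Enum

class CognitiveMove(Enum):
    """The finite, observable alphabet of reasoning moves."""
    ASSUMPTION = "assumption"
    DEDUCTION = "deduction"
    INDUCTION = "induction"
    ABDUCTION = "abduction"
    ANALOGY = "analogy"
    CONTRADICTION = "contradiction"
    ENUMERATION = "enumeration"
    COMPRESSION = "compression"
    NARRATIVE_SIMULATION = "narrative_simulation"

# Parsing a reasoning_trace validates every move against the alphabet;
# anything outside it is a protocol violation, not a new strategy.
moves = [CognitiveMove(m) for m in ["assumption", "deduction", "analogy"]]
```

A closed enum is what makes the metrics well-defined: entropy and repetition are only meaningful over a fixed alphabet.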

Mutation Operators

The controller doesn't say "reason better". It says:

  • BAN: "deduction is forbidden"
  • REQUIRE: "you must use analogy"
  • LIMIT_DEPTH: "max 2 reasoning steps"
  • FORCE_COMPRESSION: "reduce to 2 concepts"
  • INVERT_CAUSALITY: "reverse the causal direction"
  • REQUIRE_CONTRADICTION: "find an internal contradiction"

Improvisation emerges from constraint, not freedom — like jazz.
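One way to picture how an operator reaches the substrate: each mutation compiles into a formal, checkable directive rather than vague advice. A hypothetical rendering (the `Mutation`/`MutationType` names follow the examples elsewhere in this README; the templates and `render_constraints` are illustrative):

```python
from dataclasses import dataclass
from enum import Enum

class MutationType(Enum):
    BAN = "ban"
    REQUIRE = "require"
    LIMIT_DEPTH = "limit_depth"
    FORCE_COMPRESSION = "force_compression"

@dataclass
class Mutation:
    type: MutationType
    target: object   # a move name or a numeric bound

TEMPLATES = {
    MutationType.BAN: "The move '{t}' is forbidden this cycle.",
    MutationType.REQUIRE: "You must use the move '{t}' at least once.",
    MutationType.LIMIT_DEPTH: "Use at most {t} reasoning steps.",
    MutationType.FORCE_COMPRESSION: "Reduce your answer to {t} concepts.",
}

def render_constraints(mutations: list[Mutation]) -> str:
    # The substrate never hears "reason better": only formal rules
    # whose satisfaction can be checked from the reasoning trace.
    return "\n".join(TEMPLATES[m.type].format(t=m.target) for m in mutations)

print(render_constraints([Mutation(MutationType.BAN, "deduction"),
                          Mutation(MutationType.LIMIT_DEPTH, 2)]))
# The move 'deduction' is forbidden this cycle.
# Use at most 2 reasoning steps.
```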

Failure as First-Class Output

The system does not optimize for correct answers. Failure is informative:

  • Every collapsed trajectory is recorded
  • Every stall enriches the ledger
  • The system learns which cognitive spaces to avoid

Features

1. Reasoning Debugger

Put a breakpoint in thought. Step through the cognitive loop cycle by cycle, inspect mutations, understand why a strategy was banned, and rewind to any previous cognitive state.

from meta_reasoning import ReasoningDebugger

dbg = ReasoningDebugger(backend=my_backend, max_cycles=5)
dbg.add_breakpoint(lambda cycle, metrics, muts: metrics.entropy < 1.0)
result = dbg.run("Your task")

snap = dbg.rewind_to(2)
print(snap.explain())

2. Reasoning Policies as Code

Write cognitive governance rules in Python — not prompts. Versionable, testable, reviewable.

from meta_reasoning import ReasoningPolicy, PolicyRule, CognitiveEngine, strict_diversity_policy, Mutation, MutationType

# Use a built-in policy
engine = CognitiveEngine(backend=my_backend, policy=strict_diversity_policy())

# Or define your own
policy = ReasoningPolicy("my_policy")
policy.add_rule(PolicyRule(
    name="ban_dominant",
    condition=lambda m, c: m.dominant_move is not None,
    mutations=lambda m, c: [Mutation(type=MutationType.BAN, target=m.dominant_move)],
))

3. Model-Agnostic Benchmarks

Compare models not by accuracy, but by cognitive behavior: rigidity, diversity, constraint response, improvisation capacity.

from meta_reasoning import benchmark_models

result = benchmark_models(
    backends={"gpt-4o": gpt_backend, "claude": claude_backend},
    task="Your task",
)
print(result.comparison_table())

4. Cognitive Fingerprinting

Every model has a cognitive signature. Profile which strategies it prefers, which it avoids, how it reacts to pressure, and where it collapses.

from meta_reasoning import fingerprint_from_result

fp = fingerprint_from_result("gpt-4o", engine_result)
print(fp.summary())
# → Preferred moves: deduction, assumption
# → Stall rate: 15%
# → Collapse at cycles: [4, 7]

5. Failure Atlas

Instead of hiding failures: map them, visualize them, query them.

atlas = engine.ledger.failure_atlas()
atlas.by_reason()              # Group by failure cause
atlas.stall_inducing_mutations()  # Which mutations caused stall?
atlas.query(max_entropy=0.5)   # Find low-entropy failures

6. Deterministic Replay

Same task + same controller = reproducible, replayable, diffable reasoning. Enterprise-ready and CI-testable.

from meta_reasoning import record_session, ReplaySession

session = record_session(result, "task", max_cycles=5)
session.save("session_v1.json")

# Later: compare two sessions
s1 = ReplaySession.load("session_v1.json")
s2 = ReplaySession.load("session_v2.json")
print(s1.diff(s2))

7. Anti-Hallucination via Governance

Instead of filtering output after the fact, detect cognitive patterns that correlate with hallucination and break them before they produce output.

from meta_reasoning import assess_hallucination_risk

risk = assess_hallucination_risk(metrics, confidence_markers=4, depth=1)
# → score=0.60, triggers=["high_confidence_low_depth", "single_strategy"]
# → preventive_mutations=[BAN deduction, REQUIRE contradiction]
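The underlying heuristic is form-only: confidence without depth, or depth without diversity, is suspicious regardless of what the text says. A toy version of the idea (thresholds, weights, and the `toy_risk` name are invented for illustration, not the SDK's implementation):

```python
def toy_risk(confidence_markers: int, depth: int, distinct_moves: int):
    """Score hallucination risk from cognitive form alone (toy thresholds)."""
    triggers = []
    if confidence_markers >= 3 and depth <= 2:
        triggers.append("high_confidence_low_depth")
    if distinct_moves <= 1:
        triggers.append("single_strategy")
    return min(1.0, 0.3 * len(triggers)), triggers

score, triggers = toy_risk(confidence_markers=4, depth=1, distinct_moves=1)
print(score, triggers)   # → 0.6 ['high_confidence_low_depth', 'single_strategy']
```

Each trigger then maps to preventive mutations (ban the dominant move, require a contradiction) that break the pattern before output is produced.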

8. Mutation Plugins

An open ecosystem where the community can define new mutations, constraints, and metrics.

from meta_reasoning import PluginRegistry, MutationPlugin, CognitiveEngine, CognitiveMove, Mutation, MutationType

registry = PluginRegistry()
registry.register_mutation(MutationPlugin(
    name="force_narrative",
    description="Always require narrative_simulation",
    generate=lambda m, c: [Mutation(type=MutationType.REQUIRE, target=CognitiveMove.NARRATIVE_SIMULATION)],
))
engine = CognitiveEngine(backend=my_backend, plugin_registry=registry)

9. CI/CD for Reasoning

Automated cognitive regression testing. Detect when a model becomes more rigid, less diverse, or more prone to stall — in your CI pipeline.

from meta_reasoning import CognitiveCI, assert_min_entropy, assert_no_total_stall, assert_min_move_diversity

ci = CognitiveCI(backend=my_backend, max_cycles=5)
report = ci.run("Your task", [
    assert_min_entropy(1.0),
    assert_no_total_stall(),
    assert_min_move_diversity(3),
])
assert report.passed  # Fails CI if cognitive behavior regresses

10. Reasoning Runtime

A first-class runtime that treats reasoning as a computational process — not text. Explicit cognitive states (INITIAL → ANALYSIS → HYPOTHESIS → VALIDATION → REFLECTION → FINAL), typed transitions driven by metrics, budget management, and deterministic forking. The model does not choose to reflect. The runtime forces it.

from meta_reasoning import ReasoningRuntime, ReasoningBudget

rt = ReasoningRuntime(
    backend=my_backend,
    budget=ReasoningBudget(max_cycles=8, max_branches=4),
)
result = rt.run("Your task")

print(result.summary())
# Runtime: Your task
#   Final state:    final
#   States visited: initial → analysis → hypothesis → validation → reflection → final
#   Cycles:         6
#   Budget: 6/8 cycles, 0/4 branches
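The forced-transition idea can be illustrated with a toy state machine (hypothetical; the actual transition logic in `runtime.py` is richer, and the entropy threshold here is invented):

```python
from enum import Enum

class State(Enum):
    INITIAL = "initial"
    ANALYSIS = "analysis"
    HYPOTHESIS = "hypothesis"
    VALIDATION = "validation"
    REFLECTION = "reflection"
    FINAL = "final"

ORDER = list(State)  # definition order is the canonical progression

def next_state(state: State, entropy: float, cycles_left: int) -> State:
    """Metric-driven transition: the model never chooses to reflect.
    Low entropy during HYPOTHESIS forces REFLECTION; an exhausted
    budget forces FINAL."""
    if cycles_left <= 0:
        return State.FINAL
    if state is State.HYPOTHESIS and entropy < 1.0:
        return State.REFLECTION
    return ORDER[min(ORDER.index(state) + 1, len(ORDER) - 1)]

print(next_state(State.HYPOTHESIS, entropy=0.4, cycles_left=3).value)  # → reflection
```

The point is that transitions are typed and driven by measured form, so the same task under the same controller walks the same path.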

Installation

pip install meta-reasoning

Or from source with dev dependencies:

pip install -e ".[dev]"

Quick Start

Without an API key (mock backend)

python -m examples.mock_example

With OpenAI

export OPENAI_API_KEY=<your-key>
python -m examples.openai_example

Programmatic usage

from meta_reasoning import CognitiveEngine

class MyBackend:
    def generate(self, messages):
        # Call your LLM here, return {"content": "..."}
        ...

engine = CognitiveEngine(backend=MyBackend(), max_cycles=5)
result = engine.run("Your task here")

for cycle in result.cycles:
    print(f"Cycle {cycle.cycle}: {cycle.outcome}")
    print(f"  Moves: {[m.value for m in cycle.output.reasoning_trace.moves]}")
    print(f"  Entropy: {cycle.metrics.entropy:.2f}")

# Save the epistemic ledger for analysis
engine.ledger.save("session.json")
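To make the stub above actually runnable without an API key, `generate` just has to honor the Structured Output Protocol. For instance (a sketch; field names are taken from the protocol section, the class itself is illustrative):

```python
import random

class MockBackend:
    """Backend whose generate() satisfies the Structured Output
    Protocol without calling a real LLM."""
    MOVES = ["assumption", "deduction", "analogy", "contradiction"]

    def generate(self, messages):
        moves = random.sample(self.MOVES, k=2)
        return {
            "content": "mock answer",
            "reasoning_trace": {
                "moves": moves,
                "depth": len(moves),
                "confidence_markers": 0,
                "abstraction_level": "medium",
            },
        }

out = MockBackend().generate([{"role": "user", "content": "task"}])
```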

Running Tests

pip install -e ".[dev]"
pytest tests/ -v

Project Structure

meta_reasoning/
├── __init__.py        # Public API
├── types.py           # Cognitive moves, traces, mutations, metrics
├── substrate.py       # Level 1 — LLM interface
├── controller.py      # Level 2 — Cognitive Controller
├── ledger.py          # Level 3 — Epistemic Ledger + Failure Atlas
├── metrics.py         # Semantically-blind cognitive metrics
├── mutations.py       # Mutation operator generation
├── engine.py          # The governed cognitive loop
├── runtime.py         # Reasoning Runtime (state machine + budget + forking)
├── debugger.py        # Reasoning Debugger
├── policies.py        # Reasoning Policies as code
├── benchmark.py       # Benchmarking & Cognitive Fingerprinting
├── replay.py          # Deterministic Replay
├── hallucination.py   # Anti-Hallucination Governance
├── plugins.py         # Mutation Plugin system
└── ci.py              # CI/CD for Reasoning

Related Work & Philosophy

For a detailed comparison with Chain-of-Thought, Tree-of-Thoughts, Meta-Reasoning Prompting, Reflexion, Self-Refine, ReAct, and other approaches — including a comparative table — see the full Related Work page on the project website.

The short version: every existing approach keeps the LLM as the cognitive subject. We don't. The model is a substrate. The reasoning is governed from outside.

License

AGPL-3.0. See LICENSE for details.
