
Red Queen Gödel Machine — epoch-based utility evolution for self-improving systems

🧬 rqgm — Red Queen Gödel Machine

Co-Evolving Evaluators for Self-Improving AI Systems

First open implementation of arXiv 2606.26294 (Cambridge, June 2026)

Requires Python 3.10+. Zero dependencies.


🚨 The Problem: Every Self-Improving Agent Eventually Cheats

Every self-improvement loop has a hidden failure mode. The agent learns to satisfy the evaluator rather than genuinely improving. The moment the judge stops getting harder, the loop stalls and reward hacking creeps in.

You've seen this before:

  • RLHF reward hacking — models learn to produce plausible-sounding but vacuous text that scores well
  • Benchmark overfitting — agents memorize benchmark patterns instead of learning general capabilities
  • LLM-as-a-judge collapse — evaluator LLMs learn to prefer certain writing styles over correctness
  • Your own agent loops — Dreamer padding source lists, Pragma satisfying checklists without real quality

The structural answer: Co-evolve the agent AND its evaluator together, so the bar keeps rising as the agent climbs.


🧬 The Solution: RQGM

The Red Queen Gödel Machine (arXiv 2606.26294, Cambridge) introduces controlled utility evolution — the evaluator itself evolves at epoch boundaries, preventing reward hacking and keeping improvement loops honest.

Epoch 0 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05, 0.1])
  ├── Iteration 1: score 0.42
  ├── Iteration 2: score 0.51
  ├── Iteration 3: score 0.49
  ├── Iteration 4: score 0.53
  └── Iteration 5: score 0.55
       │
       └── Boundary check:
            ├── Hack ratio = 0.48 (strict/loose) → exploitation detected
            └── Drop loosest tolerance (0.1) → tighten evaluator

Epoch 1 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05])
  └── ... evaluator gets harder as agent improves
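
The boundary check in the diagram can be sketched in a few lines of plain Python. This is an illustration only, not rqgm's implementation: `hack_ratio` and `tighten` are hypothetical helpers, the ratio is assumed to be the mean strict score divided by the mean loose score, the diagram's iteration scores are reused as loose scores, and the strict scores are made up so that the ratio comes out at the 0.48 shown above.

```python
from statistics import mean


def hack_ratio(strict_scores, loose_scores):
    """Mean strict score over mean loose score (assumed definition)."""
    loose = mean(loose_scores)
    if loose == 0:
        return 1.0  # nothing passed even the loose check, so nothing to exploit
    return mean(strict_scores) / loose


def tighten(tolerances, ratio, threshold=0.6):
    """Drop the loosest tolerance when the ratio falls below the threshold."""
    if ratio < threshold and len(tolerances) > 1:
        return sorted(tolerances)[:-1]
    return list(tolerances)


tolerances = [0.0, 0.001, 0.01, 0.025, 0.05, 0.1]
loose = [0.42, 0.51, 0.49, 0.53, 0.55]   # iteration scores from the diagram
strict = [0.20, 0.25, 0.24, 0.26, 0.25]  # illustrative values, not from the paper

ratio = hack_ratio(strict, loose)
next_tolerances = tighten(tolerances, ratio)
print(round(ratio, 2), next_tolerances)
```

With these numbers the ratio falls below the 0.6 threshold, so the 0.1 tolerance is dropped and Epoch 1 starts with the shorter schedule.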

How It Works

| Concept | What It Means | Why It Matters |
| --- | --- | --- |
| Epoch | A fixed window of iterations with a frozen evaluator | Within-epoch guarantees hold; the agent can't game mid-epoch |
| Hack ratio | strict_score / loose_score; measures exploitation | Low ratio = agent gaming the evaluator |
| Utility evolution | Tolerances tighten when exploitation is detected | The bar rises as the agent climbs |
| Adversarial scoring | Penalises answers that game loose criteria | Prevents pattern-matching the evaluator |
| Selective erasure | Invalidates scores from old evaluators | Stale hacked scores don't survive the boundary |
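
Selective erasure is the least obvious row in the table, so here is a minimal sketch of the idea. It assumes each score is tagged with the epoch whose evaluator produced it and that stale scores are simply discarded; `ScoredResult`, `erase_stale`, and `best_valid` are hypothetical names, and the paper's mechanism may differ (for example, it could re-score old results instead of dropping them).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredResult:
    iteration: int
    epoch: int      # epoch whose evaluator produced this score
    score: float


def erase_stale(results, current_epoch):
    """Keep only results scored by the current epoch's evaluator."""
    return [r for r in results if r.epoch == current_epoch]


def best_valid(results, current_epoch):
    """Best result among those that survive erasure, or None."""
    valid = erase_stale(results, current_epoch)
    return max(valid, key=lambda r: r.score) if valid else None


archive = [
    ScoredResult(iteration=4, epoch=0, score=0.53),
    ScoredResult(iteration=5, epoch=0, score=0.55),  # scored by the looser evaluator
    ScoredResult(iteration=6, epoch=1, score=0.47),
]

# After the boundary, the 0.55 from epoch 0 no longer counts as the best score.
print(best_valid(archive, current_epoch=1))
```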

Key Results from the Paper

| Domain | Improvement |
| --- | --- |
| Coding benchmarks | 1.35x–1.72x fewer tokens than prior SOTA |
| Scientific writing | 1.78x–1.86x higher acceptance rates |
| Proof grading | 9% higher ground-truth accuracy |
| Paper reviewing | Corrects 1.91x over-acceptance of AI-generated papers |

⚡ Quick Start

Zero dependencies. Python stdlib only. Install in 3 seconds.

pip install rqgm

from rqgm import EpochManager, EpochConfig, TransitionReason

# Configure: 5 iterations per epoch, tighten if hack_ratio < 0.6
config = EpochConfig(epoch_size=5, exploitation_hack_ratio_threshold=0.6)
mgr = EpochManager(config)

# run_agent, evaluate_agent, evaluate_strict, evaluate_loose and update_evaluator
# are placeholders for your own code; they are not part of rqgm.
for i in range(20):
    # Your agent produces a result, you score it
    agent_result = run_agent()
    best_score = evaluate_agent(agent_result)
    strict_score = evaluate_strict(agent_result)
    loose_score = evaluate_loose(agent_result)

    mgr.record_iteration_result(i, best_score, strict_score, loose_score)

    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason != TransitionReason.NO_TRANSITION:
            print(f"⚠️  Epoch {mgr.epoch_index}: {transition.reason.name}")
            print(f"   Tolerances: {mgr.current_tolerances} → {transition.new_tolerances}")
            # Apply the new evaluator criteria
            update_evaluator(transition.new_tolerances)
        mgr.advance_epoch(transition)

[Figure: rqgm demo showing epoch boundaries, hack ratio detection, and tolerance tightening. The agent improves → starts gaming → evaluator tightens → adversarial scoring penalises gaming.]


🎯 Where to Use It

RQGM is a general-purpose primitive for any self-improvement loop. Here are real applications:

AI Agent Loops

# Detect when your agent is gaming the evaluator
# (agent_walks, the three scores and tighten_evaluation_criteria come from your own code)
mgr = EpochManager(EpochConfig(epoch_size=10))
for i, walk in enumerate(agent_walks):
    mgr.record_iteration_result(i, quality_score, strict_score, loose_score)
    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason == TransitionReason.EXPLOITATION_DETECTED:
            tighten_evaluation_criteria()  # Agent is gaming you
        mgr.advance_epoch(transition)

RLHF / Preference Learning

# Prevent reward model overfitting
# (human_preferences, model_scores, current_tolerances and params are supplied by your pipeline)
dist = ScoreDistribution(scores_at_strict=human_preferences, scores_at_loose=model_scores)
new_tols, log = evolve_tolerances(current_tolerances, dist, params, 0.6, 0.02, 0.02)
# new_tols drops the loosest criterion → reward model gets harder
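
To make "gaming" concrete: the Components table describes a gaming example as one with a high loose score and a low strict score. The sketch below flags such answers from two parallel score lists. `gaming_indices` and its `min_gap` parameter are hypothetical and are not claimed to match `ScoreDistribution.get_gaming_indices()`.

```python
def gaming_indices(strict_scores, loose_scores, min_gap=0.5):
    """Indices of answers whose loose score exceeds the strict score by min_gap or more."""
    if len(strict_scores) != len(loose_scores):
        raise ValueError("score lists must have the same length")
    return [
        i
        for i, (strict, loose) in enumerate(zip(strict_scores, loose_scores))
        if loose - strict >= min_gap
    ]


strict = [1.0, 0.0, 0.9, 0.1]
loose = [1.0, 1.0, 1.0, 0.3]

# Only answer 1 passes the loose check while failing the strict one.
flagged = gaming_indices(strict, loose)
print(flagged)
```

Answers flagged this way are natural candidates for an adversarial pool, since they show what the loose criteria let through.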

Benchmark Evaluation

# Detect benchmark overfitting
score = adversarial_score(
    question="What is 2+2?",
    predicted="approximately 4",  # gaming answer
    ground_truth="4",
    current_tolerances=[0.0, 0.1],
    adversarial_pool=gaming_examples,
)
# Returns 0.7 instead of 1.0 — penalised for gaming
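
One way such a penalty could work is to discount the base score when the answer closely resembles a known gaming example. The sketch below uses string similarity from the standard library. It is an assumption-laden illustration, not the library's `adversarial_score()`: the function names, the 0.8 similarity threshold, and the 0.3 penalty weight are all invented here, with the weight chosen only so that an exact pool match lands near the 0.7 mentioned above.

```python
from difflib import SequenceMatcher


def gaming_similarity(predicted, pool):
    """Highest similarity (0..1) between the answer and any pooled gaming example."""
    text = predicted.strip().lower()
    return max(
        (SequenceMatcher(None, text, example.strip().lower()).ratio() for example in pool),
        default=0.0,
    )


def penalised_score(predicted, base_score, pool, weight=0.3, threshold=0.8):
    """Scale base_score down when the answer resembles a gaming example."""
    similarity = gaming_similarity(predicted, pool)
    if similarity < threshold:
        return base_score  # not close to any known gaming pattern
    return base_score * (1.0 - weight * similarity)


pool = ["approximately 4", "roughly 4"]  # hypothetical adversarial pool

gamed = penalised_score("approximately 4", base_score=1.0, pool=pool)
honest = penalised_score("4", base_score=1.0, pool=pool)
print(round(gamed, 2), honest)
```

The hedged answer is discounted while the exact answer keeps its full score, which is the behaviour the example above describes.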

CI/CD Quality Gates

# Evolve test pass thresholds based on historical exploitation
if transition.reason == TransitionReason.STAGNATION:
    raise_quality_bar()  # Tests haven't caught a bug in N cycles
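
For the CI/CD case, "haven't caught a bug in N cycles" can be tracked with a simple counter over the pipeline history. This is a hypothetical sketch independent of rqgm: `history` is assumed to be a list of booleans, oldest first, where True means the cycle's tests caught at least one failure.

```python
def cycles_since_last_catch(history):
    """Number of most recent consecutive cycles in which no test failed."""
    count = 0
    for caught in reversed(history):
        if caught:
            break
        count += 1
    return count


def should_raise_bar(history, n_cycles=5):
    """True when the gate has gone n_cycles in a row without catching anything."""
    return cycles_since_last_catch(history) >= n_cycles


history = [True, False, False, False]
print(cycles_since_last_catch(history), should_raise_bar(history, n_cycles=3))
```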

📦 Components

| Module | Class / Function | Purpose |
| --- | --- | --- |
| epoch.py | EpochManager | Tracks iterations, detects boundaries, triggers transitions |
| epoch.py | EpochConfig | Configuration: epoch size, thresholds, mutation params |
| epoch.py | EpochTransition | What the runner should do after a boundary |
| epoch.py | AdversarialExample | A gaming example (high loose score, low strict score) |
| evolution.py | evolve_tolerances() | Pure function: given scores, returns new tolerance schedule |
| evolution.py | adversarial_score() | Scorer that penalises answers resembling gaming patterns |
| evolution.py | ScoreDistribution | Stats over a set of per-answer scores |
| evolution.py | UtilityEvolution | Applies mutations to evaluator config at boundaries |

📊 Tested

35 unit tests, all passing. Covers:

  • Epoch boundary detection
  • Tolerance tightening on exploitation
  • Tolerance relaxation on genuine improvement
  • Adversarial pool collection
  • Checkpoint serialisation round-trip
  • evolve_tolerances() pure function
  • adversarial_score() penalty computation
  • ScoreDistribution.get_gaming_indices()

Run the suite:

python3 -m tests.test_rqgm
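
As an illustration of what a checkpoint round-trip test checks, the sketch below serialises a small state object to JSON and restores it. `EpochState` and its fields are hypothetical stand-ins; the real checkpoint format used by rqgm is not shown on this page.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EpochState:
    """Hypothetical checkpoint payload, not rqgm's actual schema."""
    epoch_index: int = 0
    tolerances: list = field(default_factory=lambda: [0.0, 0.1])
    best_score: float = 0.0


def dumps(state):
    """Serialise the state to a JSON string."""
    return json.dumps(asdict(state), sort_keys=True)


def loads(payload):
    """Rebuild the state from a JSON string."""
    return EpochState(**json.loads(payload))


state = EpochState(epoch_index=1, tolerances=[0.0, 0.001, 0.01, 0.025, 0.05], best_score=0.55)
restored = loads(dumps(state))

# A round-trip test asserts that nothing is lost or altered on the way through.
assert restored == state
```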

🔧 Installation

pip install rqgm

Or from source:

git clone https://github.com/observeco/rqgm-core
cd rqgm-core
pip install -e .

Dependencies: Zero. Python stdlib only. No PyTorch, no transformers, no numpy.


📚 Reference

📖 Citation

If you use rqgm in your research, please cite the original paper:

@article{iacob2026redqueen,
      title={The Red Queen G{\"o}del Machine: Co-Evolving Agents and Their Evaluators},
      author={Iacob, Alex and Jovanovi{\'c}, Andrej and Shen, William F. and
              Burkhardt, Daniel and Kurmanji, Meghdad and Tastan, Nurbek and
              Sani, Lorenzo and Venanzi, Niccol{\`o} Alberto Elia and
              Odonnat, Ambroise and Cao, Zeyu and Marino, Bill and
              Qiu, Xinchi and Lane, Nicholas D.},
      year={2026},
      eprint={2606.26294},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

📄 License

Apache 2.0 — free for commercial and research use.


Built by ObserveCo
Self-healing observability for AI agents.
