Red Queen Gödel Machine — epoch-based utility evolution for self-improving systems
🧬 rqgm — Red Queen Gödel Machine
Co-Evolving Evaluators for Self-Improving AI Systems
First open implementation of arXiv 2606.26294 (Cambridge, June 2026)
🚨 The Problem: Every Self-Improving Agent Eventually Cheats
Every self-improvement loop has a hidden failure mode. The agent learns to satisfy the evaluator rather than genuinely improving. The moment the judge stops getting harder, the loop stalls and reward hacking creeps in.
You've seen this before:
- RLHF reward hacking — models learn to produce plausible-sounding but vacuous text that scores well
- Benchmark overfitting — agents memorize benchmark patterns instead of learning general capabilities
- LLM-as-a-judge collapse — evaluator LLMs learn to prefer certain writing styles over correctness
- Your own agent loops — Dreamer padding source lists, Pragma satisfying checklists without real quality
The structural answer: Co-evolve the agent AND its evaluator together, so the bar keeps rising as the agent climbs.
🧬 The Solution: RQGM
The Red Queen Gödel Machine (arXiv 2606.26294, Cambridge) introduces controlled utility evolution — the evaluator itself evolves at epoch boundaries, preventing reward hacking and keeping improvement loops honest.
Epoch 0 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05, 0.1])
  ├── Iteration 1: score 0.42
  ├── Iteration 2: score 0.51
  ├── Iteration 3: score 0.49
  ├── Iteration 4: score 0.53
  └── Iteration 5: score 0.55
        │
        └── Boundary check:
              ├── Hack ratio = 0.48 (strict/loose) → exploitation detected
              └── Drop loosest tolerance (0.1) → tighten evaluator
Epoch 1 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05])
  └── ... evaluator gets harder as agent improves
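The boundary check in the diagram can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the package's implementation: `hack_ratio` and `boundary_check` are hypothetical helpers, the ratio is assumed to be mean strict score over mean loose score, and the score lists are made-up values chosen so the ratio comes out at 0.48.

```python
from statistics import mean


def hack_ratio(strict_scores, loose_scores):
    """Mean strict score divided by mean loose score (assumed definition)."""
    loose_mean = mean(loose_scores)
    if loose_mean == 0:
        return 1.0  # nothing scored under the loose criterion, so nothing to exploit
    return mean(strict_scores) / loose_mean


def boundary_check(tolerances, strict_scores, loose_scores, threshold=0.6):
    """Return (next_tolerances, ratio) for the next epoch."""
    ratio = hack_ratio(strict_scores, loose_scores)
    if ratio < threshold and len(tolerances) > 1:
        # Exploitation suspected: drop the loosest (largest) tolerance.
        return sorted(tolerances)[:-1], ratio
    return list(tolerances), ratio


# Illustrative data only: loose scores mirror the diagram, strict scores are invented.
tolerances = [0.0, 0.001, 0.01, 0.025, 0.05, 0.1]
strict = [0.20, 0.24, 0.25, 0.26, 0.25]
loose = [0.42, 0.51, 0.49, 0.53, 0.55]

new_tolerances, ratio = boundary_check(tolerances, strict, loose)
print(round(ratio, 2), new_tolerances)
```

With a threshold of 0.6, a ratio near 0.48 triggers tightening, so the schedule loses its 0.1 entry, matching the transition from Epoch 0 to Epoch 1 shown above.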
How It Works
| Concept | What It Means | Why It Matters |
|---|---|---|
| Epoch | A fixed window of iterations with a frozen evaluator | Within-epoch guarantees hold; the agent can't game mid-epoch |
| Hack ratio | strict_score / loose_score; measures exploitation | Low ratio = agent gaming the evaluator |
| Utility evolution | Tolerances tighten when exploitation detected | The bar rises as the agent climbs |
| Adversarial scoring | Penalises answers that game loose criteria | Prevents pattern-matching the evaluator |
| Selective erasure | Invalidates scores from old evaluators | Stale hacked scores don't survive the boundary |
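Selective erasure is the least obvious row in the table, so here is a minimal sketch of the idea, assuming each stored score is tagged with the epoch of the evaluator that produced it. `ScoredCandidate` and `erase_stale` are hypothetical names for illustration; the paper and the package may erase more selectively than this.

```python
from dataclasses import dataclass


@dataclass
class ScoredCandidate:
    name: str
    score: float
    epoch: int  # epoch of the evaluator that produced this score


def erase_stale(candidates, current_epoch):
    """Split candidates into (valid, stale) by the evaluator epoch of their score."""
    valid = [c for c in candidates if c.epoch == current_epoch]
    stale = [c for c in candidates if c.epoch != current_epoch]
    return valid, stale


archive = [
    ScoredCandidate("a", 0.55, epoch=0),  # scored by the old, looser evaluator
    ScoredCandidate("b", 0.41, epoch=1),  # scored by the current evaluator
]
valid, stale = erase_stale(archive, current_epoch=1)
best = max(valid, key=lambda c: c.score)
print(best.name, [c.name for c in stale])
```

Candidate "a" has the higher raw score, but it was earned under the looser evaluator, so it no longer competes; stale candidates would need re-scoring under the current evaluator before they count again.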
Key Results from the Paper
| Domain | Improvement |
|---|---|
| Coding benchmarks | 1.35x–1.72x fewer tokens than prior SOTA |
| Scientific writing | 1.78x–1.86x higher acceptance rates |
| Proof grading | 9% higher ground-truth accuracy |
| Paper reviewing | Corrects 1.91x over-acceptance of AI-generated papers |
⚡ Quick Start
Zero dependencies. Python stdlib only. Install in 3 seconds.
pip install rqgm
from rqgm import EpochManager, EpochConfig, TransitionReason

# Configure: 5 iterations per epoch, tighten if hack_ratio < 0.6
config = EpochConfig(epoch_size=5, exploitation_hack_ratio_threshold=0.6)
mgr = EpochManager(config)

# run_agent, evaluate_agent, evaluate_strict, evaluate_loose and
# update_evaluator are placeholders for your own functions.
for i in range(20):
    # Your agent produces a result, you score it
    agent_result = run_agent()
    best_score = evaluate_agent(agent_result)
    strict_score = evaluate_strict(agent_result)
    loose_score = evaluate_loose(agent_result)
    mgr.record_iteration_result(i, best_score, strict_score, loose_score)

    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason != TransitionReason.NO_TRANSITION:
            print(f"⚠️ Epoch {mgr.epoch_index}: {transition.reason.name}")
            print(f" Tolerances: {mgr.current_tolerances} → {transition.new_tolerances}")
            # Apply the new evaluator criteria
            update_evaluator(transition.new_tolerances)
            mgr.advance_epoch(transition)
Demo: agent improves → starts gaming → evaluator tightens → adversarial scoring penalises gaming
🎯 Where to Use It
RQGM is a general-purpose primitive for any self-improvement loop. Here are real applications:
AI Agent Loops
# Detect when your agent is gaming the evaluator
mgr = EpochManager(EpochConfig(epoch_size=10))
for i, walk in enumerate(agent_walks):
    # quality_score, strict_score and loose_score are your own scores for this walk
    mgr.record_iteration_result(i, quality_score, strict_score, loose_score)
    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason == TransitionReason.EXPLOITATION_DETECTED:
            tighten_evaluation_criteria()  # Agent is gaming you
RLHF / Preference Learning
# Prevent reward model overfitting
dist = ScoreDistribution(scores_at_strict=human_preferences, scores_at_loose=model_scores)
new_tols, log = evolve_tolerances(current_tolerances, dist, params, 0.6, 0.02, 0.02)
# new_tols drops the loosest criterion → reward model gets harder
Benchmark Evaluation
# Detect benchmark overfitting
score = adversarial_score(
    question="What is 2+2?",
    predicted="approximately 4",  # gaming answer
    ground_truth="4",
    current_tolerances=[0.0, 0.1],
    adversarial_pool=gaming_examples,
)
# Returns 0.7 instead of 1.0 — penalised for gaming
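To make the penalty idea concrete, the sketch below shows one way a scorer could discount answers that resemble known gaming patterns. It is not the package's `adversarial_score`: `similarity` and `penalised_score` are hypothetical helpers, string similarity via the standard library's `difflib.SequenceMatcher` is a swapped-in stand-in, and the 0.3 penalty and 0.8 threshold are assumed values.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] using difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def penalised_score(base_score, predicted, gaming_examples,
                    penalty=0.3, threshold=0.8):
    """Subtract a fixed penalty when the answer closely resembles a gaming example."""
    resembles_gaming = any(
        similarity(predicted, example) >= threshold for example in gaming_examples
    )
    if resembles_gaming:
        return max(0.0, base_score - penalty)
    return base_score


# Hypothetical pool of answers that scored well under loose criteria only.
gaming_pool = ["approximately 4", "roughly 4"]

hedged = penalised_score(1.0, "approximately 4", gaming_pool)
exact = penalised_score(1.0, "4", gaming_pool)
print(hedged, exact)
```

The hedged answer matches a pool entry and loses the penalty, while the exact answer is too dissimilar to any pool entry and keeps its full score.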
CI/CD Quality Gates
# Evolve test pass thresholds based on historical exploitation
if transition.reason == TransitionReason.STAGNATION:
    raise_quality_bar()  # Tests haven't caught a bug in N cycles
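A stagnation check like the one this snippet reacts to could look roughly as follows. This is a sketch under assumptions, not the package's logic: `is_stagnant` is a hypothetical helper, and the window of 5 and minimum improvement of 0.02 are illustrative defaults.

```python
def is_stagnant(best_scores, window=5, min_improvement=0.02):
    """True when the best score gained less than min_improvement over the last `window` runs."""
    if len(best_scores) <= window:
        return False  # not enough history to compare against
    recent_best = max(best_scores[-window:])
    earlier_best = max(best_scores[:-window])
    return recent_best - earlier_best < min_improvement


flat_history = [0.50, 0.70, 0.70, 0.71, 0.70, 0.71, 0.70]
rising_history = [0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]

print(is_stagnant(flat_history), is_stagnant(rising_history))
```

In a CI setting, `best_scores` might be the per-cycle pass rate or bug-catch rate; a flat history is the signal to raise the bar.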
📦 Components
| Module | Class / Function | Purpose |
|---|---|---|
| epoch.py | EpochManager | Tracks iterations, detects boundaries, triggers transitions |
| epoch.py | EpochConfig | Configuration: epoch size, thresholds, mutation params |
| epoch.py | EpochTransition | What the runner should do after a boundary |
| epoch.py | AdversarialExample | A gaming example (high loose score, low strict score) |
| evolution.py | evolve_tolerances() | Pure function: given scores, returns new tolerance schedule |
| evolution.py | adversarial_score() | Scorer that penalises answers resembling gaming patterns |
| evolution.py | ScoreDistribution | Stats over a set of per-answer scores |
| evolution.py | UtilityEvolution | Applies mutations to evaluator config at boundaries |
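The table describes a gaming example as an answer with a high loose score and a low strict score. A minimal sketch of picking such answers out of paired score lists is shown below; `gaming_indices` is a hypothetical helper and both cut-offs are assumed values, so the package's own selection rule may differ.

```python
def gaming_indices(strict_scores, loose_scores, loose_min=0.8, strict_max=0.3):
    """Indices of answers that pass the loose criterion but fail the strict one."""
    return [
        i
        for i, (strict, loose) in enumerate(zip(strict_scores, loose_scores))
        if loose >= loose_min and strict <= strict_max
    ]


# Illustrative per-answer scores.
strict = [0.90, 0.10, 0.20, 0.80]
loose = [0.90, 0.95, 0.50, 0.85]

flagged = gaming_indices(strict, loose)
print(flagged)
```

Only the second answer is flagged: it looks good under the loose criterion and poor under the strict one, which is the pattern an adversarial pool would collect.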
📊 Tested
35 unit tests, all passing. Covers:
- Epoch boundary detection
- Tolerance tightening on exploitation
- Tolerance relaxation on genuine improvement
- Adversarial pool collection
- Checkpoint serialisation round-trip
- evolve_tolerances() pure function
- adversarial_score() penalty computation
- ScoreDistribution.get_gaming_indices()
python3 -m tests.test_rqgm
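For readers unfamiliar with the checkpoint round-trip item in the list above, the pattern can be sketched with the standard library alone. `EpochState`, `to_json` and `from_json` are hypothetical names for illustration; the package's actual checkpoint format is not documented here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EpochState:
    epoch_index: int
    tolerances: list
    best_score: float


def to_json(state: EpochState) -> str:
    """Serialise the state to a JSON string."""
    return json.dumps(asdict(state))


def from_json(text: str) -> EpochState:
    """Rebuild the state from a JSON string produced by to_json."""
    return EpochState(**json.loads(text))


original = EpochState(epoch_index=1, tolerances=[0.0, 0.001, 0.01], best_score=0.55)
restored = from_json(to_json(original))
print(restored == original)
```

A round-trip test asserts that the restored object equals the original, which catches fields that are silently dropped or retyped during serialisation.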
🔧 Installation
pip install rqgm
Or from source:
git clone https://github.com/observeco/rqgm-core
cd rqgm-core
pip install -e .
Dependencies: Zero. Python stdlib only. No PyTorch, no transformers, no numpy.
📚 Reference
- The Red Queen Gödel Machine: Co-Evolving Agents and Their Evaluators — Iacob et al., University of Cambridge, June 2026
- EvoSkill-RQGM — Full integration with EvoSkill's self-improvement loop (first open RQGM implementation)
📖 Citation
If you use rqgm in your research, please cite the original paper:
@article{iacob2026redqueen,
  title={The Red Queen G{\"o}del Machine: Co-Evolving Agents and Their Evaluators},
  author={Iacob, Alex and Jovanovi{\'c}, Andrej and Shen, William F. and
          Burkhardt, Daniel and Kurmanji, Meghdad and Tastan, Nurbek and
          Sani, Lorenzo and Venanzi, Niccol{\`o} Alberto Elia and
          Odonnat, Ambroise and Cao, Zeyu and Marino, Bill and
          Qiu, Xinchi and Lane, Nicholas D.},
  year={2026},
  eprint={2606.26294},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
📄 License
Apache 2.0 — free for commercial and research use.
Built by ObserveCo
Self-healing observability for AI agents.