Red Queen Gödel Machine — epoch-based utility evolution for self-improving systems

🧬 rqgm — Red Queen Gödel Machine

Co-Evolving Evaluators for Self-Improving AI Systems

First open implementation of arXiv 2606.26294 (Cambridge, June 2026)

Zero dependencies. Requires Python 3.10+.

🚨 The Problem: Every Self-Improving Agent Eventually Cheats

Every self-improvement loop has a hidden failure mode. The agent learns to satisfy the evaluator rather than genuinely improving. The moment the judge stops getting harder, the loop stalls and reward hacking creeps in.

You've seen this before:

RLHF reward hacking — models learn to produce plausible-sounding but vacuous text that scores well
Benchmark overfitting — agents memorize benchmark patterns instead of learning general capabilities
LLM-as-a-judge collapse — evaluator LLMs learn to prefer certain writing styles over correctness
Your own agent loops — Dreamer padding source lists, Pragma satisfying checklists without real quality

The structural answer: Co-evolve the agent AND its evaluator together, so the bar keeps rising as the agent climbs.

🧬 The Solution: RQGM

The Red Queen Gödel Machine (arXiv 2606.26294, Cambridge) introduces controlled utility evolution — the evaluator itself evolves at epoch boundaries, preventing reward hacking and keeping improvement loops honest.

Epoch 0 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05, 0.1])
  ├── Iteration 1: score 0.42
  ├── Iteration 2: score 0.51
  ├── Iteration 3: score 0.49
  ├── Iteration 4: score 0.53
  └── Iteration 5: score 0.55
       │
       └── Boundary check:
            ├── Hack ratio = 0.48 (strict/loose) → exploitation detected
            └── Drop loosest tolerance (0.1) → tighten evaluator

Epoch 1 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05])
  └── ... evaluator gets harder as agent improves
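
The boundary check in the diagram can be sketched in plain Python. This is an illustration rather than rqgm's own implementation: it assumes the hack ratio is the mean strict score divided by the mean loose score over the epoch, and the names `hack_ratio` and `tighten`, as well as the per-iteration strict scores, are made up for the example.

```python
from statistics import mean


def hack_ratio(strict_scores, loose_scores):
    """Mean strict score divided by mean loose score (assumed definition)."""
    loose = mean(loose_scores)
    if loose == 0:
        return 1.0  # nothing passed even loosely, so there is nothing to exploit
    return mean(strict_scores) / loose


def tighten(tolerances, ratio, threshold=0.6):
    """Drop the loosest tolerance when the ratio falls below the threshold."""
    if ratio < threshold and len(tolerances) > 1:
        return sorted(tolerances)[:-1]
    return list(tolerances)


tolerances = [0.0, 0.001, 0.01, 0.025, 0.05, 0.1]
# Hypothetical scores for one epoch: the loose scores reuse the diagram's
# iteration scores, the strict scores are invented so the ratio is 0.48.
strict = [0.20, 0.24, 0.22, 0.26, 0.28]
loose = [0.42, 0.51, 0.49, 0.53, 0.55]

ratio = hack_ratio(strict, loose)
new_tolerances = tighten(tolerances, ratio)
print(round(ratio, 2))
print(new_tolerances)
```

With these numbers the ratio is below the 0.6 threshold, so the 0.1 tolerance is dropped, matching the transition from Epoch 0 to Epoch 1 shown above.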

How It Works

Concept	What It Means	Why It Matters
Epoch	A fixed window of iterations with a frozen evaluator	Within-epoch guarantees hold; the agent can't game mid-epoch
Hack ratio	`strict_score / loose_score` — measures exploitation	Low ratio = agent gaming the evaluator
Utility evolution	Tolerances tighten when exploitation detected	The bar rises as the agent climbs
Adversarial scoring	Penalises answers that game loose criteria	Prevents pattern-matching the evaluator
Selective erasure	Invalidates scores from old evaluators	Stale hacked scores don't survive the boundary
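
Selective erasure is easiest to see with a small example. The sketch below is one possible reading of the idea, not the library's data model: each score is tagged with the epoch whose evaluator produced it, and only scores from the current epoch stay eligible. `ScoredResult`, `erase_stale` and `best` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredResult:
    candidate: str
    score: float
    epoch: int  # index of the evaluator epoch that produced the score


def erase_stale(results, current_epoch):
    """Keep only results scored by the current evaluator."""
    return [r for r in results if r.epoch == current_epoch]


def best(results):
    """Return the highest-scoring result, or None if the list is empty."""
    return max(results, key=lambda r: r.score, default=None)


history = [
    ScoredResult("padded-answer", 0.91, epoch=0),  # scored by the looser evaluator
    ScoredResult("honest-answer", 0.62, epoch=1),
    ScoredResult("honest-answer-v2", 0.66, epoch=1),
]

valid = erase_stale(history, current_epoch=1)
print(best(history).candidate)  # the stale epoch-0 score would win without erasure
print(best(valid).candidate)    # only epoch-1 scores compete after erasure
```

Without erasure, a score earned under a looser evaluator would keep outranking everything scored under the tightened one.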

Key Results from the Paper

Domain	Improvement
Coding benchmarks	1.35x–1.72x fewer tokens than prior SOTA
Scientific writing	1.78x–1.86x higher acceptance rates
Proof grading	9% higher ground-truth accuracy
Paper reviewing	Corrects 1.91x over-acceptance of AI-generated papers

⚡ Quick Start

Zero dependencies. Python stdlib only. Install in 3 seconds.

pip install rqgm

from rqgm import EpochManager, EpochConfig, TransitionReason

# Configure: 5 iterations per epoch, tighten if hack_ratio < 0.6
config = EpochConfig(epoch_size=5, exploitation_hack_ratio_threshold=0.6)
mgr = EpochManager(config)

# run_agent, evaluate_agent, evaluate_strict, evaluate_loose and update_evaluator
# are placeholders for your own code; they are not imported from rqgm.
for i in range(20):
    # Your agent produces a result, and you score it
    agent_result = run_agent()
    best_score = evaluate_agent(agent_result)
    strict_score = evaluate_strict(agent_result)
    loose_score = evaluate_loose(agent_result)

    mgr.record_iteration_result(i, best_score, strict_score, loose_score)

    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason != TransitionReason.NO_TRANSITION:
            print(f"⚠️  Epoch {mgr.epoch_index}: {transition.reason.name}")
            print(f"   Tolerances: {mgr.current_tolerances} → {transition.new_tolerances}")
            # Apply the new evaluator criteria
            update_evaluator(transition.new_tolerances)
        mgr.advance_epoch(transition)
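
The Quick Start leaves `evaluate_strict` and `evaluate_loose` to you. As a rough sketch of what they could look like, the example below assumes the tolerances are relative-error bounds on numeric answers, which the README does not state. The signatures differ from the one-argument calls above and are purely illustrative.

```python
def within_tolerance(predicted, truth, tol):
    """True if the relative error is at most tol (exact match when tol is 0)."""
    if truth == 0:
        return abs(predicted) <= tol
    return abs(predicted - truth) / abs(truth) <= tol


def accepted_fraction(predictions, truths, tol):
    """Fraction of predictions accepted at a single tolerance."""
    hits = sum(within_tolerance(p, t, tol) for p, t in zip(predictions, truths))
    return hits / len(truths)


def evaluate_strict(predictions, truths, tolerances):
    """Score under the tightest tolerance in the schedule."""
    return accepted_fraction(predictions, truths, min(tolerances))


def evaluate_loose(predictions, truths, tolerances):
    """Score under the loosest tolerance in the schedule."""
    return accepted_fraction(predictions, truths, max(tolerances))


tolerances = [0.0, 0.001, 0.01, 0.025, 0.05, 0.1]
predictions = [4.0, 10.5, 99.0, 7.0]  # made-up agent answers
truths = [4.0, 10.0, 100.0, 8.0]

strict_score = evaluate_strict(predictions, truths, tolerances)
loose_score = evaluate_loose(predictions, truths, tolerances)
print(strict_score, loose_score)
```

Here only one answer is exact while three fall within 10%, so the gap between the two scores is the kind of signal a strict/loose hack ratio is meant to pick up.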

[Figure: rqgm demo showing epoch boundaries, hack ratio detection, and tolerance tightening. Caption: agent improves → starts gaming → evaluator tightens → adversarial scoring penalises gaming]

🎯 Where to Use It

RQGM is a general-purpose primitive for any self-improvement loop. The snippets below sketch four places it fits; helper names that are not imported from rqgm stand in for your own code.

AI Agent Loops

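from rqgm import EpochManager, EpochConfig, TransitionReason

# Detect when your agent is gaming the evaluator
mgr = EpochManager(EpochConfig(epoch_size=10))
for i, walk in enumerate(agent_walks):
    # quality_score, strict_score and loose_score come from your own scoring of `walk`
    mgr.record_iteration_result(i, quality_score, strict_score, loose_score)
    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason == TransitionReason.EXPLOITATION_DETECTED:
            tighten_evaluation_criteria()  # Agent is gaming you
        mgr.advance_epoch(transition)  # as in the Quick Start loop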

RLHF / Preference Learning

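# Prevent reward model overfitting
# Import path assumed from the Components table (evolution.py); adjust if your install differs.
from rqgm.evolution import ScoreDistribution, evolve_tolerances

# human_preferences, model_scores, current_tolerances and params come from your own setup.
dist = ScoreDistribution(scores_at_strict=human_preferences, scores_at_loose=model_scores)
new_tols, log = evolve_tolerances(current_tolerances, dist, params, 0.6, 0.02, 0.02)
# On exploitation, new_tols drops the loosest criterion, so the reward model gets harder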

Benchmark Evaluation

# Detect benchmark overfitting
score = adversarial_score(
    question="What is 2+2?",
    predicted="approximately 4",  # gaming answer
    ground_truth="4",
    current_tolerances=[0.0, 0.1],
    adversarial_pool=gaming_examples,
)
# Returns 0.7 instead of 1.0 — penalised for gaming
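
To make the idea of penalising answers that resemble gaming patterns concrete, here is a stand-alone sketch. It is not how `adversarial_score()` is implemented, which this README does not describe: it swaps in string similarity from the standard library's `difflib`, and the `threshold` and `penalty` values are arbitrary, chosen only to echo the 0.7 in the snippet above.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def penalised_score(predicted, base_score, gaming_pool, threshold=0.8, penalty=0.3):
    """Subtract a fixed penalty when the answer closely matches a known gaming answer."""
    closest = max((similarity(predicted, g) for g in gaming_pool), default=0.0)
    if closest >= threshold:
        return max(0.0, base_score - penalty)
    return base_score


# Hypothetical pool of answers that previously passed only the loose criteria.
gaming_pool = ["approximately 4", "roughly 4", "about 4 or so"]

hedged = penalised_score("approximately 4", 1.0, gaming_pool)
exact = penalised_score("4", 1.0, gaming_pool)
print(round(hedged, 2), exact)
```

A hedged answer that matches the pool loses part of its score, while the exact answer "4" is left untouched because it is not similar enough to any pooled example.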

CI/CD Quality Gates

# Evolve test pass thresholds based on historical exploitation
if transition.reason == TransitionReason.STAGNATION:
    raise_quality_bar()  # Tests haven't caught a bug in N cycles
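
How stagnation is detected is not spelled out here. One simple way to approximate it, shown purely as an assumption, is to check whether the best score has improved by less than a minimum gain over the last few epochs. `is_stagnant`, `window` and `min_gain` are hypothetical names.

```python
def is_stagnant(best_scores, window=3, min_gain=0.02):
    """True if the best score gained less than min_gain over the last `window` epochs."""
    if len(best_scores) <= window:
        return False  # not enough history to compare against
    recent_best = max(best_scores[-window:])
    earlier_best = max(best_scores[:-window])
    return recent_best - earlier_best < min_gain


plateau = [0.42, 0.55, 0.61, 0.615, 0.61, 0.62]   # made-up per-epoch best scores
climbing = [0.42, 0.55, 0.61, 0.70, 0.74, 0.80]

print(is_stagnant(plateau))
print(is_stagnant(climbing))
```

In a CI/CD gate, a check like this could run once per cycle and trigger the threshold change only when the plateau persists.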

📦 Components

Module	Class / Function	Purpose
`epoch.py`	`EpochManager`	Tracks iterations, detects boundaries, triggers transitions
`epoch.py`	`EpochConfig`	Configuration: epoch size, thresholds, mutation params
`epoch.py`	`EpochTransition`	What the runner should do after a boundary
`epoch.py`	`AdversarialExample`	A gaming example (high loose score, low strict score)
`evolution.py`	`evolve_tolerances()`	Pure function: given scores, returns new tolerance schedule
`evolution.py`	`adversarial_score()`	Scorer that penalises answers resembling gaming patterns
`evolution.py`	`ScoreDistribution`	Stats over a set of per-answer scores
`evolution.py`	`UtilityEvolution`	Applies mutations to evaluator config at boundaries
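
The table describes a gaming example as one with a high loose score and a low strict score. A minimal stand-alone sketch of picking such answers out of per-answer scores follows. It is not the library's `ScoreDistribution.get_gaming_indices()`, and the cut-offs `loose_min` and `strict_max` are assumptions.

```python
def gaming_indices(strict_scores, loose_scores, loose_min=0.8, strict_max=0.2):
    """Indices of answers that score high loosely but low strictly."""
    if len(strict_scores) != len(loose_scores):
        raise ValueError("score lists must have the same length")
    return [
        i
        for i, (s, l) in enumerate(zip(strict_scores, loose_scores))
        if l >= loose_min and s <= strict_max
    ]


# Made-up per-answer scores for five answers.
strict = [1.0, 0.0, 0.1, 0.9, 0.0]
loose = [1.0, 0.9, 1.0, 0.9, 0.3]

flagged = gaming_indices(strict, loose)
print(flagged)  # [1, 2]
```

Answers 1 and 2 pass the loose criteria while failing the strict ones, so they are the candidates for an adversarial pool. Answer 4 fails both and is simply wrong rather than gaming.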

📊 Tested

35 unit tests, all passing. Covers:

Epoch boundary detection
Tolerance tightening on exploitation
Tolerance relaxation on genuine improvement
Adversarial pool collection
Checkpoint serialisation round-trip
evolve_tolerances() pure function
adversarial_score() penalty computation
ScoreDistribution.get_gaming_indices()

python3 -m tests.test_rqgm

🔧 Installation

pip install rqgm

Or from source:

git clone https://github.com/observeco/rqgm-core
cd rqgm-core
pip install -e .

Dependencies: Zero. Python stdlib only. No PyTorch, no transformers, no numpy.

📚 Reference

The Red Queen Gödel Machine: Co-Evolving Agents and Their Evaluators — Iacob et al., University of Cambridge, June 2026
EvoSkill-RQGM — Full integration with EvoSkill's self-improvement loop (first open RQGM implementation)

📖 Citation

If you use rqgm in your research, please cite the original paper:

@article{iacob2026redqueen,
      title={The Red Queen G{\"o}del Machine: Co-Evolving Agents and Their Evaluators},
      author={Iacob, Alex and Jovanovi{\'c}, Andrej and Shen, William F. and
              Burkhardt, Daniel and Kurmanji, Meghdad and Tastan, Nurbek and
              Sani, Lorenzo and Venanzi, Niccol{\`o} Alberto Elia and
              Odonnat, Ambroise and Cao, Zeyu and Marino, Bill and
              Qiu, Xinchi and Lane, Nicholas D.},
      year={2026},
      eprint={2606.26294},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

📄 License

Apache 2.0 — free for commercial and research use.

Built by ObserveCo
Self-healing observability for AI agents.


Version 0.1.0, released Jun 29, 2026
