A safety-first self-improvement loop for AI agents: execute → track → analyze → auto-apply → auto-rollback on regression.

self-improving-loop

Most "self-improving agent" projects stop at "log the failures, let the next run read the log". That's a methodology, not a loop. This package is the loop, as 1,170 lines of pure-stdlib Python — no framework lock-in, no LLM dependency, no cloud.

Wrap any function, get:

  • 📊 Automatic execution tracking (success rate, latency, rolling window)
  • 🧠 Adaptive thresholds per agent profile (high-freq / mid-freq / low-freq / critical)
  • 🛠 Auto-apply improvement configs when failure pattern detected
  • 🛡 Auto-rollback when the new config regresses (>10% success-rate drop, >20% latency increase, or 5 consecutive failures)
  • 📬 Pluggable notifier (stub by default — swap in Telegram / Slack / whatever)

Extracted from TaijiOS, where it survived a 346-heartbeat Ising physics experiment and production-scale agent workloads.


Install

pip install self-improving-loop

Zero required dependencies. The package uses only standard-library modules: datetime, json, pathlib, typing.


30-second example

from self_improving_loop import SelfImprovingLoop

loop = SelfImprovingLoop()

def my_agent_work():
    # Your actual agent call / LLM chain / tool invocation
    return {"status": "ok", "data": ...}

result = loop.execute_with_improvement(
    agent_id="my-agent",
    task="handle user query",
    execute_fn=my_agent_work,
)

if result["improvement_triggered"]:
    print(f"Applied {result['improvement_applied']} config tweaks")

if result["rollback_executed"]:
    print(f"Rolled back because: {result['rollback_executed']['reason']}")

That's it. The loop silently watches every execution, decides when to tune, and undoes tunings that made things worse.
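The "watching" part can be pictured as a rolling window of recent executions. The sketch below is an illustration only: `RollingTracker` and its methods are hypothetical names, not the package's API, and the window size is an arbitrary choice for the example.

```python
from collections import deque
from statistics import mean


class RollingTracker:
    """Hypothetical stand-in for an execution tracker; not the package's real API."""

    def __init__(self, window=50):
        # A deque with maxlen drops the oldest record once the window is full.
        self.records = deque(maxlen=window)

    def record(self, success, latency_ms):
        self.records.append((bool(success), float(latency_ms)))

    def success_rate(self):
        if not self.records:
            return None
        return sum(ok for ok, _ in self.records) / len(self.records)

    def avg_latency_ms(self):
        if not self.records:
            return None
        return mean(lat for _, lat in self.records)

    def consecutive_failures(self):
        # Count failures at the tail of the window, newest first.
        count = 0
        for ok, _ in reversed(self.records):
            if ok:
                break
            count += 1
        return count


tracker = RollingTracker(window=4)
for ok, latency in [(True, 100), (True, 120), (False, 300), (False, 280), (False, 310)]:
    tracker.record(ok, latency)

print(tracker.success_rate(), tracker.avg_latency_ms(), tracker.consecutive_failures())
```

With a window of 4, the oldest of the five records falls out, so the statistics reflect only the most recent behaviour. That is the property a regression check needs.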


Why this exists

Most agents have this failure mode:

  1. You ship an agent.
  2. It works for a week.
  3. Something upstream changes (rate limits, schema drift, a new edge case).
  4. Your agent starts failing.
  5. You find out three days later from angry users.
  6. You tweak a config, hope for the best, ship it.
  7. The tweak makes another scenario worse.
  8. You roll it back manually, losing the original learning.

self-improving-loop turns steps 3–8 into a tight feedback loop that runs inside your process, without needing observability infra, Kubernetes, or a dedicated ML team.


Adaptive thresholds (no magic numbers)

Different agents have different "pulse rates". A critical alerting agent should reconsider after 1 failure; a batch classifier can tolerate 5 before triggering analysis. The library classifies agents by execution frequency and adjusts:

| Agent profile | Failure trigger | Analysis window | Cooldown |
|---|---|---|---|
| High-frequency (>100/day) | 5 failures | 48h | 3h |
| Medium-frequency (10-100/day) | 3 failures | 24h | 6h |
| Low-frequency (<10/day) | 2 failures | 72h | 12h |
| Critical (user-marked) | 1 failure | 24h | 6h |

Or bypass the classifier and set manually:

from self_improving_loop import AdaptiveThreshold

adaptive = AdaptiveThreshold()
adaptive.set_manual_threshold(
    "critical-agent",
    failure_threshold=1,
    analysis_window_hours=12,
    cooldown_hours=1,
    is_critical=True,
)
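For intuition, the profile table above can be read as a lookup keyed on execution frequency. This is a minimal sketch, not the library's classifier: `Thresholds`, `PROFILES`, and `pick_profile` are hypothetical names, and the handling of the exact boundaries (10 and 100 executions per day) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    failure_threshold: int
    analysis_window_hours: int
    cooldown_hours: int


# Values copied from the profile table in this README.
PROFILES = {
    "high": Thresholds(5, 48, 3),
    "medium": Thresholds(3, 24, 6),
    "low": Thresholds(2, 72, 12),
    "critical": Thresholds(1, 24, 6),
}


def pick_profile(executions_per_day, is_critical=False):
    """Hypothetical classifier; treatment of exactly 10 and 100 per day is a guess."""
    if is_critical:
        return "critical"  # a user-set flag wins over frequency
    if executions_per_day > 100:
        return "high"
    if executions_per_day >= 10:
        return "medium"
    return "low"


print(PROFILES[pick_profile(250)])
print(PROFILES[pick_profile(3, is_critical=True)])
```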

Auto-rollback (the safety net)

When a config change ships, the loop keeps watching. It rolls back if any of these become true:

  • Success rate drops >10%
  • Average latency increases >20%
  • ≥5 consecutive failures after the change

# See recent rollbacks
rollback_history = loop.auto_rollback.get_rollback_history("my-agent")
for event in rollback_history:
    print(event["reason"], event["timestamp"])
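The three rollback conditions can be expressed as one small decision function. The sketch below is illustrative and makes two assumptions the README does not settle: the 10% success drop is treated as absolute percentage points, and the 20% latency increase is treated as relative to the pre-change average. `should_rollback` and the dict keys are hypothetical names, not the package's API.

```python
def should_rollback(before, after, consecutive_failures):
    """Return a reason string if the change should be undone, else None.

    `before` and `after` are dicts with "success_rate" (0..1) and
    "avg_latency_ms". Hypothetical helper, not the package's API.
    """
    # Assumption: the 10% drop is measured in absolute percentage points.
    if before["success_rate"] - after["success_rate"] > 0.10:
        return "success_rate_drop"
    # Assumption: the 20% increase is relative to the pre-change latency.
    if after["avg_latency_ms"] > before["avg_latency_ms"] * 1.20:
        return "latency_increase"
    if consecutive_failures >= 5:
        return "consecutive_failures"
    return None


baseline = {"success_rate": 0.95, "avg_latency_ms": 100.0}
print(should_rollback(baseline, {"success_rate": 0.80, "avg_latency_ms": 100.0}, 0))
print(should_rollback(baseline, {"success_rate": 0.93, "avg_latency_ms": 130.0}, 0))
print(should_rollback(baseline, {"success_rate": 0.93, "avg_latency_ms": 105.0}, 1))
```

Returning a reason rather than a bare boolean mirrors the `reason` field shown in the rollback history above.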

Pluggable notifier

The built-in TelegramNotifier is a stub — it logs to stdout. Override _send_message() to hook any channel:

import requests  # third-party; not a dependency of this package

from self_improving_loop import SelfImprovingLoop, TelegramNotifier

class MySlackNotifier(TelegramNotifier):
    """Send loop notifications to a Slack incoming webhook."""

    def __init__(self, webhook_url, **kw):
        super().__init__(**kw)
        self.webhook_url = webhook_url

    def _send_message(self, message, priority="normal"):
        # A timeout keeps a slow webhook from stalling the agent call.
        requests.post(
            self.webhook_url,
            json={"text": f"[{priority}] {message}"},
            timeout=5,
        )

loop = SelfImprovingLoop(notifier=MySlackNotifier(webhook_url="https://hooks..."))

Performance

Measured locally with benchmarks/overhead.py (200 iterations per workload, Python 3.12, Windows):

| Workload profile | Absolute overhead | Relative overhead |
|---|---|---|
| ~100 ms agent call (typical LLM) | +0.27 ms | +0.3% |
| ~10 ms agent call (tool call) | +0.31 ms | +3.0% |
| sub-millisecond call | +0.08 ms | large relative to the call (don't wrap these) |

The wrapper adds roughly 300 μs of fixed cost per call on the 10 ms and 100 ms workloads (trace append + threshold check). Whether that's negligible depends on your workload:

  • LLM calls (>500 ms): overhead is ≤0.06% — invisible
  • HTTP / DB calls (~30-100 ms): ≤1%
  • Fast in-memory work (<10 ms): 3%+ — reconsider whether you need this for those

Rerun the benchmark on your own machine with python benchmarks/overhead.py.
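The percentages above are just the fixed cost divided by the duration of the wrapped call. The helper below is a hypothetical one-liner for redoing that arithmetic with your own numbers; the 0.3 ms figure is the approximate per-call cost reported above, and the call durations are examples.

```python
def relative_overhead_pct(overhead_ms, call_ms):
    """Overhead as a percentage of the unwrapped call duration."""
    return overhead_ms / call_ms * 100


# 0.3 ms is the approximate fixed cost reported in this README.
for call_ms in (500, 100, 30, 10, 1):
    print(f"{call_ms:>4} ms call: +{relative_overhead_pct(0.3, call_ms):.2f}%")
```

Because the cost is fixed rather than proportional, halving the call duration doubles the relative overhead, which is why very fast functions are poor candidates for wrapping.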

Separate operation costs (triggered occasionally, not per-call):

| Operation | Cost |
|---|---|
| Failure analysis (only when threshold crossed) | ~100 ms |
| Applying improvement config | ~200 ms |
| Rollback execution | ~10 ms |

Not a...

  • ...methodology doc. Many "self-improving agent" repos are markdown templates that ask you to log learnings to CLAUDE.md. This is the runtime loop that does it for you.
  • ...heavyweight framework. 1,170 LoC of stdlib. Drop it next to your existing code. No decorators forced on you. No background process.
  • ...LLM-dependent. The analysis is statistical, not LLM-based. If you want LLM-authored config tweaks, subclass SelfImprovingLoop, override _analyze_failure(), and ask your favorite LLM there.

Background

Extracted from TaijiOS — a self-learning AI operating system with 5 I Ching–bound engines and a 346-heartbeat Ising physics experiment. The parent project has 14 modules; this one is the most generally reusable, so it lives as a standalone package.

The author is a former entrepreneur without a CS background who built TaijiOS through multi-AI collaboration, starting on Chinese New Year (2026-02-17), 60 days before this release.


License

MIT. Ship it wherever.

Contact / Feedback

This is a very early release. Every bug report, every "didn't work for me", every "I wish it did X" is read:

  • Email (preferred): yangfei222666@gmail.com
  • WeChat (secondary): yf529486
  • GitHub Issue: open one
  • Parent project: TaijiOS

"Safety first, then automation."
