Failure memory for AI agents — self-healing retry with structured learning
ReLoop
Failure memory for AI agents.
Every agent fails. ReLoop is the first framework that gets smarter from failure.
The Problem
AI agents often retry blindly: the same mistake produces the same failure, burning tokens and money. Few frameworks treat failure as data; most either retry with no memory or give up.
The Solution
ReLoop captures every failure into a structured memory graph -- error type, root cause, suggested fix, confidence score, semantic embedding -- so the next retry starts smarter. Your agents don't just recover. They get permanently smarter.
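To make the record fields above concrete, here is a minimal sketch of what one stored failure and a similarity lookup over embeddings could look like. The class name `FailureRecord`, its field names, and the helper `most_similar` are illustrative assumptions, not ReLoop's actual schema or API; the embeddings are toy two-dimensional vectors.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class FailureRecord:
    """Hypothetical shape of one stored failure (field names are assumptions)."""
    error_type: str
    root_cause: str
    suggested_fix: str
    confidence: float              # 0.0 to 1.0
    embedding: tuple[float, ...]   # semantic embedding of the failure text


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_embedding, records, k=1):
    """Return the k records whose embeddings are closest to the query."""
    ranked = sorted(records, key=lambda r: cosine(query_embedding, r.embedding), reverse=True)
    return ranked[:k]


records = [
    FailureRecord("ImportError", "sharp not installed", "npm install sharp", 0.9, (1.0, 0.0)),
    FailureRecord("TypeError", "undefined prop", "add a null check", 0.6, (0.0, 1.0)),
]
best = most_similar((0.9, 0.1), records)[0]
print(best.suggested_fix)
```

A real deployment would get embeddings from a model and delegate the nearest-neighbor search to the vector index in Redis rather than sorting in Python.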
Quick Start
pip install reloop-ai
reloop init
reloop demo
Three Ways to Use ReLoop
1. As a Library (any agent, 3 lines)
from reloop import FailureMemory
memory = FailureMemory(redis_url="redis://localhost:6379")
# search() is awaited, so call it from inside an async function
similar = await memory.search("ImportError sharp")  # Returns past failures + fixes
2. As a Framework (full self-healing loop)
reloop run "Fix and deploy the Next.js project at ./my-broken-app"
3. As an MCP Server (Claude Code / Cursor)
{
  "mcpServers": {
    "reloop": {
      "command": "python",
      "args": ["-m", "src.mcp_server"]
    }
  }
}
How It Works
The REJD Loop: Retrieve -> Execute -> Judge -> Distill
flowchart TD
    A([New Task]) --> R[Retrieve\nQuery Redis for similar past failures]
    R --> E[Execute\nRun in Blaxel sandbox]
    E --> J{Judge\nSuccess or failure?}
    J -- Success --> D_OK[Distill Success\nStore solution + learnings]
    D_OK --> DONE([Task Complete])
    J -- Failure --> D_FAIL[Distill Failure\nCapture root cause, fix, confidence]
    D_FAIL --> CB{Circuit breaker\nor budget exceeded?}
    CB -- Yes --> ABANDON([Task Abandoned])
    CB -- No --> R
Powered by:
- OpenAI Agents SDK -- orchestrates the REJD loop with handoffs between specialist agents
- Redis -- 3-tier failure memory (working, long-term, episodic)
- Blaxel -- Firecracker sandbox with 25ms checkpoint/restore
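The control flow of the REJD loop can be sketched in plain Python. This is a simplified illustration under stated assumptions, not ReLoop's implementation: `InMemoryFailureStore` stands in for Redis, the `execute` callable stands in for the sandboxed run, and the names `rejd_loop`, `max_attempts`, `budget_usd`, and `cost_per_attempt` are hypothetical. The attempt limit plays the role of the circuit breaker.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryFailureStore:
    """Stand-in for the Redis-backed failure memory (hypothetical interface)."""
    records: list = field(default_factory=list)

    def retrieve(self, task):
        # Retrieve: exact-match lookup here; semantic search is assumed in practice.
        return [r for r in self.records if r["task"] == task]

    def distill(self, task, outcome, detail):
        # Distill: persist what was learned from this attempt.
        self.records.append({"task": task, "outcome": outcome, "detail": detail})


def rejd_loop(task, execute, store, max_attempts=5, budget_usd=1.00, cost_per_attempt=0.05):
    """Run Retrieve -> Execute -> Judge -> Distill until success or abandonment."""
    spent = 0.0
    attempts = 0
    # Stop when the attempt limit (circuit breaker) or the budget would be exceeded.
    while attempts < max_attempts and spent + cost_per_attempt <= budget_usd:
        hints = store.retrieve(task)            # Retrieve past outcomes for this task
        ok, detail = execute(task, hints)       # Execute (a sandbox in the real system)
        attempts += 1
        spent += cost_per_attempt
        if ok:                                  # Judge
            store.distill(task, "success", detail)
            return {"status": "complete", "attempts": attempts}
        store.distill(task, "failure", detail)  # Distill the failure, then loop again
    return {"status": "abandoned", "attempts": attempts}


def flaky_build(task, hints):
    """Toy executor: succeeds only once a previous failure is available as a hint."""
    if any(h["outcome"] == "failure" for h in hints):
        return True, "installed sharp before building"
    return False, "ImportError: sharp"


result = rejd_loop("build", flaky_build, InMemoryFailureStore())
print(result)
```

The toy executor fails on its first attempt and succeeds on the second, once the stored failure is retrieved as a hint.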
Timeline UI
The non-chat interface that makes failure learning visible.
A horizontal timeline of colored nodes tells the full story at a glance:
RED (failed) -> RED (failed) -> RED (failed) -> GREEN (succeeded)
Click any node to inspect the full failure record -- root cause, suggested fix, confidence score, cost, and the exact code diff that resolved it.
Integrations
Works with any agent framework:
- OpenAI Agents SDK
- LangGraph
- CrewAI
- Claude Agent SDK
- Raw Python
ReLoop is the memory layer -- bring your own orchestration.
A/B: Memory vs No Memory
| Metric | Without Memory | With Memory |
|---|---|---|
| Attempts to fix 4 bugs | 12+ | 4 |
| Total cost | $0.47 | $0.18 |
| Same mistake repeated | 3x | 0x |
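As a quick check of what the table implies, the relative savings can be computed directly from its figures. The snippet uses only the numbers above; since the table lists "12+" attempts without memory, 12 is treated as a lower bound.

```python
# Figures copied from the A/B table; "12+" is treated as a lower bound of 12.
without_memory = {"attempts": 12, "cost_usd": 0.47}
with_memory = {"attempts": 4, "cost_usd": 0.18}

cost_saving = 1 - with_memory["cost_usd"] / without_memory["cost_usd"]
attempt_ratio = without_memory["attempts"] / with_memory["attempts"]

print(f"Cost reduction: {cost_saving:.1%}")               # Cost reduction: 61.7%
print(f"Attempts: at least {attempt_ratio:.0f}x fewer")   # Attempts: at least 3x fewer
```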
API Reference
Full API reference: docs/api-reference.md
| Method | Path | Description |
|---|---|---|
| POST | /v1/tasks | Create and run a task |
| GET | /v1/tasks/{id} | Get task status and result |
| GET | /v1/tasks/{id}/timeline | Full execution timeline |
| GET | /v1/tasks/{id}/sse | Server-Sent Events stream |
| POST | /v1/memories/search | Semantic search over failure memory |
| GET | /v1/memories/stats | Aggregated memory statistics |
| GET | /v1/tasks/{id}/checkpoints | List sandbox checkpoints |
| POST | /v1/tasks/{id}/checkpoints/{cid}/restore | Rewind to checkpoint |
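A client for the endpoints listed above might build its requests as follows. This sketch uses only the standard library and does not send anything. The base URL, the task ID, and the JSON field names `prompt` and `query` are assumptions made for illustration; the real request schemas are in docs/api-reference.md.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: where the API server listens


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for a ReLoop endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


# "prompt" and "query" are guessed field names, not confirmed by the docs.
create_task = build_request("POST", "/v1/tasks", {"prompt": "Fix ./my-broken-app"})
search = build_request("POST", "/v1/memories/search", {"query": "ImportError sharp"})

task_id = "task-123"  # hypothetical ID, as would be returned by POST /v1/tasks
status = build_request("GET", f"/v1/tasks/{task_id}")
timeline = build_request("GET", f"/v1/tasks/{task_id}/timeline")

print(status.get_method(), status.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send the request to a running server.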
Architecture
+-------------------+     +-------------------+     +-------------------+
| OpenAI Agents     |     | Redis Agent       |     | Blaxel            |
| SDK               |     | Memory Server     |     | Firecracker VMs   |
|                   |     |                   |     |                   |
| Orchestrates the  |     | 3-tier memory:    |     | Perpetual state   |
| REJD loop with    |<--->| - Working         |     | 25ms resume       |
| specialist agent  |     | - Long-term       |<--->| Checkpoint/       |
| handoffs          |     | - Episodic        |     | restore           |
+-------------------+     +-------------------+     +-------------------+
          |                         |                         |
          v                         v                         v
    +---------------------------------------------------------------+
    |                         FastAPI + SSE                         |
    |    Task management, memory search, checkpoints, streaming     |
    +---------------------------------------------------------------+
                                    |
                                    v
    +---------------------------------------------------------------+
    |                       Next.js Dashboard                       |
    |    Timeline UI, failure sidebar, live output, cost tracker    |
    +---------------------------------------------------------------+
| Layer | Technology | Role |
|---|---|---|
| Orchestration | OpenAI Agents SDK | REJD loop with specialist agent handoffs |
| Failure Memory | Redis Agent Memory Server | 3-tier: working memory, long-term failure graph, episodic traces |
| Execution Sandbox | Blaxel Firecracker microVMs | Perpetual state, 25ms resume, checkpoint/restore |
| API | FastAPI + SSE | Task management, memory search, real-time streaming |
| Dashboard | Next.js + Tailwind + shadcn/ui | Timeline, failure sidebar, cost tracker |
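The API layer streams updates over Server-Sent Events, a line-based text format in which `event:` and `data:` fields are grouped into events separated by blank lines. The sketch below parses that generic format. It handles only the `event` and `data` fields and comment lines, and the event name and payloads in the sample stream are invented for illustration, not ReLoop's actual messages.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE-formatted text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # The format allows one optional space after the colon.
            data.append(line[len("data:"):].removeprefix(" "))


# Sample stream with made-up event names and payloads.
sample = [
    "event: attempt\n",
    'data: {"status": "failed"}\n',
    "\n",
    ": keep-alive\n",
    "data: done\n",
    "\n",
]
events = list(parse_sse(sample))
print(events)
```

Events without an explicit `event:` field default to the name `message`, which is how the second event in the sample is reported.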
Contributing
We welcome contributions. See CONTRIBUTING.md for:
- Development environment setup
- Code style requirements (ruff, mypy)
- PR process and review checklist
- Architecture overview for new contributors
License
Apache 2.0 -- see LICENSE for the full text.