ErrorRecovery-Engine
Self-healing agent middleware that intercepts agent errors and injects recovery prompts. Drop-in, works with any agent.
How It Works
Agent Error → Classify → Pattern Match → Inject Recovery Prompt → Retry → Verify
↓ (no match)
LLM Fallback Prompt
- Classify the error into one of 9 categories (bash, edit, read, write, test, network, import, type, unknown)
- Match against a database of recovery patterns using FAISS semantic search
- Inject the recovery prompt into the agent's next execution
- Verify the recovery succeeded; retry with exponential backoff if needed
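The flow above can be sketched as a small retry loop. This is an illustrative sketch only, not the library's implementation: `PATTERNS`, `match_pattern`, `recover_with_retries`, and the fallback prompt text are hypothetical names made up for the example, the lookup is a plain regex scan standing in for FAISS semantic search, and the backoff delay is left out to keep it short.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern table: regex -> recovery prompt (not the library's data).
PATTERNS = {
    r"command not found": "The command is missing. Check the name or install it.",
    r"permission denied": "Permission was denied. Check file modes and ownership.",
}


def match_pattern(error: str) -> Optional[str]:
    """Return the first recovery prompt whose regex matches the error text."""
    for regex, prompt in PATTERNS.items():
        if re.search(regex, error, re.IGNORECASE):
            return prompt
    return None


def recover_with_retries(
    action: Callable[[Optional[str]], str],
    max_attempts: int = 3,
) -> str:
    """Run `action`, passing it a recovery prompt after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    hint: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Verify step: here, "no exception" counts as success.
            return action(hint)
        except Exception as exc:
            if attempt == max_attempts:
                raise
            # Pattern match first; otherwise build a generic fallback prompt.
            hint = match_pattern(str(exc)) or (
                f"An unrecognized error occurred: {exc}. Diagnose and retry."
            )
    raise AssertionError("loop always returns or raises")
```

The `action` callable receives `None` on the first attempt and the injected prompt on later ones, which mirrors the "inject into the agent's next execution" step.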
Installation
pip install error-recovery
With server support:
pip install "error-recovery[server]"
From source:
cd error-recovery
pip install -e ".[dev]"
Quick Start
As Middleware (Drop-in)
from error_recovery import ErrorRecovery
class MyAgent:
    def run(self, prompt: str, **kwargs) -> str:
        # Your agent logic here
        return "result"
agent = MyAgent()
# Use as context manager
with ErrorRecovery(agent) as mw:
    result = agent.run("do something risky")
# Or create middleware directly
from error_recovery import ErrorRecoveryMiddleware
mw = ErrorRecoveryMiddleware(agent)
# my_function and args are placeholders for your own tool callable and its arguments
wrapped_tool = mw.wrap_tool_call(my_function, tool_name="my_tool")
result = wrapped_tool(args)
As Engine (Programmatic)
from error_recovery import ErrorRecoveryEngine, ErrorRecoveryConfig
config = ErrorRecoveryConfig(max_attempts=3, similarity_threshold=0.8)
engine = ErrorRecoveryEngine(config=config)
result = engine.recover_sync(
    error_message="bash: command not found: xyz",
    context="Running shell command",
    tool_name="bash",
)
print(f"Category: {result.error_category.value}")
print(f"Recovery: {result.recovery_prompt}")
print(f"Success: {result.success}")
As CLI
# Test recovery on an error message
error-recovery recover --error "command not found: xyz" --tool bash
# Analyze error patterns in a trace file
error-recovery analyze trace.jsonl
# Build a FAISS index from pattern data
error-recovery build-index --data ./patterns/ --output ./my_index/
# Start the recovery API server
error-recovery serve --port 8000
As API Server
pip install "error-recovery[server]"
error-recovery serve --port 8000
# Recover from an error
curl -X POST http://localhost:8000/recover \
-H "Content-Type: application/json" \
-d '{"error_message": "command not found: xyz", "tool_name": "bash"}'
# Health check
curl http://localhost:8000/health
# Stats
curl http://localhost:8000/stats
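The same `/recover` call can be made from Python with only the standard library. A minimal sketch, assuming the server is running locally on port 8000 as shown above and that it answers with a JSON body; the response schema is not documented here, so the result is returned as a plain dict. `build_recover_request` and `recover` are hypothetical helper names, not part of the package.

```python
import json
import urllib.request


def build_recover_request(
    error_message: str,
    tool_name: str,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST request that the curl example sends to /recover."""
    payload = json.dumps(
        {"error_message": error_message, "tool_name": tool_name}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/recover",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recover(error_message: str, tool_name: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_recover_request(error_message, tool_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server started, `recover("command not found: xyz", "bash")` sends the same payload as the curl command.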
Error Categories
| Category | Description | Examples |
|---|---|---|
| BASH_ERROR | Shell/command errors | command not found, permission denied, exit code ≠ 0 |
| EDIT_ERROR | File editing errors | pattern not matched, multiple matches |
| READ_ERROR | File read errors | file not found, permission denied |
| WRITE_ERROR | File write errors | permission denied, disk full |
| TEST_ERROR | Test failures | assertion failed, timeout, fixture not found |
| NETWORK_ERROR | Network errors | connection refused, DNS failure, SSL errors |
| IMPORT_ERROR | Import/dependency errors | ModuleNotFoundError, circular import |
| TYPE_ERROR | Type/value errors | TypeError, KeyError, AttributeError |
| UNKNOWN | Unclassified errors | - |
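To make the categories concrete, here is a rough sketch of regex and keyword classification over the nine categories. The `Category` enum, `RULES`, `TOOL_DEFAULTS`, and `classify` are hypothetical and written for illustration; the package's own classifier may use different rules, ordering, and names.

```python
import re
from enum import Enum
from typing import Optional


class Category(Enum):
    BASH_ERROR = "bash_error"
    EDIT_ERROR = "edit_error"
    READ_ERROR = "read_error"
    WRITE_ERROR = "write_error"
    TEST_ERROR = "test_error"
    NETWORK_ERROR = "network_error"
    IMPORT_ERROR = "import_error"
    TYPE_ERROR = "type_error"
    UNKNOWN = "unknown"


# Ordered (regex, category) rules; the first match wins. Illustrative only.
RULES = [
    (r"ModuleNotFoundError|ImportError|circular import", Category.IMPORT_ERROR),
    (r"TypeError|KeyError|AttributeError", Category.TYPE_ERROR),
    (r"connection refused|\bDNS\b|\bSSL\b", Category.NETWORK_ERROR),
    (r"assertion failed|AssertionError|fixture .* not found", Category.TEST_ERROR),
    (r"command not found|exit code [1-9]", Category.BASH_ERROR),
]

# Ambiguous messages such as "permission denied" are resolved by the tool name.
TOOL_DEFAULTS = {
    "bash": Category.BASH_ERROR,
    "edit": Category.EDIT_ERROR,
    "read": Category.READ_ERROR,
    "write": Category.WRITE_ERROR,
    "test": Category.TEST_ERROR,
}


def classify(error_message: str, tool_name: Optional[str] = None) -> Category:
    """Classify by message rules first, then by tool name, else UNKNOWN."""
    for regex, category in RULES:
        if re.search(regex, error_message, re.IGNORECASE):
            return category
    return TOOL_DEFAULTS.get(tool_name or "", Category.UNKNOWN)
```

Falling back to the tool name is one way to separate READ_ERROR from WRITE_ERROR when both report "permission denied".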
Configuration
from error_recovery import ErrorRecoveryConfig
config = ErrorRecoveryConfig(
    max_attempts=3,                 # Max recovery attempts
    similarity_threshold=0.8,       # Minimum similarity for pattern match
    fallback_to_llm=True,           # Use LLM fallback when no pattern matches
    backoff_base=2.0,               # Exponential backoff base
    backoff_max=30.0,               # Max backoff seconds
    cooling_period_seconds=0.0,     # Cooldown between retries
    model_name="all-MiniLM-L6-v2",  # Sentence transformer model
    top_k=5,                        # Number of pattern matches to return
    track_success_rates=True,       # Track pattern success rates
)
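As a rough guide to how `backoff_base` and `backoff_max` interact, the sketch below computes a capped exponential schedule. The formula `min(backoff_base ** attempt, backoff_max)` is an assumption about how such settings are commonly applied, not a statement of what the engine does internally; `backoff_delays` is a hypothetical helper, and jitter and the cooling period are not modeled.

```python
from typing import List


def backoff_delays(
    max_attempts: int = 3,
    backoff_base: float = 2.0,
    backoff_max: float = 30.0,
) -> List[float]:
    """Seconds to wait before each retry, capped at backoff_max.

    There is one delay between consecutive attempts, so max_attempts
    attempts produce max_attempts - 1 delays.
    """
    delays = []
    for attempt in range(1, max_attempts):
        # Assumed schedule: base ** attempt, never above the cap.
        delays.append(min(backoff_base ** attempt, backoff_max))
    return delays


print(backoff_delays())                # [2.0, 4.0]
print(backoff_delays(max_attempts=7))  # [2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Under this assumption the defaults wait 2 s and then 4 s, and the 30 s cap only matters from the fifth retry onward.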
Pattern Matching
The engine uses sentence-transformers for semantic similarity and FAISS for fast nearest-neighbor search. Each pattern has:
- error_type: Category (e.g., bash_error)
- pattern: Regex pattern for fast exact matching
- error_message: Description for semantic matching
- recovery_prompt: The prompt to inject for recovery
- success_rate: Historical success rate (0.0–1.0)
- tags: Searchable tags
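The pattern fields suggest a two-stage lookup: a regex pass for exact hits, then a similarity search over the descriptions. The sketch below illustrates that idea with the standard library only. It swaps in a bag-of-words cosine similarity for sentence-transformers embeddings and a linear scan for the FAISS index, so scores will differ from the real engine; trying the regex before the similarity search is also an assumption. `Pattern` and `find_matches` are hypothetical names.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Pattern:
    error_type: str
    pattern: str
    error_message: str
    recovery_prompt: str
    success_rate: float = 0.0
    tags: List[str] = field(default_factory=list)


def _vector(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def find_matches(
    error: str,
    patterns: List[Pattern],
    similarity_threshold: float = 0.8,
    top_k: int = 5,
) -> List[Tuple[Pattern, float]]:
    # Stage 1: regex hits are treated as exact matches with score 1.0.
    exact = [(p, 1.0) for p in patterns if re.search(p.pattern, error, re.IGNORECASE)]
    if exact:
        return exact[:top_k]
    # Stage 2: rank by similarity to each description, drop weak matches.
    query = _vector(error)
    scored = [(p, _cosine(query, _vector(p.error_message))) for p in patterns]
    scored = [(p, s) for p, s in scored if s >= similarity_threshold]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

An empty result corresponds to the "no match" branch, where the LLM fallback prompt would be used.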
Built-in Patterns
- bash_errors.json — 53 common bash error patterns
- edit_errors.json — 15 file editing patterns
- test_errors.json — 15 test failure patterns
- general.json — 25 network, import, type, and general patterns
Custom Patterns
Add your own patterns as JSON files in the patterns directory:
[
  {
    "error_type": "bash_error",
    "error_message": "terraform apply failed",
    "pattern": "terraform.*apply.*failed",
    "recovery_prompt": "Terraform apply failed. Check: (1) run 'terraform plan' first, (2) check for state drift, (3) verify provider credentials.",
    "success_rate": 0.70,
    "tags": ["terraform", "iac"]
  }
]
Load them:
from error_recovery import PatternMatcher
matcher = PatternMatcher()
matcher.load_patterns("/path/to/custom_patterns/")
matcher.build_index()
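Before loading a custom file, it can help to sanity-check it. The sketch below is a standalone validator, not part of the package: `validate_patterns` and `validate_file` are hypothetical helpers, and treating `error_type`, `error_message`, `pattern`, and `recovery_prompt` as required is an assumption based on the example entry above.

```python
import json
import re
from pathlib import Path
from typing import List

# Assumed to be required; the package may be more or less strict.
REQUIRED_FIELDS = ("error_type", "error_message", "pattern", "recovery_prompt")


def validate_patterns(entries: list) -> List[str]:
    """Return a list of human-readable problems; empty means the data looks fine."""
    problems = []
    for index, entry in enumerate(entries):
        for name in REQUIRED_FIELDS:
            if not entry.get(name):
                problems.append(f"entry {index}: missing '{name}'")
        try:
            # A pattern that does not compile can never match.
            re.compile(str(entry.get("pattern") or ""))
        except re.error as exc:
            problems.append(f"entry {index}: invalid regex ({exc})")
        rate = entry.get("success_rate", 0.0)
        if not isinstance(rate, (int, float)) or not 0.0 <= rate <= 1.0:
            problems.append(f"entry {index}: success_rate must be in [0.0, 1.0]")
    return problems


def validate_file(path: str) -> List[str]:
    """Validate one JSON file containing a list of pattern entries."""
    return validate_patterns(json.loads(Path(path).read_text(encoding="utf-8")))
```

Running `validate_file` on each JSON file in the patterns directory before `load_patterns` catches malformed regexes early.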
Callbacks
from error_recovery import ErrorRecoveryMiddleware
from error_recovery.models import RecoveryResult  # defined in models.py (see Architecture)
def on_recovery(result: RecoveryResult):
    print(f"Recovered: {result.original_error} → {result.recovery_prompt[:50]}")
def on_failure(error: str):
    print(f"Failed to recover: {error}")
def on_success(tool_name: str):
    print(f"Tool {tool_name} succeeded after recovery")
mw = ErrorRecoveryMiddleware(
    on_recovery=on_recovery,
    on_failure=on_failure,
    on_success=on_success,
)
Architecture
error_recovery/
├── models.py # Pydantic models (ErrorPattern, RecoveryResult, Config)
├── error_classifier.py # Regex + keyword error classification
├── pattern_matcher.py # FAISS + sentence-transformers semantic matching
├── recovery_engine.py # Core recovery logic (classify → match → inject → verify)
├── middleware.py # Drop-in agent middleware wrapper
├── cli.py # CLI: recover, analyze, build-index, serve
└── patterns/ # Built-in recovery patterns
    ├── bash_errors.json # 53 patterns
    ├── edit_errors.json # 15 patterns
    ├── test_errors.json # 15 patterns
    └── general.json # 25 patterns
Running Tests
pip install -e ".[dev]"
pytest tests/ -v
License
MIT
Ecosystem
Part of the FableForge ecosystem — 21 open-source projects built from 210K real agent traces:
| Project | Description |
|---|---|
| Anvil | Self-verified coding agent |
| VerifyLoop | Plan→Execute→Verify→Recover framework |
| ErrorRecovery | Self-healing middleware (3,725 error patterns) |
| FableForge-14B | The fine-tuned 14B model (4-stage training) |
| ShellWhisperer | 1.5B edge agent (phone/RPi, 50ms) |
| ReasonCritic | Verification model (130 benchmark tasks) |
| TraceCompiler | Compile traces → LoRA skills |
| AgentRuntime | Persistent agent daemon (systemd for AI) |
| AgentSwarm | Multi-agent from real trace transitions |
| AgentTelemetry | Datadog for agents (token tracking, costs) |
| BenchAgent | HumanEval for tool-use (107 tasks) |
| AgentDev | VSCode extension with verification |
| TraceViz | Trace replay visualizer (Next.js) |
| AgentSkills | npm for agent behaviors |
| AgentCurriculum | 5-stage progressive training |
| AgentFuzzer | Adversarial testing for agents |
| AgentConstitution | Safety guardrails from traces |
| CostOptimizer | Token cost reduction (50-80%) |
| AgentProfiler | Behavioral fingerprinting |
| TrajectoryDistiller | Trace→training data pipeline |
| Fable5-Dataset | HuggingFace dataset release |