Adaptive Runtime Layer for Stateful AI Systems
Adaptive Runtime
Runtime Intelligence Layer for Stateful AI Systems
Not a chatbot framework. Not an LLM wrapper. Not a workflow builder.
An adaptive runtime intelligence layer: the missing piece between your AI logic and production reality.
The Problem
Most AI frameworks solve the model problem.
Nobody solves the runtime problem.
Your AI agent in development: Works perfectly.
Your AI agent in production: Crashes. Forgets state. Retries blindly. Dies silently.
Production AI systems fail because of:
- No crash recovery: state lost on restart
- No memory: agent forgets context between sessions
- Retry chaos: blind retries with no back-off
- No confidence scoring: decisions made without certainty
- No contextual awareness: can't adapt to changing conditions
Adaptive Runtime fixes this.
See It Running
[16:08:13][RUNTIME] Event received: service_overload
[16:08:13][CONTEXT_ENGINE] risk=high stability=low pressure=0.65
[16:08:13][CONFIDENCE_ENGINE] confidence=0.84
[16:08:13][DECISION_ENGINE] ACTION: RESTART_SERVICE
[16:08:13][STATE_ENGINE] State persisted
[16:08:13][RECOVERY_ENGINE] Checkpoint #3 created
→ restart_service [high] conf=0.840
[16:08:14][RUNTIME] Event received: anomaly_detected
[16:08:14][CONTEXT_ENGINE] risk=low stability=stable pressure=0.32
[16:08:14][CONFIDENCE_ENGINE] confidence=0.62
[16:08:14][DECISION_ENGINE] ACTION: FLAG_FOR_REVIEW
[16:08:14][STATE_ENGINE] State persisted
→ flag_for_review [low] conf=0.620
The runtime thinks, decides, remembers, and recovers automatically.
How It Works
Event (CPU spike, anomaly, timeout, auth failure...)
           │
           ▼
┌─────────────────────┐
│   Context Engine    │  ← Analyzes conditions: risk, stability, pressure score
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Confidence Engine  │  ← Calculates adaptive confidence (with decay + history)
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   Decision Engine   │  ← Selects action: restart / throttle / rollback / recover...
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│    State Engine     │  ← Persists state to SQLite (survives crashes)
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   Recovery Engine   │  ← Creates checkpoint, handles retry with back-off
└─────────────────────┘
Quick Start
pip install adaptive-runtime
import asyncio
from adaptive_runtime import Runtime

async def main():
    runtime = Runtime(agent_id="my-agent")
    await runtime.start()

    result = await runtime.process({
        "type": "service_overload",
        "severity": 0.82,
        "cpu": 94,
        "memory": 88,
    })

    print(result.action)      # "restart_service"
    print(result.confidence)  # 0.7831
    print(result.reason)      # "high_resource_pressure"
    print(result.priority)    # "high"

    await runtime.stop()

asyncio.run(main())
That's it. No API keys. No cloud setup. No GPU. Runs on a $5 VPS.
Killer Example: Adaptive Monitoring System
import asyncio
from adaptive_runtime import Runtime

async def monitor():
    runtime = Runtime(agent_id="prod-monitor", checkpoint_every=5)

    # Subscribe to critical events
    @runtime.bus.subscribe("anomaly_detected")
    async def on_anomaly(event):
        print(f"  → Anomaly handler fired: severity={event['severity']}")

    await runtime.start()

    # Simulate real production events
    events = [
        {"type": "service_overload", "severity": 0.91, "cpu": 96, "memory": 92},
        {"type": "anomaly_detected", "severity": 0.74, "error_rate": 0.6},
        {"type": "auth_failure", "severity": 0.55},
        {"type": "timeout", "severity": 0.45, "latency_ms": 4200},
        {"type": "recovery_needed", "severity": 0.30},
    ]

    for event in events:
        result = await runtime.process(event)
        print(f"  [{result.priority.upper()}] {event['type']:25s} → {result.action}")

    # Runtime remembers everything
    history = await runtime.event_history(limit=5)
    print(f"\n  Last {len(history)} events remembered across sessions.")

    await runtime.stop()

asyncio.run(monitor())
Output:
  [HIGH] service_overload → scale_up_immediate
  [NORMAL] anomaly_detected → flag_for_review
  → Anomaly handler fired: severity=0.74
  [NORMAL] auth_failure → trigger_security_audit
  [LOW] timeout → cache_warmup
  [LOW] recovery_needed → run_recovery

  Last 5 events remembered across sessions.
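The decorator-style subscription used in the example can be illustrated with a small stand-alone pub/sub bus. This is a minimal sketch, not the library's event bus: `SketchBus` and its `publish` method are hypothetical names chosen for illustration, and only the `subscribe("...")` decorator shape is taken from the example above.

```python
import asyncio
from collections import defaultdict


class SketchBus:
    """Hypothetical async pub/sub bus; not the library's implementation."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type):
        # Used as a decorator: registers the coroutine and returns it unchanged.
        def register(handler):
            self._handlers[event_type].append(handler)
            return handler
        return register

    async def publish(self, event):
        # Run every handler registered for this event type concurrently.
        handlers = self._handlers.get(event["type"], [])
        await asyncio.gather(*(handler(event) for handler in handlers))
        return len(handlers)


bus = SketchBus()
seen = []


@bus.subscribe("anomaly_detected")
async def on_anomaly(event):
    seen.append(event["severity"])


fired = asyncio.run(bus.publish({"type": "anomaly_detected", "severity": 0.74}))
ignored = asyncio.run(bus.publish({"type": "timeout", "severity": 0.45}))
print(fired, ignored, seen)
```

Events with no subscriber are simply dropped in this sketch; a production bus would likely also isolate handler failures so one failing subscriber cannot block the others.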
Why Not LangChain?
This question will come up. Here's the honest answer:
| | LangChain / AutoGen | Adaptive Runtime |
|---|---|---|
| Purpose | LLM orchestration | Runtime behavior |
| Core abstraction | Prompt chains | Stateful events |
| Intelligence | Language model | Probabilistic engine |
| Dependencies | Heavy (openai, tiktoken, ...) | Minimal (pydantic, aiosqlite) |
| GPU required | Sometimes | Never |
| Crash recovery | No | Yes, built-in |
| State persistence | External setup | Yes, built-in SQLite |
| Confidence scoring | No | Yes, adaptive |
| Runs on $5 VPS | Barely | Yes, designed for it |
| Use case | Chat, RAG, agents | Runtime resilience |
TL;DR: LangChain makes LLMs useful. Adaptive Runtime makes AI systems reliable.
They solve different problems. Use both, or use this standalone.
Runtime Philosophy
Most AI problems in production are not model problems.
They are runtime problems.
Adaptive Runtime is built around the belief that future AI systems need:
- Memory: state that survives crashes and restarts
- Resilience: self-healing with checkpoints and retry logic
- Contextual behavior: decisions that adapt to real conditions
- Confidence awareness: knowing how certain a decision is
- Lightweight cognition: intelligence without neural dependency
Not just prompts. Not just workflows. Runtime intelligence.
The 5 Core Engines
1. State Engine
Persistent agent memory. Survives crashes. SQLite by default.
await state_engine.save_state({"health": "ok", "version": "1.2"})
state = await state_engine.load_state() # Restored after restart
await state_engine.patch_state({"last": "ok"}) # Partial update
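To make the save / load / patch semantics concrete, here is a minimal sketch of crash-safe state persistence using the standard-library `sqlite3` module. It is an assumption-laden illustration, not the library's State Engine: `SketchStateStore`, the table layout, and the synchronous API are all hypothetical (the package itself lists `aiosqlite` as a dependency and exposes async calls).

```python
import json
import sqlite3


class SketchStateStore:
    """Hypothetical stand-in for a state engine: one JSON blob per agent."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" for state that survives restarts.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS state "
            "(agent_id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self._db.commit()

    def save_state(self, agent_id, state):
        # Replace the previous snapshot for this agent in one statement.
        self._db.execute(
            "INSERT OR REPLACE INTO state (agent_id, body) VALUES (?, ?)",
            (agent_id, json.dumps(state)),
        )
        self._db.commit()

    def load_state(self, agent_id):
        row = self._db.execute(
            "SELECT body FROM state WHERE agent_id = ?", (agent_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}

    def patch_state(self, agent_id, changes):
        # Partial update: merge the new keys into whatever is already stored.
        merged = {**self.load_state(agent_id), **changes}
        self.save_state(agent_id, merged)
        return merged


store = SketchStateStore()
store.save_state("my-agent", {"health": "ok", "version": "1.2"})
store.patch_state("my-agent", {"last": "ok"})
print(store.load_state("my-agent"))
# {'health': 'ok', 'version': '1.2', 'last': 'ok'}
```

Storing the whole state as one JSON document keeps the schema trivial; the trade-off is that every patch rewrites the full blob.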
2. Context Engine
Transforms raw signals into contextual understanding, with no ML needed.
ctx = context_engine.analyze({
    "type": "service_overload", "cpu": 94, "memory": 88, "severity": 0.82
})
# → risk="high", stability="low", context="resource_pressure", pressure=0.65
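One way to classify an event without ML is a weighted blend of its numeric signals followed by fixed thresholds. The sketch below shows that idea only; the weights, thresholds, label names, and the `SketchContext` type are invented for illustration, so it will not reproduce the library's numbers (for example the `pressure=0.65` shown above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchContext:
    risk: str
    stability: str
    pressure: float


def analyze(event):
    """Hypothetical rule-based classifier; weights and thresholds are made up."""
    cpu = event.get("cpu", 0) / 100.0
    memory = event.get("memory", 0) / 100.0
    severity = event.get("severity", 0.0)
    # Weighted blend of resource usage and reported severity, clamped to [0, 1].
    pressure = min(1.0, max(0.0, 0.4 * cpu + 0.3 * memory + 0.3 * severity))
    pressure = round(pressure, 2)
    if pressure >= 0.75:
        return SketchContext("high", "low", pressure)
    if pressure >= 0.40:
        return SketchContext("medium", "degraded", pressure)
    return SketchContext("low", "stable", pressure)


ctx = analyze({"type": "service_overload", "cpu": 94, "memory": 88, "severity": 0.82})
print(ctx.risk, ctx.stability, ctx.pressure)
```

Because every step is plain arithmetic and comparisons, the classification is cheap to run and easy to explain after the fact.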
3. Confidence Engine
Adaptive probabilistic scoring with historical weighting and decay.
conf = confidence_engine.calculate(event, context_risk="high")
# → conf.final = 0.7831 (lower when risk is high, adapts from history)
confidence_engine.record_outcome(success=True, confidence=0.78, context_risk="high")
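"Historical weighting and decay" can be sketched as a success rate in which older outcomes count for less. The class below is a hypothetical illustration, not the library's Confidence Engine: the base formula, the risk penalties, the half-life parameter, and the `now` argument (added so the example is deterministic) are all assumptions.

```python
import time
from typing import Optional


class SketchConfidence:
    """Hypothetical scorer: base score, risk penalty, decayed outcome history."""

    def __init__(self, half_life_s: float = 300.0) -> None:
        self.half_life_s = half_life_s
        self._outcomes = []  # list of (timestamp, success) pairs

    def record_outcome(self, success: bool, now: Optional[float] = None) -> None:
        self._outcomes.append((time.time() if now is None else now, success))

    def calculate(self, severity: float, context_risk: str,
                  now: Optional[float] = None) -> float:
        now = time.time() if now is None else now
        base = 0.5 + 0.4 * severity  # made-up base formula
        penalty = {"high": 0.10, "medium": 0.05}.get(context_risk, 0.0)
        weighted_success = total_weight = 0.0
        for timestamp, success in self._outcomes:
            # Each outcome loses half its weight every half_life_s seconds.
            weight = 0.5 ** ((now - timestamp) / self.half_life_s)
            weighted_success += weight * success
            total_weight += weight
        # Map the decayed success rate from [0, 1] to an adjustment in [-0.1, +0.1].
        history = (weighted_success / total_weight - 0.5) * 0.2 if total_weight else 0.0
        return min(1.0, max(0.0, base - penalty + history))


engine = SketchConfidence(half_life_s=100.0)
cold = engine.calculate(0.8, "high", now=0.0)
engine.record_outcome(False, now=0.0)    # old failure, heavily decayed later
engine.record_outcome(True, now=200.0)   # recent success, full weight
warm = engine.calculate(0.8, "high", now=200.0)
print(round(cold, 2), round(warm, 2))
```

In this sketch a recent success outweighs an older failure, so the second score is higher than the cold-start score for the same event.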
4. Decision Engine
Explainable rule-based action selection. Extensible with custom rules.
decision = decision_engine.decide(event, "resource_pressure", "high", 0.78)
# → action="restart_service", reason="high_resource_pressure", priority="high"
# Add your own rules:
custom_rules = [("my_context", "high", 0.70, "my_action", "my_reason")]
engine = DecisionEngine(custom_rules=custom_rules)
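A rule table like the one above can be evaluated with a simple first-match scan. The sketch below assumes the five tuple fields mean (context, risk, minimum confidence, action, reason); that reading, the `Rule` type, the fallback rows, and the `no_action` default are hypothetical and not confirmed by the library.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    context: str
    risk: str
    min_confidence: float
    action: str
    reason: str


# Hypothetical defaults; only the first row echoes values shown in this README.
DEFAULT_RULES = [
    Rule("resource_pressure", "high", 0.70, "restart_service", "high_resource_pressure"),
    Rule("resource_pressure", "high", 0.00, "flag_for_review", "confidence_below_threshold"),
]


def decide(context, risk, confidence, custom_rules=()):
    # Custom rules are checked first, so they can override the defaults.
    rules = [Rule(*r) for r in custom_rules] + DEFAULT_RULES
    for rule in rules:
        matches = (rule.context, rule.risk) == (context, risk)
        if matches and confidence >= rule.min_confidence:
            return rule.action, rule.reason
    return "no_action", "no_matching_rule"


print(decide("resource_pressure", "high", 0.78))
print(decide("resource_pressure", "high", 0.40))
print(decide("my_context", "high", 0.75,
             custom_rules=[("my_context", "high", 0.70, "my_action", "my_reason")]))
```

Returning the reason alongside the action is what makes this style of engine explainable: every decision can be traced to the single rule that fired.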
5. Recovery Engine
Crash recovery, checkpoint snapshots, exponential back-off retry.
await recovery_engine.create_checkpoint(state) # Save checkpoint
state = await recovery_engine.restore_latest() # Restore after crash
result = await recovery_engine.retry(fn, fallback=fallback_fn) # Retry with back-off
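Exponential back-off with a fallback can be sketched in a few lines of `asyncio`. This is an illustrative helper under stated assumptions, not the library's `retry`: the `attempts`, `base_delay`, and `max_delay` parameters and the jitter are choices made for this example.

```python
import asyncio
import random


async def retry(fn, *, fallback=None, attempts=4, base_delay=0.01, max_delay=1.0):
    """Hypothetical helper: await fn() with exponential back-off between failures."""
    last_error = None
    for attempt in range(attempts):
        try:
            return await fn()
        except Exception as exc:  # a real system would catch narrower error types
            last_error = exc
            if attempt < attempts - 1:
                # Delay doubles per attempt, is capped, and gets jitter so that
                # many clients do not retry in lockstep.
                delay = min(max_delay, base_delay * 2 ** attempt)
                await asyncio.sleep(delay * random.uniform(0.5, 1.0))
    if fallback is not None:
        return await fallback()
    raise last_error


calls = {"flaky": 0}


async def flaky():
    # Fails twice, then succeeds on the third call.
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


async def always_fails():
    raise ConnectionError("still down")


async def fallback_fn():
    return "fallback"


print(asyncio.run(retry(flaky)))
print(asyncio.run(retry(always_fails, fallback=fallback_fn, attempts=2)))
```

The fallback runs only after every attempt has failed; without one, the last exception is re-raised so the caller can decide what to do.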
Designed for Constrained Environments
- Raspberry Pi
- $5 VPS (512MB RAM)
- Old laptop
- Edge devices
- Offline / air-gapped systems
- Serverless (cold start friendly)
No GPU. No cloud lock-in. No heavy ML frameworks.
Just Python + asyncio + SQLite.
Project Structure
adaptive_runtime/
│
├── core/
│   ├── state_engine.py        # State persistence and memory
│   ├── context_engine.py      # Event → contextual classification
│   ├── confidence_engine.py   # Adaptive probabilistic confidence
│   ├── decision_engine.py     # Rule-based action selection
│   └── recovery_engine.py     # Crash recovery + retry orchestration
│
├── runtime/
│   ├── runtime_manager.py     # Main orchestrator (Runtime class)
│   ├── event_bus.py           # Async pub/sub event bus
│   └── cache.py               # TTL-based in-memory cache
│
├── storage/
│   ├── sqlite_store.py        # Async SQLite persistence
│   └── memory_store.py        # In-process ephemeral store (testing)
│
├── observability/
│   ├── logger.py              # Structured color logger
│   └── metrics.py             # Lightweight in-memory metrics
│
├── examples/
│   ├── agent_demo.py          # Basic event processing
│   ├── monitoring_demo.py     # Continuous monitoring + event bus
│   └── automation_demo.py     # Retry + crash recovery
│
└── tests/
    └── test_engines.py        # 12 unit tests, all engines
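The tree lists a TTL-based in-memory cache in `cache.py`. As a rough idea of what such a component does, here is a minimal sketch; `SketchTTLCache`, its method names, and the injectable `clock` are hypothetical and say nothing about the actual module's API.

```python
import time


class SketchTTLCache:
    """Hypothetical TTL cache: entries expire ttl_s seconds after being set."""

    def __init__(self, ttl_s=30.0, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}

    def set(self, key, value):
        self._items[key] = (self._clock() + self._ttl_s, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Lazy eviction: expired entries are dropped when they are read.
            del self._items[key]
            return default
        return value


fake_now = [0.0]
cache = SketchTTLCache(ttl_s=10.0, clock=lambda: fake_now[0])
cache.set("last_action", "restart_service")
fresh = cache.get("last_action")
fake_now[0] = 10.0
stale = cache.get("last_action")
print(fresh, stale)
# restart_service None
```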
Run the Examples
# Clone
git clone https://github.com/stateflow-dev/adaptive-runtime.git
cd adaptive-runtime
# Install
pip install pydantic aiosqlite
# Run demos
python examples/agent_demo.py
python examples/monitoring_demo.py
python examples/automation_demo.py
# Run tests
pip install pytest pytest-asyncio
pytest tests/ -v
# → 12 passed
Roadmap
| Feature | Status |
|---|---|
| 5 Core Engines | Tier 1 (released) |
| SQLite + Memory store | Tier 1 (released) |
| Async event bus | Tier 1 (released) |
| Retry + crash recovery | Tier 1 (released) |
| REST API adapter (FastAPI) | Tier 2 |
| Multi-agent orchestration | Tier 2 |
| Plugin system | Tier 2 |
| Real-time dashboard | Tier 2 |
| Distributed runtime | Tier 3 |
Benchmarks
Measured on a mid-range Windows laptop (Python 3.10, SQLite, no GPU).
| Metric | Result |
|---|---|
| Cold start | 446 ms |
| Idle memory | 29 MB |
| CPU idle usage | ~0% |
| SQLite save latency | 36.5 ms avg (n=50) |
| SQLite load latency | 2.7 ms avg (n=50) |
| Event processing | 109.2 ms avg (n=50) |
| GPU required | Never |
Runs comfortably on a $5 VPS (512MB RAM). No GPU. No cloud lock-in.
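Averages like "36.5 ms avg (n=50)" can be gathered with a small timing loop. The sketch below shows one way to do that with `time.perf_counter`; it is not the project's benchmark script, the `average_ms` helper is hypothetical, and it times an in-memory SQLite database, so its numbers will be far lower than the file-backed figures in the table.

```python
import json
import sqlite3
import statistics
import time


def average_ms(fn, n=50):
    """Mean wall-clock time of fn() over n runs, in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE state (id INTEGER PRIMARY KEY, body TEXT)")


def save():
    db.execute(
        "INSERT OR REPLACE INTO state (id, body) VALUES (1, ?)",
        (json.dumps({"health": "ok"}),),
    )
    db.commit()


def load():
    return db.execute("SELECT body FROM state WHERE id = 1").fetchone()


print(f"save: {average_ms(save):.3f} ms avg (n=50)")
print(f"load: {average_ms(load):.3f} ms avg (n=50)")
```

To compare against the table, point `sqlite3.connect` at a file on the target machine's disk, since commit latency is dominated by storage.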
Contributing
Issues and PRs welcome. Please open an issue first for major changes.
License
MIT © Stateflow Labs
"The biggest AI problems in production are not model problems.
They are runtime problems."