Adaptive Immune Runtime for Agent Systems — population-level failure prevention for AI agents

AIRAS — Adaptive Immune Runtime for Agent Systems

A runtime layer that observes failure patterns across populations of AI agent executions, learns abstract failure signatures, and preemptively prevents those failures from recurring.

The Problem

AI agents in production fail in predictable, repeating patterns:

5 chained agents at 90% reliability each yield about 59% system reliability (0.9^5 ≈ 0.59, multiplicative collapse)
88% of enterprise agent pilots never reach production; reliability is the #1 blocker
$340K average cost of a failed agent project
Root causes appear at step 3 but effects show at step 7, and no tool traces this causally

Current observability tools tell you what went wrong. Nothing prevents it from happening again.
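
The 59% figure above is plain multiplication. A minimal check, assuming every agent in the chain must succeed and that failures are independent (an assumption of this sketch; `system_reliability` is an illustrative name, not part of AIRAS):

```python
from math import prod


def system_reliability(step_reliabilities: list[float]) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return prod(step_reliabilities)


if __name__ == "__main__":
    chain = [0.90] * 5  # five agents, each 90% reliable
    print(f"{system_reliability(chain):.0%}")  # prints 59%
```

Adding a sixth 90% agent to the same chain drops the result to roughly 53%, which is why per-step reliability matters more as pipelines grow.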

The Solution

AIRAS learns from agent failures at the population level and prevents them preemptively:

Observe — Collect execution traces from all agent runs
Extract — Find where failed traces diverge from successful ones
Abstract — Generalize specific failures into matchable patterns
Intervene — Generate targeted prompt-level fixes for each pattern class
Test — Validate interventions via replay before deploying
Evolve — Continuously improve interventions via LLM-generated mutations
Prevent — Match new executions against known patterns BEFORE failure manifests
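
The Extract step compares a failed trace with a successful one to find where they part ways. The sketch below shows the simplest possible version, assuming a trace is just a list of tool names; `first_divergence` is a hypothetical helper, and the real extraction pipeline is presumably more elaborate than a prefix comparison.

```python
from typing import Optional, Sequence


def first_divergence(failed: Sequence[str], successful: Sequence[str]) -> Optional[int]:
    """Return the 0-based index of the first step where the traces differ.

    Returns None when both traces are identical.
    """
    for index, (failed_step, ok_step) in enumerate(zip(failed, successful)):
        if failed_step != ok_step:
            return index
    # One trace is a strict prefix of the other: they diverge where the shorter ends.
    if len(failed) != len(successful):
        return min(len(failed), len(successful))
    return None


if __name__ == "__main__":
    successful = ["search", "edit", "run_test", "submit"]
    failed = ["search", "edit", "submit"]  # submitted without running tests
    print(first_divergence(failed, successful))  # prints 2
```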

Validated Results

Tested on 250 real SWE-bench agent trajectories:

Metric	Result	Target
Prevention Rate	52%	≥50% ✅
False Positive Rate	0%	<5% ✅
Pattern Coverage	96%	—
Patterns Discovered	7	≥5 ✅
Matching Latency	15.5ms	<10ms (not met at 15.5ms; target expected with Qdrant)

v2 target: 85% prevention rate via LLM judge + self-improving interventions + cross-domain transfer.

How It Works

1. Agent executes step 7 of a task
2. SDK sends partial trace (steps 1-7) to AIRAS
3. AIRAS extracts behavioral signals (errors, loops, step ratio)
4. Searches vector index for matching failure patterns (<5ms)
5. Contextual bandit selects the best intervention variant for this context
6. SDK injects intervention into agent's next prompt
7. Agent avoids the error BEFORE making it
8. Outcome feeds back into efficacy tracking → interventions improve over time
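
Steps 3 and 4 can be sketched as a two-phase match: rank known patterns by embedding similarity, then confirm a candidate with a condition checked in code. This is a pure-Python illustration under stated assumptions: the `Pattern` class, the signal keys (`errors`, `loops`), and the 0.8 threshold are all hypothetical, and in the real system the similarity search is handled by Qdrant rather than a loop.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Pattern:
    name: str
    embedding: Sequence[float]
    predicate: Callable[[dict], bool]  # condition evaluated on behavioral signals


def match(signals: dict, embedding: Sequence[float],
          patterns: Sequence[Pattern], threshold: float = 0.8) -> Optional[Pattern]:
    # Phase 1: keep patterns that are similar enough, most similar first.
    scored = [(cosine(embedding, p.embedding), p) for p in patterns]
    candidates = sorted((sp for sp in scored if sp[0] >= threshold),
                        key=lambda sp: sp[0], reverse=True)
    # Phase 2: return the first candidate whose predicate holds for this trace.
    for _, pattern in candidates:
        if pattern.predicate(signals):
            return pattern
    return None


if __name__ == "__main__":
    patterns = [
        Pattern("edit_loop", [1.0, 0.0], lambda s: s["loops"] >= 3),
        Pattern("missing_test", [0.0, 1.0], lambda s: s["errors"] > 0),
    ]
    hit = match({"loops": 3, "errors": 0}, [0.9, 0.1], patterns)
    print(hit.name if hit else None)  # prints edit_loop
```

The predicate phase is what keeps a merely similar-looking trace from triggering an intervention it does not need.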

Architecture

┌─────────────┐     ┌──────────────────────────────────────┐     ┌──────────┐
│ Agent (SDK) │────▶│              AIRAS API               │────▶│ Qdrant   │
└─────────────┘     │                                      │     └──────────┘
                    │  /v1/check   → Matching Engine       │     ┌──────────┐
                    │  /v1/traces  → Queue for extraction  │────▶│ Redis    │
                    │  /v2/predict → Predictive Immunity   │     └──────────┘
                    └──────────────┬───────────────────────┘
                                   │
                    ┌──────────────▼───────────────────────┐
                    │           WORKERS (async)            │
                    │                                      │     ┌──────────┐
                    │  Extractor: traces → antigens        │────▶│ Postgres │
                    │  Judge: LLM evaluates hard cases     │     └──────────┘
                    │  Evolution: mutates interventions    │
                    └──────────────────────────────────────┘

Quick Start

Install

uv add airas-sdk

Or with pip:

pip install airas-sdk

For the full server stack (Qdrant, Redis, Postgres):

uv add "airas-sdk[server]"

Run the Killer Experiment (no infrastructure needed)

git clone https://github.com/yash1511-bogam/airas.git
cd airas
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[experiment]"
python -m airas.experiment.runner

Deploy Production System

docker compose up -d
python -m airas.scripts.seed_antigens
curl http://localhost:8100/health

SDK Integration (3 lines)

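from airas.sdk import AIRASClient, airas_middleware

# Point the client at a running AIRAS API (same port as the health check above)
client = AIRASClient(base_url="http://localhost:8100")
# Wrap your existing LangGraph application (my_langgraph_app is defined in your own code)
graph = airas_middleware(my_langgraph_app, client)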

API Endpoints

Endpoint	Method	Purpose	Latency
`/v1/check`	POST	Real-time immunity check	<50ms
`/v1/traces`	POST	Ingest completed trace	<100ms
`/v1/antigens`	GET	List discovered patterns	<200ms
`/v1/stats`	GET	Dashboard metrics	<200ms
`/v2/predict`	POST	Predict failures from task description	<100ms
`/v2/domains`	GET	List domain adapters	<50ms
`/v2/evolution/stats`	GET	Intervention evolution metrics	<100ms
`/health`	GET	Liveness	<10ms

Project Structure

airas/
├── docker-compose.yml
├── Dockerfile
├── pyproject.toml
├── README.md
└── airas/
    ├── models.py                       # Core Pydantic models (experiment)
    ├── models/
    │   ├── domain.py                   # Production domain models
    │   └── api.py                      # Request/response schemas
    ├── api/
    │   ├── main.py                     # FastAPI app + v1 routes
    │   └── routes_v2.py                # v2 routes (predict, domains, evolution)
    ├── core/
    │   ├── matching.py                 # Qdrant-backed real-time matching
    │   ├── tolerance.py                # Over-correction prevention
    │   ├── judge.py                    # Hybrid heuristic + LLM judge
    │   ├── bandit.py                   # Contextual Thompson Sampling
    │   └── predictor.py                # Task → failure class prediction
    ├── storage/
    │   ├── qdrant.py                   # Vector DB layer
    │   └── redis_store.py              # Cache + streams
    ├── extraction/                     # Pattern extraction pipeline
    │   ├── alignment.py                # Structural divergence detection
    │   ├── classifier.py               # Rule-based failure classification
    │   ├── abstraction.py              # Pattern abstraction
    │   └── clustering.py               # HDBSCAN clustering
    ├── intervention/
    │   └── templates.py                # Intervention template bank
    ├── replay/
    │   └── engine.py                   # Heuristic effectiveness judge
    ├── evolution/
    │   └── __init__.py                 # LLM mutation engine + evolution lifecycle
    ├── domains/
    │   ├── __init__.py                 # Universal adapter framework (4 domains)
    │   └── templates.py                # Domain-specific intervention templates
    ├── experiment/
    │   └── runner.py                   # Killer experiment orchestrator
    ├── sdk/
    │   ├── __init__.py                 # Package exports
    │   ├── client.py                   # Async HTTP client (graceful degradation)
    │   └── langgraph.py                # LangGraph middleware
    ├── workers/
    │   ├── main.py                     # Trace extraction worker
    │   └── evolution_worker.py         # Intervention mutation + promotion worker
    └── scripts/
        └── seed_antigens.py            # Load validated patterns into Qdrant

Tech Stack

Component	Technology	Why
API	FastAPI	Async, <50ms p99
Vector DB	Qdrant	HNSW <5ms at 100K vectors, payload filtering
State	PostgreSQL	Audit trail, efficacy tracking
Cache/Queue	Redis	Hot cache, stream-based async processing
Embeddings	all-MiniLM-L6-v2	384-dim, local, free, fast
Clustering	HDBSCAN	Finds natural clusters without specifying K
LLM (judge/mutations)	DeepSeek V4-Flash	$0.14/M input — entire learning loop costs $1.47/month
SDK	httpx (async)	Non-blocking, 500ms timeout, graceful degradation
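
The SDK row above promises a non-blocking call with a 500ms timeout and graceful degradation. Here is a minimal sketch of that behavior using only `asyncio`. `check_or_skip` and the injected `send` callable are hypothetical names for illustration, not the SDK's actual API; the real client is built on httpx.

```python
import asyncio
from typing import Awaitable, Callable, Optional

CHECK_TIMEOUT_SECONDS = 0.5  # the 500ms budget quoted in the table


async def check_or_skip(
    send: Callable[[dict], Awaitable[dict]],
    partial_trace: dict,
    timeout: float = CHECK_TIMEOUT_SECONDS,
) -> Optional[dict]:
    """Ask the service for an intervention; return None on timeout or any error.

    Returning None means "no intervention", so the agent simply continues.
    """
    try:
        return await asyncio.wait_for(send(partial_trace), timeout=timeout)
    except Exception:
        # Deliberately broad: a failing check must never break the agent run.
        return None


if __name__ == "__main__":
    async def unreachable(_trace: dict) -> dict:
        raise ConnectionError("AIRAS is down")

    print(asyncio.run(check_or_skip(unreachable, {"steps": []})))  # prints None
```

Passing the transport in as `send` keeps the sketch testable without a running server; a caller would treat `None` exactly like "no pattern matched".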

Key Design Decisions

No LLM calls in the hot path — real-time matching is deterministic (<50ms)
LLM only in async learning loop — judge + mutations run in background workers
Graceful degradation — if AIRAS is down, agents run normally (SDK never blocks)
Two-phase matching — embedding similarity (Qdrant) + condition predicates (code)
Behavioral signals only — runtime matching uses errors, loops, step anomalies (not outcome)
Contextual bandits — different intervention variants for different execution contexts
Self-improving — interventions evolve via LLM-generated mutations + A/B testing
Cross-domain — patterns learned in coding agents transfer to support/research/data agents
Tolerance layer — max 3 interventions per trace, cooldown, auto-disable low-efficacy
Population-level learning — every failure makes the system smarter for ALL users
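
To make the contextual-bandit decision concrete, here is a simplified Beta-Bernoulli Thompson Sampling sketch. It assumes contexts are discrete labels and keeps one posterior per (context, variant) pair; the class and method names are hypothetical, and the package's own bandit may model context differently.

```python
import random
from collections import defaultdict
from typing import Optional


class ThompsonBandit:
    """Beta-Bernoulli Thompson Sampling, one posterior per (context, variant)."""

    def __init__(self, variants: list, seed: Optional[int] = None) -> None:
        self.variants = list(variants)
        self.rng = random.Random(seed)
        # (context, variant) -> [successes, failures]
        self.counts = defaultdict(lambda: [0, 0])

    def select(self, context: str) -> str:
        """Sample each variant's posterior and pick the highest draw."""
        def draw(variant: str) -> float:
            successes, failures = self.counts[(context, variant)]
            # Beta(1, 1) prior, so unseen variants still get explored.
            return self.rng.betavariate(successes + 1, failures + 1)

        return max(self.variants, key=draw)

    def update(self, context: str, variant: str, prevented: bool) -> None:
        """Record whether the chosen intervention prevented the failure."""
        self.counts[(context, variant)][0 if prevented else 1] += 1


if __name__ == "__main__":
    bandit = ThompsonBandit(["terse_hint", "detailed_hint"], seed=0)
    for _ in range(50):
        bandit.update("coding", "terse_hint", prevented=True)
        bandit.update("coding", "detailed_hint", prevented=False)
    picks = [bandit.select("coding") for _ in range(100)]
    print(picks.count("terse_hint"))
```

Because selection samples from the posterior instead of taking the current best average, a variant with little data still gets tried occasionally, which is how a new mutation can earn promotion.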

Supported Domains

Domain	Adapter	Example Tools
Coding	`CodingAdapter`	edit, search, run_test, submit
Customer Support	`SupportAdapter`	search_kb, draft_reply, escalate, close_ticket
Research	`ResearchAdapter`	web_search, read_paper, synthesize, verify_claim
Data Pipeline	`DataPipelineAdapter`	query_db, transform, validate, write_table

Adding a new domain: implement DomainAdapter.map_action() (maps domain tools → 6 universal types). All existing patterns transfer immediately.
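
A sketch of what a new adapter might look like. Everything here is an assumption for illustration: the source only says there are six universal types, so the enum members are invented names, the `DomainAdapter` class below is a stand-in for the package's base class (whose real `map_action()` signature may differ), and the legal-review tools are hypothetical.

```python
from abc import ABC, abstractmethod
from enum import Enum


class UniversalAction(Enum):
    # Hypothetical names; the README does not list the six universal types.
    OBSERVE = "observe"
    SEARCH = "search"
    MODIFY = "modify"
    VERIFY = "verify"
    COMMUNICATE = "communicate"
    FINALIZE = "finalize"


class DomainAdapter(ABC):
    """Stand-in base class; not the package's actual definition."""

    @abstractmethod
    def map_action(self, tool_name: str) -> UniversalAction:
        """Translate a domain-specific tool name into a universal action type."""


class LegalReviewAdapter(DomainAdapter):
    """Example adapter for a hypothetical legal-review agent."""

    _TOOL_MAP = {
        "read_contract": UniversalAction.OBSERVE,
        "search_case_law": UniversalAction.SEARCH,
        "redline": UniversalAction.MODIFY,
        "check_citation": UniversalAction.VERIFY,
        "send_memo": UniversalAction.COMMUNICATE,
        "file_review": UniversalAction.FINALIZE,
    }

    def map_action(self, tool_name: str) -> UniversalAction:
        try:
            return self._TOOL_MAP[tool_name]
        except KeyError:
            raise ValueError(f"unmapped tool: {tool_name}") from None


if __name__ == "__main__":
    print(LegalReviewAdapter().map_action("redline").value)  # prints modify
```

Raising on unmapped tools, as opposed to guessing a type, keeps unknown actions from silently matching the wrong pattern.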

The Moat

The failure pattern database is the product. Every deployment adds patterns. More users → more patterns → better prevention → more users. The self-improvement loop means interventions get better every day without human effort.

License

MIT License

Release history

0.2.1 (May 11, 2026)
0.2.0 (May 11, 2026, this version)
