Agent Chaos Framework

A testing tool for exploring congestion and fault propagation in asynchronous queue-based systems.

Modern distributed backends and AI agent swarms frequently experience highly variable load spikes. This framework explores how single-host asynchronous architectures behave as workload generation approaches or exceeds queue processing capacity.

🧱 Architectural Topology

    [Workload Generator (Agents)]
            │
            ▼
    [Async Queue] <─── (Backpressure Feedback Loop)
            │
            ▼
    [Worker Pool]
            │
            ▼
    [Fault Injector]
            │
            ▼
    [Event Stream]
            │
            ▼
    [Telemetry Engine]
            │
            ▼
    [Collapse Detector] ─── (Triggers Throttle)
            │
            ▼
    [Final Experiment Report]
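The first three stages of the topology above can be sketched with plain asyncio. This is a minimal illustration, not the project's code: `generator`, `worker`, and `run_pipeline` are hypothetical names, the 1 ms service time is arbitrary, and a rejected `put_nowait` stands in for a pressure-driven fault.

```python
import asyncio


async def generator(queue: asyncio.Queue, n_tasks: int, dropped: list) -> None:
    """Hypothetical workload generator: offers n_tasks items to a bounded queue."""
    for task_id in range(n_tasks):
        try:
            queue.put_nowait(task_id)    # non-blocking: raises when the buffer is full
        except asyncio.QueueFull:
            dropped.append(task_id)      # stand-in for a pressure-driven fault
        await asyncio.sleep(0)           # yield to the event loop so workers can run


async def worker(queue: asyncio.Queue, done: list) -> None:
    """Hypothetical worker: drains the queue until it is cancelled."""
    while True:
        task_id = await queue.get()
        await asyncio.sleep(0.001)       # simulated service time (arbitrary)
        done.append(task_id)
        queue.task_done()


async def run_pipeline(n_tasks: int, n_workers: int, buffer_size: int):
    """Run generator -> bounded queue -> worker pool and return (done, dropped)."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=buffer_size)
    done: list = []
    dropped: list = []
    workers = [asyncio.create_task(worker(queue, done)) for _ in range(n_workers)]
    await generator(queue, n_tasks, dropped)
    await queue.join()                   # wait until every accepted task is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return done, dropped


if __name__ == "__main__":
    done, dropped = asyncio.run(run_pipeline(n_tasks=200, n_workers=2, buffer_size=10))
    print(f"completed={len(done)} dropped={len(dropped)}")
```

Every offered task ends up in exactly one of the two lists, so `len(done) + len(dropped)` always equals `n_tasks`. Shrinking `buffer_size` or `n_workers` shifts tasks from the first list to the second.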

📁 Repository Structure

  • core/: Manages event types and controls asynchronous execution loops via asyncio.
  • fault_model/: Houses pseudo-random fault injection routines simulating service degradation.
  • observability/: Processes streaming metrics to evaluate p95 latency anomalies and compute queue derivatives.
  • experiments/: Script runner that executes reproducible scenarios based on JSON profiles.
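As a rough illustration of how a JSON profile and a pseudo-random fault routine might fit together, the sketch below draws one fault decision per task from a seeded generator. The profile fields (`name`, `seed`, `fault_probability`, `tasks`) and the function `run_fault_schedule` are assumptions made for this example and do not describe the project's actual schema.

```python
import json
import random

# Hypothetical profile; the field names are illustrative, not the project's schema.
PROFILE_JSON = """
{
  "name": "example-profile",
  "seed": 42,
  "fault_probability": 0.2,
  "tasks": 1000
}
"""


def run_fault_schedule(profile: dict) -> list[bool]:
    """Return one fault/no-fault decision per task from a private seeded RNG."""
    rng = random.Random(profile["seed"])    # private generator, independent of global state
    p = profile["fault_probability"]
    return [rng.random() < p for _ in range(profile["tasks"])]


profile = json.loads(PROFILE_JSON)
first = run_fault_schedule(profile)
second = run_fault_schedule(profile)
print("identical schedules:", first == second)
print("injected faults:", sum(first))
```

Using a dedicated `random.Random(seed)` instance rather than the module-level functions keeps the fault schedule repeatable even if other code draws random numbers in between.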

📐 Conceptual Stability Model

The framework models queue growth with a standard fluid-queue approximation:

$$\frac{d(\text{Queue})}{dt} = \lambda(t) - \mu(t)$$

Where $\lambda(t)$ represents the incoming traffic arrival rate from independent agents, and $\mu(t)$ represents the worker processing rate.
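To make the equation concrete, the sketch below integrates it with a simple Euler step and clamps the depth to a bounded buffer. The function name `simulate_queue`, the step size, and the rates are illustrative assumptions and are not taken from the project.

```python
def simulate_queue(arrival_rate, service_rate, capacity, duration, dt=0.1):
    """Euler-integrate dQ/dt = lambda(t) - mu(t), clamped to [0, capacity]."""
    depth = 0.0
    trace = []
    steps = int(round(duration / dt))
    for i in range(steps):
        t = i * dt
        depth += (arrival_rate(t) - service_rate(t)) * dt
        depth = min(max(depth, 0.0), capacity)   # a queue cannot go negative or overflow
        trace.append((t + dt, depth))
    return trace


# Illustrative rates only: a 150 tasks/s burst between t=2s and t=4s on top of
# a 50 tasks/s baseline, served at a constant 100 tasks/s.
def lam(t):
    return 150.0 if 2.0 <= t < 4.0 else 50.0


def mu(t):
    return 100.0


trace = simulate_queue(lam, mu, capacity=200.0, duration=6.0)
peak = max(depth for _, depth in trace)
print(f"peak depth: {peak:.1f}, final depth: {trace[-1][1]:.1f}")
```

During the burst the net inflow is 50 tasks/s for 2 s, so the queue climbs to about 100 tasks and then drains once $\lambda(t)$ drops back below $\mu(t)$.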

The background detector flags an unstable state when the monitored parameters meet specific limits:

  1. The queue growth derivative $d(\text{Queue})/dt$ stays continuously positive as buffer saturation passes $85\%$.
  2. The rolling $p95$ latency grows by at least $1.0\times$ ($100\%$), i.e. doubles, within a single monitoring window.
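One possible reading of the two conditions is sketched below. The class `CollapseDetector`, the `Sample` record, and the window length are hypothetical, and the sketch interprets "continuously positive" as strictly increasing depth across the window. The text does not say whether the two conditions must hold together or separately, so they are reported independently here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    queue_depth: int       # items currently buffered
    p95_latency: float     # rolling p95 latency in seconds


class CollapseDetector:
    """Hypothetical detector evaluating both conditions over a sliding window."""

    def __init__(self, capacity, window=5, saturation=0.85, latency_growth=1.0):
        self.capacity = capacity
        self.saturation = saturation            # 85% buffer saturation
        self.latency_growth = latency_growth    # 1.0 means +100%, i.e. doubling
        self.samples = deque(maxlen=window)

    def observe(self, sample: Sample) -> None:
        self.samples.append(sample)

    def queue_condition(self) -> bool:
        """Depth strictly increasing across the window and latest saturation above the limit."""
        s = list(self.samples)
        if len(s) < 2:
            return False
        growing = all(b.queue_depth > a.queue_depth for a, b in zip(s, s[1:]))
        saturated = s[-1].queue_depth / self.capacity > self.saturation
        return growing and saturated

    def latency_condition(self) -> bool:
        """p95 latency grew by at least latency_growth (relative) within the window."""
        s = list(self.samples)
        if len(s) < 2 or s[0].p95_latency <= 0:
            return False
        growth = (s[-1].p95_latency - s[0].p95_latency) / s[0].p95_latency
        return growth >= self.latency_growth


detector = CollapseDetector(capacity=100, window=4)
for depth, p95 in [(70, 0.10), (80, 0.12), (88, 0.18), (95, 0.25)]:
    detector.observe(Sample(depth, p95))
print(detector.queue_condition(), detector.latency_condition())
```

Keeping the two checks as separate methods leaves the caller free to combine them with `and` or `or`, depending on how conservative the throttle trigger should be.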

📊 Recorded Validation Benchmarks

The platform was evaluated across three baseline profiles using fixed random seeds to ensure exact reproducibility across multiple runs.

| Scenario Profile | Simulated Agents | Base Workers | Bounded Buffer Size | Observed Avg Latency (s) | Observed p95 Latency (s) | Error Rate | Resolved System State |
|---|---|---|---|---|---|---|---|
| EXP-001 (Control) | 80 | 12 | 200 | 0.012 | 0.015 | 0.00% | STABLE |
| EXP-002 (Congestion) | 400 | 4 | 80 | 0.145 | 0.284 | 14.20% | DEGRADED |
| EXP-003 (Saturation) | 1000 | 2 | 50 | 0.412 | 0.892 | 42.65% | COLLAPSED |

Analysis Summary: The recorded runs suggest that degradation is non-linear. Once queue saturation passes a critical threshold, queuing latency and pressure-driven faults reinforce each other, and that feedback loop accelerates the slide from a degraded state into collapse.
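A quick way to read the three benchmark rows is to derive two ratios from them: agents per worker as a crude load proxy, and p95 over average latency as a measure of how far the tail has stretched. The sketch below only does arithmetic on the recorded values; the helper name `summarize` and both ratios are choices made for this example.

```python
# Rows copied from the benchmark table:
# (profile, agents, workers, avg latency s, p95 latency s, error rate %).
RUNS = [
    ("EXP-001", 80, 12, 0.012, 0.015, 0.00),
    ("EXP-002", 400, 4, 0.145, 0.284, 14.20),
    ("EXP-003", 1000, 2, 0.412, 0.892, 42.65),
]


def summarize(runs):
    """Derive agents-per-worker and p95/avg latency ratios for each run."""
    rows = []
    for name, agents, workers, avg, p95, error_pct in runs:
        rows.append({
            "profile": name,
            "agents_per_worker": agents / workers,
            "tail_ratio": p95 / avg,
            "error_pct": error_pct,
        })
    return rows


summary = summarize(RUNS)
for row in summary:
    print(f"{row['profile']}: {row['agents_per_worker']:.1f} agents/worker, "
          f"p95/avg = {row['tail_ratio']:.2f}, errors = {row['error_pct']:.2f}%")
```

Because the profiles change agent count, worker count, and buffer size at the same time, these ratios describe the runs but cannot attribute the degradation to any single parameter.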

🛡️ Assumptions & Limitations

  • Single-Host Context: Everything runs on a single host's event loop; the framework does not emulate network partitions, clock drift, or distributed consensus over RPC across independent hardware.
  • In-Memory Buffering: Tasks are passed in memory through an asyncio.Queue, so they never go through on-disk serialization or a network socket stack.
  • Workload Modeling: Clients are assumed to be cooperative and to generate tasks independently; adversarial network conditions are not modeled.
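The in-memory limitation is easy to see in a few lines: an object put on an `asyncio.Queue` comes out as the very same object, with no copy or encoding step in between. The function name `roundtrip` and the payload are illustrative only.

```python
import asyncio


async def roundtrip():
    """Put one payload on a bounded queue and take it straight back off."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=1)
    payload = {"task_id": 1, "body": [1, 2, 3]}
    await queue.put(payload)
    received = await queue.get()
    return payload, received


sent, received = asyncio.run(roundtrip())
print("same object:", received is sent)
```

Serialization cost, partial writes, and socket buffering therefore never show up in the measured latencies, which is worth keeping in mind when comparing results against a networked system.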

🚀 Future Work

  • OpenTelemetry Exporting: Integrating telemetry exporters to stream structured tracing events to external monitoring platforms.
  • Network Emulation: Introducing artificial network jitter, socket timeout drops, and connection pool exhaustion states.

📄 License

This project is licensed under the MIT License.
