
Runtime audit CLI for detecting asyncio degradation and threadpool saturation.

Async Runtime Auditor

A lightweight CI/CD runtime audit CLI for Python asyncio applications.

Standard APM metrics (like P99 HTTP latency) frequently hide severe async runtime degradation. Synchronous blocking calls and executor queue amplification can stall the event loop while external HTTP metrics still appear healthy.

Async Runtime Auditor is a heuristic-driven operational tool designed to run in staging environments and deployment pipelines. It queries a telemetry backend (such as Prometheus), evaluates runtime state against configurable thresholds, and fails the build before blocking code reaches production.


The Problem: Telemetry Asymmetry

In many production Python systems:

healthy HTTP latency != healthy runtime state

Standard infrastructure metrics frequently fail to detect:

  • synchronous I/O executed on the main async thread
  • threadpool saturation and executor queue wait times
  • asynchronous scheduler starvation
  • hidden queue amplification

This tool exposes these blind spots by evaluating event-loop and executor telemetry directly.
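To make the first blind spot concrete, here is a minimal sketch that is not part of the auditor itself. It measures event-loop lag while a coroutine makes a synchronous `time.sleep` call on the loop thread. The names `measure_loop_lag` and `blocking_handler` are illustrative assumptions, not APIs of this project.

```python
import asyncio
import time


async def measure_loop_lag(interval: float, samples: int) -> float:
    """Return the worst delay (seconds) observed beyond the requested sleep interval."""
    worst = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        await asyncio.sleep(interval)
        # Anything beyond `interval` is time the loop could not schedule us.
        lag = time.perf_counter() - start - interval
        worst = max(worst, lag)
    return worst


async def blocking_handler() -> None:
    """Simulate a handler that performs synchronous I/O on the event-loop thread."""
    await asyncio.sleep(0.05)  # yield once so the lag probe starts first
    time.sleep(0.3)            # blocks the whole loop; no other task can run


async def main() -> float:
    probe = asyncio.create_task(measure_loop_lag(interval=0.01, samples=20))
    await blocking_handler()
    return await probe


if __name__ == "__main__":
    print(f"worst event-loop lag: {asyncio.run(main()):.3f}s")
```

The probe asks for 10 ms sleeps but observes a stall of roughly the blocking call's duration. A request that happens to complete before or after the stall would still report a normal HTTP latency, which is the asymmetry described above.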


Installation

Requires Python 3.9+

pip install async-runtime-auditor

For local development, run the following from the repository root:

pip install -e .


Quick Start

1. Initialize Configuration

Generate a local metrics.yaml configuration file:

async-auditor --init

2. Run an Audit

Run the tool against a staging or local telemetry backend:

async-auditor --target http://localhost:9090

3. CI/CD Pipeline Gating

Run with strict exit semantics to fail the pipeline if critical degradation is detected:

async-auditor --target http://localhost:9090 --fail-on-critical

The CLI exits with code 1 if the runtime state is either:

  • CRITICAL
  • COLLAPSE_RISK

Configuration (metrics.yaml)

The auditor is intentionally decoupled from specific telemetry naming conventions.

Prometheus queries are mapped into the heuristic engine through metrics.yaml.

target: "http://localhost:9090"

queries:
  p99_latency: "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))"
  event_loop_lag: "max_over_time(asyncio_event_loop_lag_seconds[15m])"
  blocking_events_total: "blocking_events_total"
  blocking_duration_avg: "rate(blocking_duration_seconds_sum[15m]) / rate(blocking_duration_seconds_count[15m])"
  peak_active_requests: "max_over_time(active_requests[15m])"
  threadpool_queue_wait: "rate(threadpool_queue_wait_seconds_sum[15m]) / rate(threadpool_queue_wait_seconds_count[15m])"
  threadpool_tasks_started: "threadpool_tasks_started_total"
  threadpool_tasks_completed: "threadpool_tasks_completed_total"

thresholds:
  health_score_collapse: 2500
  health_score_critical: 1200
  health_score_degraded: 400
  max_event_loop_lag_s: 0.5
  max_blocking_duration_s: 1.0
  max_queue_wait_s: 1.0
  max_threadpool_backlog: 3
  max_active_requests: 150
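
Each entry under `queries` is a PromQL expression. As a rough sketch of how such an expression can be evaluated, the code below uses the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) with only the standard library. The helper names `build_query_url`, `first_sample_value`, and `run_query` are hypothetical, and this is not necessarily how the auditor performs its queries.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_query_url(target: str, promql: str) -> str:
    """Build an instant-query URL for a Prometheus-compatible backend."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{target.rstrip('/')}/api/v1/query?{query_string}"


def first_sample_value(payload: dict) -> Optional[float]:
    """Extract the first sample from an instant-query response, or None if empty."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] == "scalar":
        # Scalar results have the shape [timestamp, "value"].
        return float(data["result"][1])
    result = data["result"]
    if not result:
        return None  # the expression matched no series
    # Vector results carry [timestamp, "value"] under the "value" key.
    return float(result[0]["value"][1])


def run_query(target: str, promql: str, timeout: float = 10.0) -> Optional[float]:
    """Run one PromQL expression against `target` and return its first value."""
    with urllib.request.urlopen(build_query_url(target, promql), timeout=timeout) as resp:
        return first_sample_value(json.load(resp))
```

For example, `run_query("http://localhost:9090", "max_over_time(asyncio_event_loop_lag_seconds[15m])")` would return the value that the sample configuration compares against `max_event_loop_lag_s`, or `None` if the metric is not being exported.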

Output Formats

The CLI supports:

  • human-readable terminal output
  • structured JSON output for CI pipelines

Standard Text Output

async-auditor --output-format text

JSON Output

async-auditor --output-format json

A structured report is also written to audit_results.json in the current working directory.


CI/CD Integration Example

A reference GitHub Actions workflow is included in:

examples/github-actions-gating.yml

This demonstrates how to use the auditor as a deployment quality gate inside pull request pipelines.
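
For pipelines that are driven by a Python script rather than a workflow file, the same gate can be sketched with `subprocess`. Only the flags documented in this README are used; the helper names `build_command` and `run_gate` are hypothetical.

```python
import subprocess
from typing import List


def build_command(target: str, fail_on_critical: bool = True) -> List[str]:
    """Assemble the async-auditor invocation shown in the Quick Start section."""
    cmd = ["async-auditor", "--target", target]
    if fail_on_critical:
        cmd.append("--fail-on-critical")
    return cmd


def run_gate(target: str) -> int:
    """Run the audit and return its exit code so the caller can fail the build."""
    # check=False: a non-zero exit is an expected outcome here, not an exception.
    completed = subprocess.run(build_command(target), check=False)
    return completed.returncode
```

A deployment script could end with `sys.exit(run_gate("http://localhost:9090"))` so that the auditor's exit code becomes the exit code of the step.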


Operational Model

The auditor computes a deterministic runtime health score using:

  • event-loop lag
  • blocking duration
  • queue wait amplification
  • active request pressure
  • threadpool backlog behavior

The scoring model is intentionally threshold-driven and explainable.
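
A threshold-driven classification of this kind can be sketched as follows. The threshold values are copied from the sample metrics.yaml; the rest is assumption: the actual scoring formula is not documented here, higher scores are assumed to be worse, the state names `DEGRADED` and `HEALTHY` are inferred rather than confirmed, and the pairing of metric names with limits (including a derived `threadpool_backlog`) is a guess.

```python
from typing import Dict, List

# Values copied from the sample metrics.yaml in this README.
THRESHOLDS: Dict[str, float] = {
    "health_score_collapse": 2500,
    "health_score_critical": 1200,
    "health_score_degraded": 400,
    "max_event_loop_lag_s": 0.5,
    "max_blocking_duration_s": 1.0,
    "max_queue_wait_s": 1.0,
    "max_threadpool_backlog": 3,
    "max_active_requests": 150,
}

# Assumed pairing of metric names with their per-metric limits.
LIMIT_FOR_METRIC: Dict[str, str] = {
    "event_loop_lag": "max_event_loop_lag_s",
    "blocking_duration_avg": "max_blocking_duration_s",
    "threadpool_queue_wait": "max_queue_wait_s",
    "threadpool_backlog": "max_threadpool_backlog",  # assumed: started - completed
    "peak_active_requests": "max_active_requests",
}


def classify(score: float, thresholds: Dict[str, float]) -> str:
    """Map a health score to a state, checking the most severe band first."""
    if score >= thresholds["health_score_collapse"]:
        return "COLLAPSE_RISK"
    if score >= thresholds["health_score_critical"]:
        return "CRITICAL"
    if score >= thresholds["health_score_degraded"]:
        return "DEGRADED"  # assumed state name
    return "HEALTHY"       # assumed state name


def breaches(metrics: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Name every metric that exceeds its limit, so the verdict can be explained."""
    return [
        name
        for name, limit_key in LIMIT_FOR_METRIC.items()
        if name in metrics and metrics[name] > thresholds[limit_key]
    ]
```

Because both functions are pure comparisons against fixed numbers, the same inputs always yield the same state and the same list of breached limits, which is what makes the result reproducible across pipeline runs.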

This project does not use:

  • machine learning classification
  • anomaly detection systems
  • probabilistic runtime forecasting

Intended Usage

This tool is designed primarily for:

  • deployment pipeline gating
  • staging environment validation
  • runtime regression detection
  • async infrastructure diagnostics

It is not intended to replace full observability platforms.


Scope

This project is intentionally narrow in scope.

Included

  • heuristic-based async runtime failure classification
  • CI/CD exit-code semantics
  • configurable operational thresholds
  • Prometheus-compatible telemetry querying
  • runtime degradation detection

Not Included

  • distributed tracing infrastructure
  • observability data storage systems
  • automated remediation
  • Kubernetes orchestration
  • OpenTelemetry collectors
  • AI-driven diagnosis systems

Design Constraints

The project intentionally prioritizes:

  • operational clarity
  • deterministic output
  • explainable runtime scoring
  • low dependency overhead
  • simple deployment integration

over:

  • infrastructure breadth
  • platform extensibility
  • distributed orchestration
  • autonomous remediation

License

MIT License
