Agent Stability Engine: reproducible stability testing primitives for autonomous agents.

Project description

Agent Stability Engine (ASE)

Implementation of the week 1-12 ASE roadmap: core metrics, mutation stress testing, arbitration, contradiction analysis, taxonomy severity scoring, drift tracking, long-horizon stability, self-healing remediation, a benchmark runner, regression gating, release/demo packaging, and a CLI.

Quickstart

python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
python -m pytest
python -m ruff check .
python -m black --check .
python -m mypy src

CLI

export OPENAI_API_KEY="..."

python -m agent_stability_engine.cli evaluate --prompt "Explain checksums" --run-count 5 --seed 42 --asi-profile reasoning_focus --mutation-limit 6 --output out/eval.json --manifest-output out/eval.manifest.json
python -m agent_stability_engine.cli evaluate --agent-provider openai --agent-model gpt-4o-mini --prompt "Explain checksums" --run-count 3 --seed 42 --output out/eval-openai.json
python -m agent_stability_engine.cli benchmark --suite examples/benchmarks/default_suite.json --run-count 5 --seed 42 --asi-profile safety_strict --mutation-limit 6 --output out/bench.json --manifest-output out/bench.manifest.json
python -m agent_stability_engine.cli regress --suite examples/benchmarks/reasoning_suite.json --baseline examples/baselines/reasoning_suite.baseline.json --run-count 3 --seed 42 --output out/regress-reasoning.json
python -m agent_stability_engine.cli drift --current-report out/eval.json --baseline-report out/baseline_eval.json --output out/drift.json
python -m agent_stability_engine.cli horizon --prompt "Plan migration strategy" --horizon 6 --run-count 5 --seed 42 --output out/horizon.json
python -m agent_stability_engine.cli heal --prompt "Provide triage steps" --run-count 5 --seed 42 --max-attempts 2 --output out/heal.json --manifest-output out/heal.manifest.json
python -m agent_stability_engine.cli demo --output-dir out/demo --run-count 3 --seed 42 --horizon 4 --manifest-output out/demo.manifest.json
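When these commands are driven from a script rather than typed by hand, the argument list can be assembled in Python. The following is a minimal sketch that uses only the `evaluate` flags shown above; `build_evaluate_argv` is a hypothetical helper written for illustration, not part of the package.

```python
import shlex
import sys
from typing import List, Optional


def build_evaluate_argv(
    prompt: str,
    output: str,
    run_count: int = 5,
    seed: int = 42,
    asi_profile: Optional[str] = None,
    mutation_limit: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list for the `evaluate` subcommand (hypothetical helper)."""
    argv = [
        sys.executable, "-m", "agent_stability_engine.cli", "evaluate",
        "--prompt", prompt,
        "--run-count", str(run_count),
        "--seed", str(seed),
        "--output", output,
    ]
    # Optional flags are appended only when the caller supplies a value.
    if asi_profile is not None:
        argv += ["--asi-profile", asi_profile]
    if mutation_limit is not None:
        argv += ["--mutation-limit", str(mutation_limit)]
    return argv


if __name__ == "__main__":
    # Print the command line instead of running it, so this works without the package.
    print(shlex.join(build_evaluate_argv("Explain checksums", "out/eval.json")))
```

With the package installed, the returned list can be passed to `subprocess.run(argv, check=True)`. Building a list rather than a shell string avoids quoting problems in prompts that contain spaces or quotes.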

GitHub Action (Regression Gate)

Use the bundled action to gate PRs on benchmark ASI regressions.

name: ASE Regression Gate

on:
  pull_request:

jobs:
  ase-regress:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruthwikdovala/Stabilium@main
        with:
          suite: examples/benchmarks/reasoning_suite.json
          baseline: examples/baselines/reasoning_suite.baseline.json
          run-count: "3"
          seed: "42"

For OpenAI-backed runs, set the agent-provider and openai-api-key inputs on the action step:

      - uses: ruthwikdovala/Stabilium@main
        with:
          suite: examples/benchmarks/reasoning_suite.json
          baseline: examples/baselines/reasoning_suite.baseline.json
          agent-provider: openai
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}

Build, Publish, Install (Python 3.11 Standard)

Use one interpreter path for the whole release sequence.

python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build twine
python -m build
python -m twine upload dist/*

Install verification in a clean environment:

python3.11 -m venv .venv-install-check
source .venv-install-check/bin/activate
python -m pip install --upgrade pip
python -m pip install -U agent-stability-engine
ase --help
python -c "import agent_stability_engine; print(agent_stability_engine.__version__)"
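The same check can be made without importing the package, by reading the installed distribution metadata. This is a minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and the distribution name is taken from the `pip install` line above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("agent-stability-engine")
    if found is None:
        print("agent-stability-engine is not installed in this environment")
    else:
        print(f"agent-stability-engine {found}")
```

Reading metadata instead of importing keeps the check independent of import-time side effects, which can be useful in a release script.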

Release Docs

  • docs/BUILD_PUBLISH_INSTALL.md
  • docs/RELEASE_CHECKLIST.md
  • docs/DEMO_RUNBOOK.md


Download files

Source Distribution

agent_stability_engine-0.1.1.tar.gz (38.2 kB view details)

Uploaded Source

Built Distribution

agent_stability_engine-0.1.1-py3-none-any.whl (39.1 kB view details)

Uploaded Python 3

File details

Details for the file agent_stability_engine-0.1.1.tar.gz.

File metadata

  • Download URL: agent_stability_engine-0.1.1.tar.gz
  • Upload date:
  • Size: 38.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for agent_stability_engine-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        ec87b4d958cc1e87704afc98a9e00218befb1f90b482f86c987f5afd5cd51937
MD5           be1dfed7482419e23346cdd0789e3619
BLAKE2b-256   711adf444c5ccfe4c3d46d2f2820c4127f7e90ee93eb8ae981de60af57e7f3db
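A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch, assuming the sdist was saved to the current directory under its published file name; `sha256_hex` and `matches_published_sha256` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path: Path, expected_hex: str) -> bool:
    """Compare the file's SHA256 with a published digest, ignoring case and padding."""
    return sha256_hex(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Published SHA256 for the 0.1.1 sdist, copied from the hash listing above.
    expected = "ec87b4d958cc1e87704afc98a9e00218befb1f90b482f86c987f5afd5cd51937"
    sdist = Path("agent_stability_engine-0.1.1.tar.gz")  # assumed download location
    if sdist.exists():
        print("match" if matches_published_sha256(sdist, expected) else "MISMATCH")
    else:
        print(f"{sdist} not found; download it first")
```

The same function works for the wheel by substituting its file name and SHA256 digest.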

File details

Details for the file agent_stability_engine-0.1.1-py3-none-any.whl.

File hashes

Hashes for agent_stability_engine-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        8e0f02b5939fb1d92e54e323a7928bac99dc0f7837dddacae7a035cb3f105327
MD5           7e8afb1829433f327844cefde1fd21ed
BLAKE2b-256   559400821ce0d19d1fb9a70a3f70de4511f6b0ea25c22c16eb72194fc8c14123
