Agent Stability Engine: reproducible stability testing primitives for autonomous agents.
Agent Stability Engine (ASE)
This package contains the Week 1-12 implementation of ASE: core metrics, mutation stress testing, arbitration, contradiction analysis, taxonomy severity scoring, drift tracking, long-horizon stability, self-healing remediation, a benchmark runner, regression gating, release/demo packaging, and a CLI.
Quickstart
python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
python -m pytest
python -m ruff check .
python -m black --check .
python -m mypy src
CLI
export OPENAI_API_KEY="..."
python -m agent_stability_engine.cli evaluate --prompt "Explain checksums" --run-count 5 --seed 42 --asi-profile reasoning_focus --mutation-limit 6 --output out/eval.json --manifest-output out/eval.manifest.json
python -m agent_stability_engine.cli evaluate --agent-provider openai --agent-model gpt-4o-mini --prompt "Explain checksums" --run-count 3 --seed 42 --output out/eval-openai.json
python -m agent_stability_engine.cli benchmark --suite examples/benchmarks/default_suite.json --run-count 5 --seed 42 --asi-profile safety_strict --mutation-limit 6 --output out/bench.json --manifest-output out/bench.manifest.json
python -m agent_stability_engine.cli regress --suite examples/benchmarks/reasoning_suite.json --baseline examples/baselines/reasoning_suite.baseline.json --run-count 3 --seed 42 --output out/regress-reasoning.json
python -m agent_stability_engine.cli drift --current-report out/eval.json --baseline-report out/baseline_eval.json --output out/drift.json
python -m agent_stability_engine.cli horizon --prompt "Plan migration strategy" --horizon 6 --run-count 5 --seed 42 --output out/horizon.json
python -m agent_stability_engine.cli heal --prompt "Provide triage steps" --run-count 5 --seed 42 --max-attempts 2 --output out/heal.json --manifest-output out/heal.manifest.json
python -m agent_stability_engine.cli demo --output-dir out/demo --run-count 3 --seed 42 --horizon 4 --manifest-output out/demo.manifest.json
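The commands above can also be assembled from a script, which helps when the same evaluation runs with several seeds or prompts. The following is a minimal sketch, assuming only the `evaluate` flags shown above (`--prompt`, `--run-count`, `--seed`, `--output`). The helper name `build_evaluate_command` is hypothetical and not part of ASE; it only builds the argument list and does not run anything.

```python
import shlex
import sys
from pathlib import Path


def build_evaluate_command(
    prompt: str, output: Path, run_count: int = 5, seed: int = 42
) -> list[str]:
    """Build the argv for `agent_stability_engine.cli evaluate`.

    Hypothetical helper: it uses only flags that appear in the CLI examples.
    """
    return [
        sys.executable, "-m", "agent_stability_engine.cli", "evaluate",
        "--prompt", prompt,
        "--run-count", str(run_count),
        "--seed", str(seed),
        "--output", str(output),
    ]


if __name__ == "__main__":
    cmd = build_evaluate_command("Explain checksums", Path("out/eval.json"))
    # Print a shell-safe form of the command; to execute it, pass `cmd`
    # to subprocess.run(cmd, check=True) in an environment with ASE installed.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.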
GitHub Action (Regression Gate)
Use the bundled action to gate PRs on benchmark ASI regressions.
name: ASE Regression Gate
on:
  pull_request:
jobs:
  ase-regress:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruthwikdovala/Stabilium@main
        with:
          suite: examples/benchmarks/reasoning_suite.json
          baseline: examples/baselines/reasoning_suite.baseline.json
          run-count: "3"
          seed: "42"
For OpenAI-backed runs, add:
- uses: ruthwikdovala/Stabilium@main
  with:
    suite: examples/benchmarks/reasoning_suite.json
    baseline: examples/baselines/reasoning_suite.baseline.json
    agent-provider: openai
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}
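For readers new to regression gating, the idea is to compare a current score against a stored baseline and fail the job when the score drops too far. The sketch below illustrates that idea only; it is not the action's actual logic. The function name `passes_gate`, the `max_drop` tolerance, and the example scores are all assumptions made for illustration.

```python
def passes_gate(baseline_score: float, current_score: float, max_drop: float = 0.0) -> bool:
    """Return True when the score has not dropped more than `max_drop` below baseline.

    Hypothetical check: ASE's real gating rule and report fields may differ.
    """
    return (baseline_score - current_score) <= max_drop


def gate_exit_code(baseline_score: float, current_score: float, max_drop: float = 0.0) -> int:
    """Map the gate result to a process exit code (non-zero fails a CI step)."""
    return 0 if passes_gate(baseline_score, current_score, max_drop) else 1


if __name__ == "__main__":
    # Example values only; real scores would come from the regress report.
    raise SystemExit(gate_exit_code(baseline_score=0.90, current_score=0.85, max_drop=0.10))
```

A CI step fails when its process exits with a non-zero code, which is why the result is mapped to an exit code.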
Build, Publish, Install (Python 3.11 Standard)
Use one interpreter path for the whole release sequence.
python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build twine
python -m build
python -m twine upload dist/*
Install verification in a clean environment:
python3.11 -m venv .venv-install-check
source .venv-install-check/bin/activate
python -m pip install --upgrade pip
python -m pip install -U agent-stability-engine
ase --help
python -c "import agent_stability_engine; print(agent_stability_engine.__version__)"
Release Docs
- docs/BUILD_PUBLISH_INSTALL.md
- docs/RELEASE_CHECKLIST.md
- docs/DEMO_RUNBOOK.md
Download files
File details
Details for the file agent_stability_engine-0.1.1.tar.gz.
File metadata
- Download URL: agent_stability_engine-0.1.1.tar.gz
- Upload date:
- Size: 38.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ec87b4d958cc1e87704afc98a9e00218befb1f90b482f86c987f5afd5cd51937 |
| MD5 | be1dfed7482419e23346cdd0789e3619 |
| BLAKE2b-256 | 711adf444c5ccfe4c3d46d2f2820c4127f7e90ee93eb8ae981de60af57e7f3db |
File details
Details for the file agent_stability_engine-0.1.1-py3-none-any.whl.
File metadata
- Download URL: agent_stability_engine-0.1.1-py3-none-any.whl
- Upload date:
- Size: 39.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8e0f02b5939fb1d92e54e323a7928bac99dc0f7837dddacae7a035cb3f105327 |
| MD5 | 7e8afb1829433f327844cefde1fd21ed |
| BLAKE2b-256 | 559400821ce0d19d1fb9a70a3f70de4511f6b0ea25c22c16eb72194fc8c14123 |
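The published digests can be checked against a downloaded file before installing it. The following is a minimal sketch using the standard library's `hashlib`. The function names and the local file path are assumptions; the expected digest should be copied from the matching table above.

```python
import hashlib
from pathlib import Path


def file_digests(path: Path, chunk_size: int = 1 << 16) -> dict[str, str]:
    """Compute SHA256 and BLAKE2b-256 hex digests of a file, reading in chunks."""
    sha256 = hashlib.sha256()
    blake2b_256 = hashlib.blake2b(digest_size=32)  # 32 bytes = 256 bits
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            sha256.update(chunk)
            blake2b_256.update(chunk)
    return {"sha256": sha256.hexdigest(), "blake2b-256": blake2b_256.hexdigest()}


def matches_published(path: Path, expected_sha256: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return file_digests(path)["sha256"] == expected_sha256.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the sdist was saved.
    sdist = Path("agent_stability_engine-0.1.1.tar.gz")
    expected = "ec87b4d958cc1e87704afc98a9e00218befb1f90b482f86c987f5afd5cd51937"
    if sdist.exists():
        print("SHA256 match:", matches_published(sdist, expected))
    else:
        print(f"{sdist} not found; download it first")
```

SHA256 or BLAKE2b-256 is the better choice for integrity checks; MD5 is listed for completeness but is not collision resistant.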