Local-first Accelerated Prompt Stress Testing starter kit

These details have not been verified by PyPI

Project links

Project description

APST Starter Kit

APST stands for Accelerated Prompt Stress Testing. It is a depth-oriented LLM safety and reliability evaluation workflow that repeatedly samples the same prompts under controlled conditions to estimate empirical failure probability under repeated inference.

This v0.1 starter kit is local-first and conference-friendly. You can run the mock demo without an API key, then swap in OpenAI, Together.ai, or an OpenAI-compatible local endpoint when you are ready to test a real model.

Quickstart

Install from PyPI after the first release:

pip install apst-starter-kit
apst init my-apst-demo
cd my-apst-demo

apst run --config configs/demo_mock.yaml
apst report --results outputs/demo_results.csv --lang both

Or run from a Git checkout:

git clone <repo-url>
cd apst-starter-kit
pip install -e .

apst run --config configs/demo_mock.yaml
apst report --results outputs/demo_results.csv --lang both

The demo writes:

outputs/demo_results.csv
outputs/demo_results.json
outputs/demo_results_report_both.md

What The Demo Does

The mock run:

loads a small prompt set from data/prompts/demo_prompts.json
repeatedly samples each prompt at two temperatures
judges every response with the local rule judge mode
computes APST reliability and repeated-use risk metrics
exports CSV and JSON result files
generates English, Chinese, or bilingual Markdown reports

No API key or network access is required for configs/demo_mock.yaml.

Configuring Models

Use the mock provider for local demos:

models:
  - name: mock-apst-model
    model_id: mock-apst-model
    provider: mock
judge_mode: rule

Use OpenAI with an LLM judge:

export OPENAI_API_KEY=...
apst run --config configs/openai_example.yaml

Use an OpenAI-compatible local endpoint, such as Ollama or vLLM exposing /v1/chat/completions. APST uses the same client path for both: model, base_url, and a placeholder api_key. With a local base_url, prompts, outputs, and labels are sent only to the local server you configured.

models:
  - model: llama3.1
    base_url: http://localhost:11434/v1
    api_key: local-not-needed
judge_mode: rule

The OpenAI-compatible client is installed with the base package, so Ollama and vLLM only need the local server running:

apst run --config configs/ollama_local.yaml
apst run --config configs/vllm_local.yaml

For local LLM-as-judge, point judge_model at another OpenAI-compatible local server:

judge_mode: llm
judge_model:
  model: llama3.1
  base_url: http://localhost:11434/v1
  api_key: local-not-needed

See docs/local_model_servers.md for Ollama and vLLM server commands.

Together.ai models are also supported through the existing provider extra:

pip install -e ".[providers]"
export TOGETHER_API_KEY=...
apst run --config configs/openai_example.yaml --models meta-llama/Llama-3.3-70B-Instruct-Turbo

Judge Modes

rule: local deterministic checks for refusal, harmful operational guidance, crisis-support handling, and gibberish. Good for demos, local LLMs, and fast smoke tests.
heuristic: local malformed-output check only. Good when you want a very conservative no-network sanity pass.
llm: LLM-as-judge classification using judge_model. Good for richer audits and real model comparisons.

APST Metrics

Each prompt/model/temperature config reports:

empirical_failure_probability: observed failures divided by repeated samples
reliability: 1 - empirical_failure_probability
apst_risk_at_10: estimated chance of at least one failure across 10 independent attempts
failure_probability_ci_low and failure_probability_ci_high: Wilson interval bounds
failure_mode_distribution: counts by judge label

The repeated-use estimate is intentionally simple:

Risk@N = 1 - (1 - empirical_failure_probability)^N

Reports

apst report --results outputs/demo_results.csv --lang en
apst report --results outputs/demo_results.csv --lang zh
apst report --results outputs/demo_results.csv --lang both

Use --audit-contact to put a real intake link, email address, or conference note in the report. See docs/enterprise_audits.md for the enterprise audit handoff text.

Legacy Commands

The older llm-eval entry point and APST/AIRBench commands are still available:

llm-eval run-apst --config configs/smoke.yaml
llm-eval freeze-airbench --region us --per-l4 5 --output data/prompts/airbench_us_v1.json

Local Checks

pytest

Publishing

See docs/publishing.md for GitHub release and PyPI publishing steps. The package includes an apst init command so PyPI users can scaffold the runnable configs, prompt files, and docs without cloning the repository.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

May 15, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

apst_starter_kit-0.1.0.tar.gz (47.8 kB view details)

Uploaded May 15, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

apst_starter_kit-0.1.0-py3-none-any.whl (52.4 kB view details)

Uploaded May 15, 2026 Python 3

File details

Details for the file apst_starter_kit-0.1.0.tar.gz.

File metadata

Download URL: apst_starter_kit-0.1.0.tar.gz
Upload date: May 15, 2026
Size: 47.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for apst_starter_kit-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`804e1123622829df92bf4a7c45e92399ce3f37af6095e4b6d4d72ddd7a1a6e6d`
MD5	`8e38740a2b51f6bde230afc78be6d3a8`
BLAKE2b-256	`6f25eb13af02ba24b5a99f2f43d06dbeb0ef17cd8c49a4d55a0ef9872e12ba16`

See more details on using hashes here.

File details

Details for the file apst_starter_kit-0.1.0-py3-none-any.whl.

File metadata

Download URL: apst_starter_kit-0.1.0-py3-none-any.whl
Upload date: May 15, 2026
Size: 52.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for apst_starter_kit-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5a92241362c7b090f70bc73c1ae7719f80fea708b6957afa5b0a7a916c641ddf`
MD5	`768bffe064f96ec150aada622ef8ac90`
BLAKE2b-256	`1ab6adf855149aef383fd96c33493f32c4f581c40801d0b4dd4702d8e0fa977a`

See more details on using hashes here.

apst-starter-kit 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

APST Starter Kit

Quickstart

What The Demo Does

Configuring Models

Judge Modes

APST Metrics

Reports

Legacy Commands

Local Checks

Publishing

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes