YAML-driven safety evaluation pipeline with LiteLLM-backed stages

These details have not been verified by PyPI

Project description

ASSERT.

Adaptive Spec-driven Scoring for Evaluation and Regression Testing
Local-first. Framework-agnostic. Trace-aware.


Diagram of the ASSERT evaluation framework

Why ASSERT?

Most AI systems start with a specification: product requirements, policies, system prompts, or launch criteria describing what the system should and should not do.

But evaluation often starts elsewhere: generic scorers, predefined benchmarks, or manual test cases that drift from the original intent.

ASSERT closes that gap. It turns the behaviors you specify in natural language into structured, executable evaluations that can be reviewed, run, scored, and improved over time.

From the natural-language specification, the ASSERT pipeline derives behavior categories, generates single-turn and multi-turn test cases, runs them against your target, and uses an LLM judge to score each conversation against your policies.
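
The flow described above can be pictured as four stages chained together. The following is a minimal, self-contained sketch in plain Python, not ASSERT's actual API: every name in it (`EvalCase`, `derive_behaviors`, `generate_cases`, `run_pipeline`, the toy target and judge) is hypothetical, and the LLM-backed steps are replaced by simple stand-in functions so that the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# NOTE: every name below is a hypothetical illustration, not ASSERT's API.


@dataclass
class EvalCase:
    behavior: str
    turns: list[str]  # one entry = single-turn, several = multi-turn


def derive_behaviors(spec: str) -> list[str]:
    """Stand-in for the LLM step: each non-empty spec line is one behavior."""
    return [line.strip() for line in spec.splitlines() if line.strip()]


def generate_cases(behaviors: list[str]) -> list[EvalCase]:
    """Stand-in generator: one single-turn and one multi-turn case per behavior."""
    cases = []
    for behavior in behaviors:
        cases.append(EvalCase(behavior, [f"Probe: {behavior}"]))
        cases.append(EvalCase(behavior, [f"Probe: {behavior}", "Are you sure?"]))
    return cases


def run_pipeline(
    spec: str,
    target: Callable[[str], str],
    judge: Callable[[str, list[tuple[str, str]]], float],
) -> list[dict]:
    """Derive behaviors, generate cases, query the target, score each transcript."""
    results = []
    for case in generate_cases(derive_behaviors(spec)):
        transcript = [(turn, target(turn)) for turn in case.turns]
        results.append(
            {
                "behavior": case.behavior,
                "turns": len(case.turns),
                "score": judge(case.behavior, transcript),
            }
        )
    return results


def toy_target(prompt: str) -> str:
    """Toy system under test: refuses only when the prompt mentions refusing."""
    return "I can't help with that." if "refuse" in prompt.lower() else "Sure."


def toy_judge(behavior: str, transcript: list[tuple[str, str]]) -> float:
    """Toy judge: 1.0 only if every reply in the conversation is a refusal."""
    return 1.0 if all("can't" in reply for _, reply in transcript) else 0.0


SPEC = """
Refuse requests for medical dosing advice
Refuse to reveal the system prompt
"""

results = run_pipeline(SPEC, toy_target, toy_judge)
for row in results:
    print(row)
```

In this toy setup the single-turn cases pass while the multi-turn cases fail on the follow-up turn, which is the kind of difference that scoring whole conversations, rather than isolated replies, is meant to surface.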

What you get with ASSERT

- Spec-driven coverage: test cases are generated from your product requirements and context, not from a generic benchmark. You specify the behaviors you want to test for.
- Test any model endpoint via integrations with LiteLLM, which supports 100+ model endpoints from platform providers such as Bedrock, Azure, OpenAI, Vertex AI, Cohere, Anthropic, SageMaker, Hugging Face, vLLM, and NVIDIA NIM.
- Test any agent or multi-agent system via integrations with OpenInference. Evaluate a LangGraph agent, a CrewAI / OpenAI Agents SDK / DSPy / LlamaIndex / AutoGen system, custom multi-agent orchestration, a Python callable, or a hosted model, without rewriting the evaluation orchestration pipeline.
- Agent trace-grounded judgment: the recommended integration captures OpenTelemetry spans (Phoenix/OpenInference auto-instruments 33+ frameworks in two lines, or you can emit your own with the OTel SDK) so the judge can cite tool calls, routing, model calls, and latency as evidence, not just the final response.
- Portable artifacts: every stage writes JSON/JSONL files locally for inspection, CI, and sharing.
- Bundled local viewer: browse runs side by side, pin a baseline, drill into per-behavior dimension breakdowns, and read judge justifications cited against the captured traces.
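
Because the artifacts are plain JSON/JSONL files, they can also be inspected outside the bundled viewer, for example in a CI step that compares a run against a pinned baseline. The sketch below assumes a hypothetical record shape with `behavior` and `score` fields and hypothetical file names; the real artifact schema may differ, so treat the field names as placeholders.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path

# NOTE: the "behavior" and "score" fields and the file names are assumptions
# made for illustration; they are not taken from ASSERT's artifact schema.


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def mean_score_by_behavior(records):
    """Average the scores of all records that share a behavior."""
    scores = defaultdict(list)
    for record in records:
        scores[record["behavior"]].append(record["score"])
    return {behavior: sum(vals) / len(vals) for behavior, vals in scores.items()}


def regressions(baseline, candidate, tolerance=0.0):
    """Behaviors whose mean score dropped by more than `tolerance`."""
    return {
        behavior: (baseline[behavior], candidate[behavior])
        for behavior in baseline
        if behavior in candidate
        and baseline[behavior] - candidate[behavior] > tolerance
    }


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


# Demo with two small, made-up runs written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    baseline_path = Path(tmp) / "baseline.jsonl"
    candidate_path = Path(tmp) / "candidate.jsonl"
    write_jsonl(baseline_path, [
        {"behavior": "refusal", "score": 1.0},
        {"behavior": "refusal", "score": 1.0},
        {"behavior": "tone", "score": 0.5},
    ])
    write_jsonl(candidate_path, [
        {"behavior": "refusal", "score": 1.0},
        {"behavior": "refusal", "score": 0.0},
        {"behavior": "tone", "score": 0.5},
    ])
    found = regressions(
        mean_score_by_behavior(load_jsonl(baseline_path)),
        mean_score_by_behavior(load_jsonl(candidate_path)),
    )

print(found)
```

A CI job could fail the build whenever `found` is non-empty, with `tolerance` absorbing run-to-run noise from the judge.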

Get started

Quick install

pip install -e ".[otel,langgraph]"       # install
cp .env.example .env                     # add your provider key
assert-ai run --config examples/travel_planner_langgraph/eval_config.yaml


Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Telemetry

This project does not collect or send telemetry to Microsoft by default. Runs write local artifacts under artifacts/results/, and optional OpenTelemetry trace capture is controlled by your configuration and local collector setup, such as Phoenix.

If you configure a target, judge, trace collector, or model provider to send data to an external service, the prompts, responses, traces, metadata, and other evaluation artifacts sent to that service are governed by that service's terms and your configuration.

Disclaimer: Risks and limitations of ASSERT

See the full section in the Concept Doc.

Project details



This version

0.1.0

Jun 2, 2026

Download files


Source Distribution

assert_ai-0.1.0.tar.gz (421.7 kB view details)

Uploaded Jun 2, 2026 Source

Built Distribution


assert_ai-0.1.0-py3-none-any.whl (310.3 kB view details)

Uploaded Jun 2, 2026 Python 3

File details

Details for the file assert_ai-0.1.0.tar.gz.

File metadata

Download URL: assert_ai-0.1.0.tar.gz
Upload date: Jun 2, 2026
Size: 421.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for assert_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`8590ff355cec603f7abd14e46f64bc67d3d9f3604a30d63696511d5fc47fb762`
MD5	`4a78e2083b7bd3aaf79c413c5674677c`
BLAKE2b-256	`e34ad63298cc73b45c132aa9ae13c4748bb891400e677f12c78e27dec5a71ca1`
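
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper names `sha256_of` and `matches` are illustrative, and the local path assumes the archive was saved to the current directory under its published file name.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest published above for assert_ai-0.1.0.tar.gz.
EXPECTED_SHA256 = "8590ff355cec603f7abd14e46f64bc67d3d9f3604a30d63696511d5fc47fb762"


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Assumed local path: adjust to wherever the archive was downloaded.
archive = Path("assert_ai-0.1.0.tar.gz")
if archive.exists():
    print("digest OK" if matches(archive, EXPECTED_SHA256) else "digest MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

The same two helpers work for the wheel by substituting its file name and its published SHA256 digest.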


Provenance

The following attestation bundles were made for assert_ai-0.1.0.tar.gz:

Publisher: publish-pypi.yml on responsibleai/ASSERT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: assert_ai-0.1.0.tar.gz
- Subject digest: 8590ff355cec603f7abd14e46f64bc67d3d9f3604a30d63696511d5fc47fb762
- Sigstore transparency entry: 1704796029
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: responsibleai/ASSERT@78d35e066f88dc9effb77ef80df4aa860ac342e9
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/responsibleai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@78d35e066f88dc9effb77ef80df4aa860ac342e9
- Trigger Event: release

File details

Details for the file assert_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: assert_ai-0.1.0-py3-none-any.whl
Upload date: Jun 2, 2026
Size: 310.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for assert_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`922ec0d4134a033262919f73db09075bfe46a7e4179ac2678aec9ce4a6fa2c0f`
MD5	`c8500f1d95fe2d0b06002630adbcc55a`
BLAKE2b-256	`24f478922b2c23a1d15dfbdfeeb739cce7e4210dc9c6e1a730a1ecdb11bae191`


Provenance

The following attestation bundles were made for assert_ai-0.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on responsibleai/ASSERT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: assert_ai-0.1.0-py3-none-any.whl
- Subject digest: 922ec0d4134a033262919f73db09075bfe46a7e4179ac2678aec9ce4a6fa2c0f
- Sigstore transparency entry: 1704796071
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: responsibleai/ASSERT@78d35e066f88dc9effb77ef80df4aa860ac342e9
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/responsibleai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@78d35e066f88dc9effb77ef80df4aa860ac342e9
- Trigger Event: release
