High-performance evaluation framework for LLM agents
Assay Python SDK
Record deterministic traces from your Python agents for regression gating.
🚀 Golden Quickstart
The fastest way to regression test your AI agent.
1. Installation
pip install assay-it
2. Record (record.py)
Run your agent through the SDK to capture a trace. Pass your tool functions to tool_executors so Assay can record their inputs and outputs.
import os
import openai
from assay_sdk import TraceWriter, record_chat_completions_with_tools
# 1. Setup Client & Tools
client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "mock"))
TOOLS = [{
    "type": "function",
    "function": {
        "name": "GetWeather",
        "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}
    }
}]
# 2. Define Execution Logic (The "Real" Code)
def get_weather(args):
    return {"temp": 22, "location": args.get("location")}
# 3. Record the Loop
writer = TraceWriter("traces/quickstart.jsonl")
result = record_chat_completions_with_tools(
    writer=writer,
    client=client,
    model="gpt-4o",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=TOOLS,
    tool_executors={"GetWeather": get_weather},  # Link schema -> function
    episode_id="weather_demo",
    test_id="weather_check",
)
print(f"Agent Final Answer: {result['content']}")
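To illustrate what linking a schema name to a function means, the following is a minimal sketch of a tool-call dispatch step. It is not the SDK's implementation: the helper name dispatch_tool_call and the shape of the recorded entry are assumptions made for illustration only.

```python
import json


def get_weather(args):
    return {"temp": 22, "location": args.get("location")}


def dispatch_tool_call(name, raw_arguments, tool_executors, recorded):
    """Look up the executor by schema name, run it, and record the exchange.

    Hypothetical helper: the real SDK may structure this differently.
    """
    if name not in tool_executors:
        raise KeyError(f"No executor registered for tool {name!r}")
    args = json.loads(raw_arguments)  # the model sends arguments as a JSON string
    output = tool_executors[name](args)
    recorded.append({"tool": name, "args": args, "output": output})
    return output


recorded = []
output = dispatch_tool_call(
    "GetWeather",
    '{"location": "Tokyo"}',
    {"GetWeather": get_weather},
    recorded,
)
```

The key in tool_executors must match the "name" field in the tool schema exactly; a mismatch here would surface as a missing executor rather than a silent skip.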
3. Configure (assay.yaml)
Tell Assay what to check.
version: 1
model: "trace"
tests:
  - id: weather_check
    input:
      prompt: "Weather in Tokyo?" # Matches the recorded prompt
    expected:
      type: regex_match
      pattern: ".*" # Pass if any content returned (baseline check)
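Once the baseline passes, you may want a tighter pattern. The sketch below tries a candidate pattern against sample answers with Python's re module before it goes into assay.yaml. How Assay applies the pattern (search versus full match, flags) is not stated here, so the use of re.search is an assumption, and the stricter pattern is a hypothetical example.

```python
import re

BASELINE = r".*"
# Hypothetical stricter check: mentions Tokyo, followed later by a number.
STRICTER = r"(?i)tokyo.*\b\d+\b"

sample_answer = "The weather in Tokyo is 22 degrees."
unrelated_answer = "I cannot help with that."

baseline_ok = re.search(BASELINE, sample_answer) is not None
stricter_ok = re.search(STRICTER, sample_answer) is not None
unrelated_ok = re.search(STRICTER, unrelated_answer) is not None
```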
4. Verify
Run the regression gate. This replays your trace against the recorded tool outputs to ensure determinism.
# Verify strictly (fails if any tool call arg changed even slightly)
assay ci --config assay.yaml --trace-file traces/quickstart.jsonl --replay-strict --db :memory:
🌊 Advanced: Streaming support
Capture streaming responses while maintaining tool call execution.
from assay_sdk import record_chat_completions_stream_with_tools
# ... setup client & writer ...
result = record_chat_completions_stream_with_tools(
    writer=writer,
    # ... args ...
    stream=True,  # SDK handles chunk aggregation automatically
    # tool_executors={...},  # Required if tools are used
)
Note: The hybrid wrapper (record_chat_completions_stream_with_tools) streams the thinking tokens to the user, executes tools, and then performs a standard follow-up call.
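To make "chunk aggregation" concrete, here is a minimal, self-contained sketch of how streamed tool-call fragments can be merged into complete calls. It is not the SDK's implementation: the function name aggregate_tool_call_deltas and the simplified plain-dict chunk shape (loosely modeled on streaming deltas) are assumptions for illustration.

```python
import json


def aggregate_tool_call_deltas(deltas):
    """Merge streamed tool-call fragments into complete tool calls.

    Each delta is a plain dict with an "index" and optional "id", "name",
    and "arguments" fragments (an assumed, simplified chunk shape).
    """
    calls = {}
    for delta in deltas:
        call = calls.setdefault(
            delta["index"], {"id": None, "name": "", "arguments": ""}
        )
        if delta.get("id"):
            call["id"] = delta["id"]
        # Name and argument text arrive in pieces; concatenate them in order.
        call["name"] += delta.get("name", "")
        call["arguments"] += delta.get("arguments", "")

    # Parse the JSON argument string only after all fragments are joined.
    result = []
    for index in sorted(calls):
        call = calls[index]
        result.append({
            "id": call["id"],
            "name": call["name"],
            "arguments": json.loads(call["arguments"] or "{}"),
        })
    return result


example_deltas = [
    {"index": 0, "id": "call_1", "name": "GetWeather", "arguments": ""},
    {"index": 0, "arguments": '{"locat'},
    {"index": 0, "arguments": 'ion": "Tokyo"}'},
]
tool_calls = aggregate_tool_call_deltas(example_deltas)
```

Parsing is deferred until the stream ends because an individual fragment is rarely valid JSON on its own.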
🛡️ Advanced: Privacy & Redaction
Protect sensitive data (PII, API keys) from ever hitting the trace file.
from assay_sdk import TraceWriter, make_redactor
# Create a redactor that scrubs keys and regex patterns
redactor = make_redactor(
    key_denylist={"authorization", "password", "api_key"},
    patterns=[r"sk-[a-zA-Z0-9]{20,}"],  # Mask OpenAI keys
)
# Attach to writer - happens automatically on write
writer = TraceWriter("traces/secure.jsonl", redact_fn=redactor)
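For intuition about what such a redactor does, the sketch below implements key-based and pattern-based scrubbing in plain Python. It is not how make_redactor is actually implemented: the function name simple_redactor, the "[REDACTED]" placeholder, and the case-insensitive key matching are all assumptions.

```python
import re


def simple_redactor(key_denylist, patterns, mask="[REDACTED]"):
    """Return a function that scrubs a nested dict/list structure.

    Hypothetical stand-in for illustration; it returns a scrubbed copy
    and leaves the input untouched.
    """
    denied = {key.lower() for key in key_denylist}
    compiled = [re.compile(pattern) for pattern in patterns]

    def redact(value):
        if isinstance(value, dict):
            # Mask whole values under denied keys; recurse into the rest.
            return {
                key: mask if str(key).lower() in denied else redact(item)
                for key, item in value.items()
            }
        if isinstance(value, list):
            return [redact(item) for item in value]
        if isinstance(value, str):
            for pattern in compiled:
                value = pattern.sub(mask, value)
        return value

    return redact


redact = simple_redactor(
    key_denylist={"authorization", "password", "api_key"},
    patterns=[r"sk-[a-zA-Z0-9]{20,}"],
)
event = {
    "headers": {"Authorization": "Bearer abc"},
    "messages": [{"content": "my key is sk-" + "a" * 24}],
}
clean = redact(event)
```

Key-based masking catches secrets stored under known field names, while the regex pass catches secrets that leak into free text such as prompts or tool outputs.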
⚡ Async Support
Native async support for high-throughput applications (FastAPI, etc.) is available via the assay_sdk.async_openai submodule. It provides full parity with the sync API, including loop and streaming support.
❓ Troubleshooting
E_TRACE_EPISODE_MISSING
Cause: The test_id or episode_id in your trace doesn't match what assay ci expects from its config (or its implicit default).
Fix: Ensure your assay.yaml test IDs match the test_id passed to record_chat_completions....
"Duplicate prompt in strict replay"
Cause: You ran record.py twice without cleaning the trace file, so it contains two identical episodes. assay ci in strict mode doesn't know which one to replay.
Fix:
- Truncate the file before recording: trace_path = "traces/my_trace.jsonl"; open(trace_path, 'w').close()
- Use unique episode_ids (e.g. UUIDs) for every run.
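Both fixes can be combined in a small helper at the top of a recording script. This is a sketch: the helper names fresh_trace and unique_episode_id are hypothetical, and the demo writes to a temporary directory rather than a real traces/ folder.

```python
import tempfile
import uuid
from pathlib import Path


def fresh_trace(path):
    """Create parent directories and truncate the trace file."""
    trace_path = Path(path)
    trace_path.parent.mkdir(parents=True, exist_ok=True)
    trace_path.write_text("")  # empties the file, creating it if needed
    return trace_path


def unique_episode_id(prefix):
    """Build an episode ID that differs on every run."""
    return f"{prefix}-{uuid.uuid4()}"


# Demo: a stale trace file is emptied before the next recording.
with tempfile.TemporaryDirectory() as tmp:
    trace_path = fresh_trace(Path(tmp) / "traces" / "my_trace.jsonl")
    trace_path.write_text('{"stale": true}\n')
    fresh_trace(trace_path)
    size_after = trace_path.stat().st_size

episode_id = unique_episode_id("weather_demo")
```

In a real script you would pass the returned path to TraceWriter and the generated ID as episode_id; either fix alone is enough to avoid duplicate episodes.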
File details
Details for the file assay_it-0.9.0.tar.gz.
File metadata
- Download URL: assay_it-0.9.0.tar.gz
- Upload date:
- Size: 28.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 879b0792f0530ca9f36517410d4312e6f708be9571d2521f80b58058ccb31c81 |
| MD5 | 933698001b71b1b0d8ec8c491791336c |
| BLAKE2b-256 | cd637147c0d996e473506206f770eef0705abac9ee888fed79f1e1fc09fc32df |
Provenance
The following attestation bundles were made for assay_it-0.9.0.tar.gz:
Publisher: publish.yml on Rul1an/assay
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: assay_it-0.9.0.tar.gz
- Subject digest: 879b0792f0530ca9f36517410d4312e6f708be9571d2521f80b58058ccb31c81
- Sigstore transparency entry: 780555698
- Sigstore integration time:
- Permalink: Rul1an/assay@2b907841256eedfd786479c926ae5d2e687cdb51
- Branch / Tag: refs/tags/v0.9.0
- Owner: https://github.com/Rul1an
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2b907841256eedfd786479c926ae5d2e687cdb51
- Trigger Event: push
File details
Details for the file assay_it-0.9.0-py3-none-any.whl.
File metadata
- Download URL: assay_it-0.9.0-py3-none-any.whl
- Upload date:
- Size: 38.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c4cf96909b0a20f0f15c6149708b9c6e69109564ab9024c1c258ab3c2c8bbb85 |
| MD5 | f303e47a203839fa8006433102f6aecc |
| BLAKE2b-256 | 8877948187f403d509d248c54deeef32923a389bc3265a61e426f60e8fba5fff |
Provenance
The following attestation bundles were made for assay_it-0.9.0-py3-none-any.whl:
Publisher: publish.yml on Rul1an/assay
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: assay_it-0.9.0-py3-none-any.whl
- Subject digest: c4cf96909b0a20f0f15c6149708b9c6e69109564ab9024c1c258ab3c2c8bbb85
- Sigstore transparency entry: 780555699
- Sigstore integration time:
- Permalink: Rul1an/assay@2b907841256eedfd786479c926ae5d2e687cdb51
- Branch / Tag: refs/tags/v0.9.0
- Owner: https://github.com/Rul1an
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2b907841256eedfd786479c926ae5d2e687cdb51
- Trigger Event: push