Testbench to evaluate agents using Ragas
Testbench
Kubernetes-native agent evaluation system that executes test datasets via the A2A protocol, scores responses with pluggable metrics (RAGAS by default), and publishes scores via OpenTelemetry.
📖 Documentation: https://docs.agentic-layer.ai/testbench/
Run standalone
To evaluate an agent without deploying to Kubernetes or Testkube:
pip install agentic-layer-testbench
testworkflow config.yaml
See config.example.yaml for the available configuration options.
Development
Prerequisites
- Python
- uv
- Tilt and a local Kubernetes cluster (e.g. kind)
- Testkube CLI
- GOOGLE_API_KEY for LLM-as-a-judge evaluation via Gemini
Build and run locally
# Install Python dependencies
uv sync
# Provide the LLM-as-a-judge API key
echo "GOOGLE_API_KEY=<key>" > .env
# Start the local stack (AI gateway, OTLP collector, sample agents, Testkube)
tilt up
Test
uv run poe ruff # format and lint
uv run poe mypy # static type checking
uv run poe bandit # security scanning
uv run poe test # unit tests
uv run poe check # all of the above
uv run poe test_e2e # E2E tests (requires `tilt up`)
The E2E tests target the Tilt environment by default. To point them elsewhere, override E2E_DATASET_URL, E2E_AGENT_URL, and E2E_MODEL as needed.
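The override behaviour can be pictured with a minimal sketch. It assumes the three names are environment variables and that an unset or empty variable falls back to a default. The function `resolve_e2e_settings` and the placeholder values are hypothetical and are not taken from the project; the real defaults live in the E2E test setup.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical placeholder defaults, for illustration only. The actual
# Tilt-environment defaults are defined by the project, not here.
PLACEHOLDER_DEFAULTS: Dict[str, str] = {
    "E2E_DATASET_URL": "<tilt-default-dataset-url>",
    "E2E_AGENT_URL": "<tilt-default-agent-url>",
    "E2E_MODEL": "<tilt-default-model>",
}


def resolve_e2e_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return one value per setting, preferring the environment over defaults."""
    source = os.environ if env is None else env
    resolved = {}
    for name, default in PLACEHOLDER_DEFAULTS.items():
        value = source.get(name, "")
        # Treat an empty string the same as "not set".
        resolved[name] = value if value else default
    return resolved


if __name__ == "__main__":
    for key, value in resolve_e2e_settings().items():
        print(f"{key}={value}")
```

Under this assumption, setting only `E2E_MODEL` in the shell before running `uv run poe test_e2e` would change the model while the dataset and agent URLs keep their defaults.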
Verify the local deploy
Run the example workflow against the sample weather agent:
kubectl testkube run tw example-workflow --watch
The full walkthrough — defining experiments, configuring metrics, viewing reports — is in the first-workflow how-to.
Contributing
See the Contribution Guide.
File details
Details for the file agentic_layer_testbench-0.9.2.tar.gz.
File metadata
- Download URL: agentic_layer_testbench-0.9.2.tar.gz
- Upload date:
- Size: 38.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b4bb33582106f13801baa6b3e9c1939d2d91c7d86ee0afd21dcdc8cfab350864 |
| MD5 | dc31f9ea1a7f7edef772776e01969085 |
| BLAKE2b-256 | 9a63c9a01d8d38fbfc6051db6810686b4be8b590149e3dbc881c07f3a578d78c |
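A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. The following is a minimal sketch; the helper names `sha256_of` and `matches_published_digest` are illustrative and are not part of the package.

```python
import hashlib
import sys


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python verify.py <downloaded-file> <published-sha256>
    ok = matches_published_digest(sys.argv[1], sys.argv[2])
    print("digest matches" if ok else "digest MISMATCH")
    sys.exit(0 if ok else 1)
```

Run it with the path of the downloaded archive and the digest copied from the table. A mismatch means the file differs from the one that was published.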
Provenance
The following attestation bundles were made for agentic_layer_testbench-0.9.2.tar.gz:
Publisher: publish.yml on agentic-layer/testbench
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentic_layer_testbench-0.9.2.tar.gz
- Subject digest: b4bb33582106f13801baa6b3e9c1939d2d91c7d86ee0afd21dcdc8cfab350864
- Sigstore transparency entry: 1579437407
- Sigstore integration time:
- Permalink: agentic-layer/testbench@ed0f526e76e192817747ca5f9bb0c0ba87b18bb4
- Branch / Tag: refs/tags/v0.9.2
- Owner: https://github.com/agentic-layer
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ed0f526e76e192817747ca5f9bb0c0ba87b18bb4
- Trigger Event: push
File details
Details for the file agentic_layer_testbench-0.9.2-py3-none-any.whl.
File metadata
- Download URL: agentic_layer_testbench-0.9.2-py3-none-any.whl
- Upload date:
- Size: 49.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2566503ab1bafc1f90e2782e11a73c91b8a6344a5a2e2f0be8388ffce393ff6a |
| MD5 | c9246a89879377a79729e4b9a809ca88 |
| BLAKE2b-256 | 3d67132f90f25dff91749f720f2c81e3168b58a7ebd425eca9e9dac1be5e5892 |
Provenance
The following attestation bundles were made for agentic_layer_testbench-0.9.2-py3-none-any.whl:
Publisher: publish.yml on agentic-layer/testbench
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentic_layer_testbench-0.9.2-py3-none-any.whl
- Subject digest: 2566503ab1bafc1f90e2782e11a73c91b8a6344a5a2e2f0be8388ffce393ff6a
- Sigstore transparency entry: 1579437541
- Sigstore integration time:
- Permalink: agentic-layer/testbench@ed0f526e76e192817747ca5f9bb0c0ba87b18bb4
- Branch / Tag: refs/tags/v0.9.2
- Owner: https://github.com/agentic-layer
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ed0f526e76e192817747ca5f9bb0c0ba87b18bb4
- Trigger Event: push