Testbench to evaluate agents using Ragas
Testbench
Kubernetes-native agent evaluation system that executes test datasets via A2A protocol, evaluates responses using pluggable metrics (RAGAS by default), and publishes scores via OpenTelemetry.
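As a rough mental model, the execute and evaluate stages can be sketched in plain Python. Everything in this sketch is hypothetical: `Sample`, `Metric`, `ExactMatch`, `run_pipeline`, and `stub_agent` are illustrative names, not the testbench API. A real run sends questions to the agent over A2A, scores with RAGAS metrics, and publishes via OpenTelemetry instead of returning a dict.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Iterable, Protocol


@dataclass
class Sample:
    """One dataset row: a question and the expected answer (hypothetical shape)."""
    question: str
    reference: str


class Metric(Protocol):
    """Pluggable metric interface assumed for this sketch."""
    name: str

    def score(self, response: str, reference: str) -> float: ...


@dataclass
class ExactMatch:
    """Toy metric standing in for an LLM-as-a-judge metric."""
    name: str = "exact_match"

    def score(self, response: str, reference: str) -> float:
        return 1.0 if response.strip().lower() == reference.strip().lower() else 0.0


def run_pipeline(
    samples: list[Sample],
    ask_agent: Callable[[str], str],
    metrics: Iterable[Metric],
) -> dict[str, float]:
    # Stage 1: execute the dataset against the agent.
    responses = [ask_agent(sample.question) for sample in samples]
    # Stage 2: average each metric over all samples.
    return {
        metric.name: mean(
            metric.score(response, sample.reference)
            for response, sample in zip(responses, samples)
        )
        for metric in metrics
    }


def stub_agent(question: str) -> str:
    # Stand-in for a remote agent call.
    return "sunny" if "weather" in question.lower() else "unknown"


samples = [
    Sample("What is the weather in Berlin?", "Sunny"),
    Sample("Who won the match?", "Alice"),
]
scores = run_pipeline(samples, stub_agent, [ExactMatch()])
print(scores)
```

Keeping the metric behind a small interface is what makes the scoring step swappable; the publishing stage is omitted because its configuration is not described here.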
Documentation
Full documentation is available at docs.agentic-layer.ai.
Install
Run the testbench standalone (without Kubernetes / Testkube) on any system:
pip install agentic-layer-testbench
testbench config.yaml
See config.example.yaml for the available configuration options.
Prerequisites
- Python 3.12+ and uv
- Kubernetes cluster (e.g. kind) with Tilt
- Testkube CLI
- GOOGLE_API_KEY for LLM-as-a-judge evaluation via Gemini models
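The list above can be checked before starting with a small script. This is a minimal sketch, not part of the testbench: `missing_prerequisites` is a hypothetical helper, the executable names (`uv`, `kubectl`, `tilt`, and either `testkube` or `kubectl-testkube`) are assumptions about how the tools are installed, and only the process environment is inspected for GOOGLE_API_KEY, not a .env file.

```python
import os
import shutil
import sys


def missing_prerequisites(env=None, which=shutil.which, version=None):
    """Return the names of prerequisites that could not be found."""
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version

    missing = []
    if tuple(version[:2]) < (3, 12):
        missing.append("Python 3.12+")
    # Assumed executable names for the tools in the list.
    for tool in ("uv", "kubectl", "tilt"):
        if which(tool) is None:
            missing.append(tool)
    # Assumption: the Testkube CLI is on PATH as a standalone binary
    # or as a kubectl plugin.
    if not any(which(name) for name in ("testkube", "kubectl-testkube")):
        missing.append("Testkube CLI")
    if not env.get("GOOGLE_API_KEY"):
        missing.append("GOOGLE_API_KEY")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("Missing: " + ", ".join(problems) if problems else "All prerequisites found")
```

The `env`, `which`, and `version` parameters exist only so the check can be exercised without touching the real system.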
Getting Started
# 1. Start local infrastructure (AI Gateway, OTLP collector, sample agents, Testkube)
# Create a .env file with GOOGLE_API_KEY=your-key first
tilt up
# 2. Run the example evaluation workflow
kubectl testkube run tw example-workflow --watch
See the how-to guide for detailed pipeline usage including dataset format, metric configuration, and custom workflows.
Development
| Command | Description |
|---|---|
| uv run poe check | Run all quality checks (tests, mypy, bandit, ruff) |
| uv run poe test | Unit tests |
| uv run poe format | Format with Ruff |
| uv run poe lint | Lint and auto-fix with Ruff |
| uv run poe ruff | Both format and lint |
| uv run poe mypy | Static type checking |
| uv run poe bandit | Security vulnerability scanning |
E2E Testing
Requires the Tilt environment running (tilt up).
# Configure (optional — defaults target the Tilt environment)
export E2E_DATASET_URL="http://data-server.data-server:8000/dataset.csv"
export E2E_AGENT_URL="http://weather-agent.sample-agents:8000"
export E2E_MODEL="gemini-2.5-flash-lite"
# Run
uv run poe test_e2e
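The override-or-default behaviour of these variables can be sketched as follows. This is an illustration only: `load_e2e_config` and `E2E_DEFAULTS` are hypothetical names, and the sketch assumes the values shown above are the defaults; how the test suite actually reads them is not shown here.

```python
import os

# Assumption: the values from the export lines above are the defaults.
E2E_DEFAULTS = {
    "E2E_DATASET_URL": "http://data-server.data-server:8000/dataset.csv",
    "E2E_AGENT_URL": "http://weather-agent.sample-agents:8000",
    "E2E_MODEL": "gemini-2.5-flash-lite",
}


def load_e2e_config(env=None):
    """Resolve each setting from the environment, falling back to its default."""
    env = os.environ if env is None else env
    # Unset and empty variables both fall back to the default.
    return {name: env.get(name) or default for name, default in E2E_DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in load_e2e_config().items():
        print(f"{name}={value}")
```

Because every setting has a fallback, exporting a single variable (for example only E2E_MODEL) is enough to change one aspect of the run.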
Contributing
See the Contribution Guide for details on contributing and on the process for submitting pull requests.
Download files
File details
Details for the file agentic_layer_testbench-0.9.1.tar.gz.
File metadata
- Download URL: agentic_layer_testbench-0.9.1.tar.gz
- Upload date:
- Size: 38.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a42bd68bbcbfc1b965a42a64b1d3e84334d6657a30f2cc7511b9694dae37ba83 |
| MD5 | e2de648286c408125b5f81663bbd171d |
| BLAKE2b-256 | f2d100cc3532183f0d2eaccff43120ddfdce7afa1a9308cfdfb053b55e338fc4 |
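A downloaded archive can be compared against the published SHA256 digest with the standard library. This is a minimal sketch: `sha256_of` and `matches_published_digest` are hypothetical helper names, and the local file path is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for agentic_layer_testbench-0.9.1.tar.gz.
EXPECTED_SHA256 = "a42bd68bbcbfc1b965a42a64b1d3e84334d6657a30f2cc7511b9694dae37ba83"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    archive = Path("agentic_layer_testbench-0.9.1.tar.gz")
    if archive.exists():
        print("digest OK" if matches_published_digest(archive) else "digest MISMATCH")
```

The same check works for the wheel by passing its SHA256 digest as `expected`.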
Provenance
The following attestation bundles were made for agentic_layer_testbench-0.9.1.tar.gz:
Publisher: publish.yml on agentic-layer/testbench
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentic_layer_testbench-0.9.1.tar.gz
- Subject digest: a42bd68bbcbfc1b965a42a64b1d3e84334d6657a30f2cc7511b9694dae37ba83
- Sigstore transparency entry: 1473935828
- Sigstore integration time:
- Permalink: agentic-layer/testbench@84cbca3a7fd0682e44d5ad8d8120a56ca3d8d872
- Branch / Tag: refs/tags/v0.9.1
- Owner: https://github.com/agentic-layer
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@84cbca3a7fd0682e44d5ad8d8120a56ca3d8d872
- Trigger Event: push
File details
Details for the file agentic_layer_testbench-0.9.1-py3-none-any.whl.
File metadata
- Download URL: agentic_layer_testbench-0.9.1-py3-none-any.whl
- Upload date:
- Size: 50.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6d0b7836cef6fb56e846b9041f1bf05eb821f36a35bcc4bfa5e6378efe36a918 |
| MD5 | a56d79c9c6d78509b700b3a3d7476f2c |
| BLAKE2b-256 | 36864bb10a0fdd81727aba57667d0dea76e5a4cbe0966ecc89f554bfd9ce01e4 |
Provenance
The following attestation bundles were made for agentic_layer_testbench-0.9.1-py3-none-any.whl:
Publisher: publish.yml on agentic-layer/testbench
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentic_layer_testbench-0.9.1-py3-none-any.whl
- Subject digest: 6d0b7836cef6fb56e846b9041f1bf05eb821f36a35bcc4bfa5e6378efe36a918
- Sigstore transparency entry: 1473935942
- Sigstore integration time:
- Permalink: agentic-layer/testbench@84cbca3a7fd0682e44d5ad8d8120a56ca3d8d872
- Branch / Tag: refs/tags/v0.9.1
- Owner: https://github.com/agentic-layer
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@84cbca3a7fd0682e44d5ad8d8120a56ca3d8d872
- Trigger Event: push