Helpers for testing agents with Google's adk-python

pytest-adk

Pytest helpers for evaluating agents built with Google ADK. The package provides:

  • an auto-registered AgentEvaluator pytest fixture that saves ADK eval result JSON files under each test's tmp_path;
  • TOML evalset support, including multi-line prompts;
  • external prompt templates for repeated evalset text, rendered with string.Template by default or optionally with Jinja2;
  • a pytest-adk-eval-schema CLI for generating fill-in evalset templates;
  • helpers for resuming an exported ADK session with an in-memory Runner.

Installation

pip install pytest-adk

For development and tests, install the dev extra:

pip install "pytest-adk[dev]"

Usage

AgentEvaluator is a pytest fixture that is auto-registered through the pytest11 entry point: installing pytest-adk makes it available with no import and no conftest.py. Request it as a test argument:

import pytest


@pytest.mark.asyncio
async def test_home_automation(AgentEvaluator):
    await AgentEvaluator.evaluate(
        agent_module='home_automation_agent',
        eval_dataset_file_path_or_dir=(
            'tests/integration/fixture/home_automation_agent/'
            'simple_test.test.json'
        ),
    )

The fixture binds the eval results directory to pytest's tmp_path, so you no longer pass results_dir yourself. Result JSON files are written under tmp_path/test_app/.adk/eval_history/.

After the run, pytest's terminal summary prints an ADK eval results section. For every test that used the fixture, it lists the eval_history directory where that test's results were saved. The section is shown whether the test passed or failed, so the results are always easy to find:

=================== ADK eval results ===================
tests/test_home_automation.py::test_home_automation
  /tmp/pytest-of-you/pytest-0/test_home_automation0/test_app/.adk/eval_history
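
To inspect the saved results from inside a test, a small helper along these lines could collect the files. This is only a sketch: list_eval_results is a hypothetical name, not part of pytest-adk, and it assumes the result files use the *.evalset_result.json naming and may sit at any depth under eval_history.

```python
from pathlib import Path


def list_eval_results(tmp_path: Path) -> list[Path]:
    """Return result files under tmp_path/test_app/.adk/eval_history.

    Hypothetical helper; the *.evalset_result.json pattern and the
    recursive search are assumptions made for this sketch.
    """
    history_dir = tmp_path / "test_app" / ".adk" / "eval_history"
    if not history_dir.is_dir():
        # Nothing was written (for example, the evaluation never ran).
        return []
    return sorted(history_dir.rglob("*.evalset_result.json"))
```

Calling list_eval_results(tmp_path) after AgentEvaluator.evaluate(...) in the same test would then give the paths to open with json.loads.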

Evalset files: JSON or TOML

AgentEvaluator.evaluate discovers and loads evalset files in two formats:

  • *.test.json — the schema used by google-adk's AgentEvaluator.
  • *.test.toml — the same EvalSet schema, written in TOML.

How eval_dataset_file_path_or_dir is interpreted depends on whether it points at a directory or a single file:

  • Directory: only files matching the *.test.json / *.test.toml naming convention are discovered, recursively. The .test. infix is required, so sibling files such as test_config.json (eval metrics) and the *.evalset_result.json files written by this helper are naturally excluded — no special-casing needed. A plain data.json without .test. is not picked up.
  • Single file: any .json or .toml file is accepted, since pointing at a file is an explicit choice. If the path does not contain .test., a logging.warning is emitted (under the pytest_adk.evaluation logger) noting that it falls outside the naming convention, and the file is loaded anyway. The loader is chosen by extension: .toml → TOML, otherwise JSON.
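
The two rules above can be sketched roughly as follows. This is not the package's implementation: discover_evalsets, loader_kind, and the logger name are hypothetical and only mirror the behaviour described in the list.

```python
import logging
from pathlib import Path

# Hypothetical logger name for this sketch (the package logs under
# pytest_adk.evaluation).
logger = logging.getLogger("evalset_discovery_sketch")

EVALSET_SUFFIXES = (".test.json", ".test.toml")


def discover_evalsets(path: Path) -> list[Path]:
    """Return the evalset files that `path` refers to."""
    if path.is_dir():
        # Directory: recurse, keeping only names with the .test. infix.
        return sorted(
            p
            for p in path.rglob("*")
            if p.is_file() and p.name.endswith(EVALSET_SUFFIXES)
        )
    # Single file: accepted as an explicit choice, with a warning if it
    # falls outside the naming convention.
    if ".test." not in path.name:
        logger.warning("%s is outside the *.test.* naming convention", path)
    return [path]


def loader_kind(path: Path) -> str:
    """Choose the loader by extension: .toml means TOML, otherwise JSON."""
    return "toml" if path.suffix == ".toml" else "json"
```

With this sketch, a directory containing a.test.json, test_config.json, and data.json yields only a.test.json, while passing data.json directly returns it after logging a warning.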

TOML is handy when a user prompt spans multiple lines: TOML multi-line strings ("""...""") keep newlines readable, instead of JSON's \n-escaped one-liners. TOML is parsed with the standard-library tomllib on Python 3.11+; on Python 3.10 the tomli backport is installed automatically as a dependency.

A *.test.toml evalset follows the same EvalSet schema as JSON:

eval_set_id = "home_automation"

[[eval_cases]]
eval_id = "turn_on_living_room"

[[eval_cases.conversation]]
invocation_id = "inv-1"

[eval_cases.conversation.user_content]
role = "user"
parts = [ { text = """
Please turn on the living room light.
Then confirm it is on.
""" } ]

[eval_cases.conversation.final_response]
role = "model"
parts = [ { text = "The living room light is now on." } ]
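
To see what the multi-line prompt looks like once parsed, the evalset above can be loaded with tomllib directly. This snippet only illustrates the TOML parsing step; it does not go through pytest-adk, and the variable names are chosen for this example.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ImportError:  # Python 3.10: fall back to the tomli backport
    import tomli as tomllib

EVALSET_TOML = '''
eval_set_id = "home_automation"

[[eval_cases]]
eval_id = "turn_on_living_room"

[[eval_cases.conversation]]
invocation_id = "inv-1"

[eval_cases.conversation.user_content]
role = "user"
parts = [ { text = """
Please turn on the living room light.
Then confirm it is on.
""" } ]

[eval_cases.conversation.final_response]
role = "model"
parts = [ { text = "The living room light is now on." } ]
'''

data = tomllib.loads(EVALSET_TOML)
first_turn = data["eval_cases"][0]["conversation"][0]
user_text = first_turn["user_content"]["parts"][0]["text"]
print(repr(user_text))
```

TOML trims the newline that directly follows the opening """, so the parsed text starts at "Please" and keeps the line break between the two sentences as well as the final newline.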

Notes:

  • TOML evalsets support the current EvalSet schema only; the legacy data format and a separate initial_session file (both JSON-only in google-adk) are not handled. Express the initial session inside the EvalSet instead.
  • The companion test_config.json (eval metrics / criteria) is unchanged; only the evalset data file gains TOML support.

Prompt templates

When several eval cases share the same (often long) prompt, you can keep the prompt in a separate file and reference it from a text field. If the entire value of a text field is a <prompt:...> marker, AgentEvaluator.evaluate reads the referenced file, substitutes its variables, and replaces the marker with the rendered prompt before the evalset reaches the evaluator.

Marker syntax:

<prompt:FILENAME [KEY=VALUE ...]>

Given prompt.txt:

Please turn on the ${ROOM} light.
Then confirm it is ${STATE}.

an evalset can reference it like this:

[eval_cases.conversation.user_content]
role = "user"
parts = [ { text = "<prompt:prompt.txt ROOM=living STATE=on>" } ]

After expansion the agent sees the fully rendered prompt. This works for both *.test.toml and *.test.json evalsets, and applies to both user_content and final_response text parts.

Details:

  • Variables use string.Template syntax by default: ${VAR} (or $VAR).
  • FILENAME is resolved relative to the evalset file's directory.
  • The marker must be the whole text value (leading/trailing whitespace is ignored); markers embedded inside other text are not expanded.
  • KEY=VALUE pairs are space-separated, so values cannot contain spaces.
  • It is an error if the prompt file is missing, a KEY=VALUE pair is malformed, or the prompt references a variable that the marker does not provide.
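
The rules above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: expand_marker is a hypothetical name, and the exception types are simply whatever the standard library raises in each case.

```python
from pathlib import Path
from string import Template

MARKER_PREFIX = "<prompt:"


def expand_marker(text: str, base_dir: Path) -> str:
    """Expand a whole-value <prompt:FILENAME KEY=VALUE ...> marker.

    Hypothetical sketch: text that is not a whole-value marker is
    returned unchanged.
    """
    stripped = text.strip()  # leading/trailing whitespace is ignored
    if not (stripped.startswith(MARKER_PREFIX) and stripped.endswith(">")):
        return text
    tokens = stripped[len(MARKER_PREFIX):-1].split()
    if not tokens:
        raise ValueError("marker has no FILENAME")
    filename, *pairs = tokens
    variables = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed KEY=VALUE pair: {pair!r}")
        variables[key] = value
    # FILENAME is resolved relative to the evalset file's directory;
    # a missing file raises FileNotFoundError here.
    template = Template((base_dir / filename).read_text(encoding="utf-8"))
    # substitute() raises KeyError when the prompt references a variable
    # that the marker does not provide.
    return template.substitute(variables)
```

Given the prompt.txt shown earlier in base_dir, expand_marker("<prompt:prompt.txt ROOM=living STATE=on>", base_dir) returns the two-line prompt with both variables filled in, while a string that merely contains a marker is left as-is.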

Jinja prompt templates

By default the prompt file is rendered with string.Template (${VAR}). To use Jinja2 ({{ VAR }}) syntax instead, install the optional extra and opt in via the pytest_adk_prompt_template_engine ini option in pyproject.toml:

pip install "pytest-adk[jinja]"

[tool.pytest.ini_options]
pytest_adk_prompt_template_engine = "jinja"

With the Jinja engine selected, the same prompt.txt would be written as:

Please turn on the {{ ROOM }} light.
Then confirm it is {{ STATE }}.

The marker syntax (<prompt:FILENAME KEY=VALUE ...>) is unchanged; only the placeholder syntax inside the prompt file differs. Referencing a variable that the marker does not provide is an error (Jinja runs with StrictUndefined).

Generate an evalset template

Use pytest-adk-eval-schema to generate a minimal EvalSet file with REPLACE_ME placeholders:

pytest-adk-eval-schema -o tests/evals/example.test.toml

TOML is the default output format. JSON is also available:

pytest-adk-eval-schema --format json

The command refuses to overwrite an existing file unless you pass --force. The same generator is available from Python:

from pytest_adk import eval_set_template

template = eval_set_template("toml")

Resume an exported ADK session

load_session_from_json reads a session exported by ADK from either a file path or a raw JSON string. runner_from_exported_session restores that session into an in-memory ADK Runner, copying the exported state and replaying events via the session service.

from pathlib import Path

import pytest
from google.genai import types
from pytest_adk import runner_from_exported_session
from your_agent.agent import root_agent


@pytest.mark.asyncio
async def test_resume_exported_session():
    runner, session = await runner_from_exported_session(
        root_agent,
        Path("tests/fixtures/roll_die.session.json"),
    )

    events = runner.run_async(
        user_id=session.user_id,
        session_id=session.id,
        new_message=types.Content(
            role="user",
            parts=[types.Part(text="What numbers did I get?")],
        ),
    )
    async for _ in events:
        pass

You can override app_name, user_id, or session_id when restoring, and you can pass custom artifact, memory, or credential services. If you do not provide services, in-memory ADK services are used.
