Standalone Agent Evaluation Framework (AEF)

AEF - Agent Evaluation Framework

AEF is a framework for generating tests, running and evaluating agent trajectories, collecting feedback, and evolving agent behavior based on that feedback.

The workflow is intentionally minimal and framework-agnostic:

aef generate calls the generation component/tool
aef evaluate calls the evaluation component/tool
aef feedback calls the feedback component/tool
aef evolve calls the evolution component/tool

Internally, these are routed through an A2A bus so the same flow works for sub-agents implemented with different frameworks.

Installation

From PyPI

Install the published package with pip:

pip install aef-framework

or with uv:

uv pip install aef-framework

Local Development Install with uv

AEF uses uv for fast, reliable Python package management.

1. Install uv (if not already installed)

curl -LsSf https://astral.sh/uv/install.sh | sh

2. Create a virtual environment

cd AEF
uv venv --python=3.11

This creates a .venv directory using Python 3.11. Pass --python=3.10 or --python=3.12 instead if you need a different version.

3. Activate the virtual environment

source .venv/bin/activate  # Linux/macOS
# or
.venv\Scripts\activate     # Windows

4. Install AEF in editable mode

uv pip install -e .

This installs AEF and all dependencies, making the aef command available.

5. Verify installation

aef --help
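If you would rather check the install from Python than from the shell, the following is a minimal sketch. It assumes the distribution name is aef-framework and the console command is aef, as shown in the install steps above; the helper name install_status is hypothetical and not part of AEF.

```python
import shutil
from importlib import metadata
from typing import Dict, Optional


def install_status(dist_name: str = "aef-framework",
                   command: str = "aef") -> Dict[str, Optional[str]]:
    """Report where the CLI lives on PATH and which version is installed.

    Both values are None when the package or command cannot be found.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"cli_path": shutil.which(command), "version": version}


if __name__ == "__main__":
    print(install_status())
```

A cli_path of None usually means the virtual environment is not activated in the current shell.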

Traditional pip install (local)

If you prefer using pip:

python -m venv .venv
source .venv/bin/activate
pip install -e .

Core Principles

Universal sub-agent support via adapter contract (python, cli, http)
Single essential loop: Generate → Evaluate → Feedback → Evolve
Composable A2A components instead of tightly-coupled command logic
Versioned evolution profiles with before/after evaluation comparison

Essential Workflow

1) Generate trajectories

aef generate --config configs/fleet_ccc_run.json --n 10

2) Evaluate against a golden run

aef evaluate --config configs/fleet_ccc_run.json --golden run_YYYYMMDD_xxxxxx

3) Submit feedback

aef feedback --agent fleet_ccc --text "Agent should ask confirmation before delete operations"

4) Evolve (auto-apply + compare)

aef evolve --config configs/fleet_ccc_run.json --n 10

aef evolve performs the following steps in order:

baseline evaluate
classify feedback into amendments
apply evolution profile
re-evaluate and report before/after score delta
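The four evolve steps can be sketched as a small loop. This is an illustration of the control flow only, not AEF's implementation: the function evolve_once, the EvolveReport type, and the callables passed in are all hypothetical stand-ins for the real evaluation, feedback, and evolution components.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class EvolveReport:
    baseline_score: float
    evolved_score: float

    @property
    def delta(self) -> float:
        # Positive delta means the evolved profile scored higher.
        return self.evolved_score - self.baseline_score


def evolve_once(
    evaluate: Callable[[], float],
    classify: Callable[[List[str]], List[Dict[str, Any]]],
    apply_profile: Callable[[List[Dict[str, Any]]], None],
    feedback: List[str],
) -> EvolveReport:
    baseline = evaluate()              # 1. baseline evaluate
    amendments = classify(feedback)    # 2. classify feedback into amendments
    apply_profile(amendments)          # 3. apply evolution profile
    evolved = evaluate()               # 4. re-evaluate
    return EvolveReport(baseline, evolved)


if __name__ == "__main__":
    # Toy stand-ins: applying any amendment raises the score by 0.25.
    state = {"bonus": 0.0}
    report = evolve_once(
        evaluate=lambda: 0.5 + state["bonus"],
        classify=lambda items: [{"kind": "prompt_addendum", "text": t} for t in items],
        apply_profile=lambda ams: state.update(bonus=0.25 if ams else 0.0),
        feedback=["Agent should ask confirmation before delete operations"],
    )
    print(report.baseline_score, report.evolved_score, report.delta)
```

Keeping the two evaluate calls identical is what makes the before/after delta meaningful: only the applied profile changes between them.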

Use AEF With Any Sub-Agent

Set agent.adapter_type in your config:

python: ADK/Python agent entrypoint in the form module_or_file.py:agent_var
cli: shell command template using {step} / {goal} placeholders
http: endpoint that accepts { goal, step, session_id? }

See detailed usage in docs/USING_ANY_SUBAGENT.md.
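To make the three adapter types concrete, here is a rough sketch of how each input could be prepared. It is an assumption-laden illustration, not the adapter contract itself: the helper names are hypothetical, step is treated as a string, and the CLI template is assumed to contain no braces other than {goal} and {step}. The documents referenced above remain the authority on the real contract.

```python
import json
import shlex
from typing import Optional, Tuple


def parse_python_entrypoint(spec: str) -> Tuple[str, str]:
    """Split 'module_or_file.py:agent_var' into (target, variable name)."""
    target, sep, attr = spec.rpartition(":")
    if not sep or not target or not attr:
        raise ValueError(f"expected 'module_or_file.py:agent_var', got {spec!r}")
    return target, attr


def render_cli_command(template: str, goal: str, step: str) -> str:
    """Fill the {goal} and {step} placeholders, shell-quoting each value."""
    return template.format(goal=shlex.quote(goal), step=shlex.quote(step))


def build_http_body(goal: str, step: str, session_id: Optional[str] = None) -> str:
    """Serialize { goal, step, session_id? }; session_id is omitted when unset."""
    body = {"goal": goal, "step": step}
    if session_id is not None:
        body["session_id"] = session_id
    return json.dumps(body)


if __name__ == "__main__":
    print(parse_python_entrypoint("agents/fleet.py:root_agent"))
    print(render_cli_command("my-agent --goal {goal} --step {step}", "delete old logs", "1"))
    print(build_http_body("delete old logs", "1", session_id="abc"))
```

Quoting the placeholder values matters for the cli adapter because goals are free text and may contain spaces or shell metacharacters.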

Full prerequisites and onboarding checklist:

docs/ADOPTING_NEW_AGENT.md

A2A Components

AEF components exposed through the internal bus:

generation.generate
evaluation.evaluate
feedback.submit_text
feedback.submit_annotations
evolution.evolve

See docs/A2A_COMPONENTS.md.
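The idea of routing component.tool calls through a bus can be sketched in a few lines. The Bus class below is a hypothetical, in-process illustration and not AEF's internal bus; only the component and tool names (for example generation.generate) come from the list above, and the payload and return shapes are assumptions.

```python
from typing import Any, Callable, Dict, Tuple

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


class Bus:
    """Minimal dispatcher keyed by (component, tool)."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[str, str], Handler] = {}

    def register(self, component: str, tool: str, handler: Handler) -> None:
        self._handlers[(component, tool)] = handler

    def call(self, component: str, tool: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        try:
            handler = self._handlers[(component, tool)]
        except KeyError:
            raise LookupError(f"no handler registered for {component}.{tool}") from None
        return handler(payload)


if __name__ == "__main__":
    bus = Bus()
    # Stand-in handler: echoes how many trajectories were requested.
    bus.register("generation", "generate", lambda p: {"requested": p.get("n", 1)})
    print(bus.call("generation", "generate", {"n": 2}))
```

Because callers only know the (component, tool) pair and a JSON-like payload, the handler behind each name can be implemented in any framework.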

Evolution Outputs

Evolution applies and versions runtime amendments per agent under:

prompts/evolution_profiles/<agent>/latest.json
prompts/evolution_profiles/<agent>/profile_<timestamp>.json

These profiles contain:

prompt addenda
tool policies
generator hints
agent hints
rubric updates

See docs/SELF_EVOLUTION.md.
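The latest.json plus profile_<timestamp>.json layout can be reproduced with a short helper. This is a sketch under stated assumptions: the timestamp format, the JSON key names, and the function write_profile are all hypothetical, and AEF's actual schema is described in docs/SELF_EVOLUTION.md.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_profile(root: Path, agent: str, profile: dict) -> Path:
    """Write a timestamped profile and refresh latest.json next to it."""
    # Assumed timestamp format; the real one may differ.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    agent_dir = root / "prompts" / "evolution_profiles" / agent
    agent_dir.mkdir(parents=True, exist_ok=True)

    text = json.dumps(profile, indent=2, sort_keys=True)
    versioned = agent_dir / f"profile_{stamp}.json"
    versioned.write_text(text, encoding="utf-8")
    # latest.json always mirrors the most recently written profile.
    (agent_dir / "latest.json").write_text(text, encoding="utf-8")
    return versioned


if __name__ == "__main__":
    import tempfile

    example = {
        # Hypothetical keys mirroring the five profile sections listed above.
        "prompt_addenda": ["Ask for confirmation before delete operations."],
        "tool_policies": [],
        "generator_hints": [],
        "agent_hints": [],
        "rubric_updates": [],
    }
    with tempfile.TemporaryDirectory() as tmp:
        print(write_profile(Path(tmp), "fleet_ccc", example).name)
```

Writing an immutable timestamped file alongside a moving latest.json keeps every applied profile available for later comparison or rollback.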

Minimal Command Reference

# Generate
aef generate --config <config.json> --n 10

# Direct A2A tool call
aef a2a --config <config.json> --component generation --tool generate --payload '{"n": 2}'

# Evaluate against a golden run by run ID
aef evaluate --config <config.json> --golden <run_id>

# Feedback
aef feedback --agent <agent_name> --text "..."

# Evolve
aef evolve --config <config.json> --n 10

# Compare two eval runs
aef compare --run <run_a> --vs <run_b>

# Query runs / memory
aef query runs --agent <agent_name>
aef query memory --agent <agent_name> --all-memory
aef query memory --agent <agent_name> --history
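When scripting these commands from Python, building the argument list programmatically avoids hand-quoting the JSON payload. The sketch below uses only the flags shown in the reference above for aef a2a; the helper name a2a_argv is hypothetical.

```python
import json
import shlex
from typing import Any, Dict, List


def a2a_argv(config: str, component: str, tool: str, payload: Dict[str, Any]) -> List[str]:
    """Build the argv for a direct A2A tool call, with the payload as JSON."""
    return [
        "aef", "a2a",
        "--config", config,
        "--component", component,
        "--tool", tool,
        "--payload", json.dumps(payload),
    ]


if __name__ == "__main__":
    argv = a2a_argv("config.json", "generation", "generate", {"n": 2})
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(argv))
```

The resulting list can be handed to subprocess.run(argv, check=True) directly, which passes the payload as a single argument without involving a shell.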

Documentation

docs/AEF_WORKFLOW.md
docs/A2A_COMPONENTS.md
docs/USING_ANY_SUBAGENT.md
docs/SELF_EVOLUTION.md
docs/PUBLISHING.md - PyPI package publishing guide

Contributing

Contributions are welcome! See CONTRIBUTING.md for development setup and guidelines.

License

AEF is released under the Apache License 2.0. See LICENSE for details.

Release history

0.1.5 (May 6, 2026)
0.1.2 (Mar 20, 2026)
0.1.1 (Mar 20, 2026), this version

Download files

Source Distribution: aef_framework-0.1.1.tar.gz (1.9 MB), uploaded Mar 20, 2026
Built Distribution: aef_framework-0.1.1-py3-none-any.whl (80.0 kB), uploaded Mar 20, 2026, Python 3

File details

Details for the file aef_framework-0.1.1.tar.gz.

File metadata

Download URL: aef_framework-0.1.1.tar.gz
Upload date: Mar 20, 2026
Size: 1.9 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for aef_framework-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`a06156357697abe955c91233a3fcae5db4f8fa337fd30a6fca1878e65e1e9e14`
MD5	`af79db1d0866033c64d56471ed76b53c`
BLAKE2b-256	`4a339be3b83a6debb8e7fa0d4f39ae9ff4d4e295f2056ef6b6b2d6604eaad934`

File details

Details for the file aef_framework-0.1.1-py3-none-any.whl.

File metadata

Download URL: aef_framework-0.1.1-py3-none-any.whl
Upload date: Mar 20, 2026
Size: 80.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for aef_framework-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1fcb812893f6b77f63c5af05da0a60e9491cec98a68e182dc76a6cf864a0fda9`
MD5	`86fde6e190ab59bf974c3e3c942b7121`
BLAKE2b-256	`7bb4b60a9e85a341f1995b651bd17c5b8ca26889e635aefd9349042d89e38a43`
