Last released May 20, 2026
Friendly evaluation and regression-testing framework for AI agents: inspectable traces, graded outcomes, baseline comparisons, and CI-ready reliability signals.