Last released Jun 18, 2026
Evaluate and compare AI agent setups through experiments, inspections, and rubric scoring.