Last released Jun 12, 2026
A benchmark and CLI measuring whether analytics agents are business-correct, not merely execution-correct.
Last released Apr 7, 2026
Slice-based model evaluation for ML engineers. Find where your model fails before production does.
Supported by