Last released Jun 11, 2026
Localized model benchmarking with receipts: run head-to-head evals on your own data, locally, and turn them into shareable proof reports