Profile of YoavAlro

1 project

quetzal-eval

Last released Jul 2, 2026

Measure how well and how cheaply a coding-agent harness answers questions about your codebase. Drives real agent CLIs (Claude Code, Codex, Cursor), judges answers against ground truth, and reports accuracy + token cost per suite.

Yoav Alroy