Profile of sgjarmak

2 projects

Last released May 11, 2026

Benchmark AI coding agents against your own codebase. Mine real tasks from repo history, run agents, interpret results.

Last released Apr 19, 2026

Agent Reliability Observatory — a behavioral taxonomy and annotation framework for analyzing why coding agents succeed or fail.
