Last released Apr 2, 2026
CLI for composing simulation environments, running agents, and evaluating tasks
Last released Feb 27, 2026
Long-horizon deterministic benchmark for LLM agents — CEO of an AI startup