Last released Apr 9, 2026
A fork of Terminal-bench for evaluating FormulaCode.
Python toolchain for building and maintaining FormulaCode benchmark tasks.
Last released Oct 17, 2024
Efficient LLM inference on Slurm clusters using vLLM (for TACC).