Last released Jun 30, 2022
A collaborative benchmark intended to probe large language models and extrapolate their future capabilities.