4 projects
effgen
A comprehensive framework for building agents with Small Language Models
beyondbench
BeyondBench: Contamination-Resistant Evaluation of Reasoning in Language Models
llmthinkbench
A framework for evaluating overthinking and basic reasoning capabilities of Large Language Models
genetic-optimization
Genetic Optimization package