7 projects
autojudge-evaluate
Evaluation tools for TREC AutoJudge: meta-evaluate, qrel-evaluate, leaderboard statistics
autojudge-base
Core infrastructure for implementing TREC AutoJudge systems
minima-llm
Minimal async LLM backend with caching and batch execution
autojudge-annotate
Manual relevance annotation tools for TREC AutoJudge
rubric-trec-rag
RUBRIC Autograder Workbench for evaluating retrieval, generation, and RAG information systems
exam-pp
RUBRIC Autograder Workbench for evaluating retrieval, generation, and RAG information systems
trec-car-tools
Support tools for TREC CAR participants. See also trec-car.cs.unh.edu.