3 projects
multivon-eval
AI evaluation for teams that ship models to production
pdfhell
PDF Hell: adversarial PDFs that break AI document readers. Scored against procedural ground truth, not LLM-as-judge.
multivon-mcp
MCP server exposing multivon-eval and pdfhell as agent-callable tools. Drop it into Claude Desktop, Cursor, Cline, or any MCP-compatible AI coding agent.