5 projects
goldensetauditor
Audit golden evaluation datasets for LLM/RAG application quality risks.
featureleakagelens
Pre-training leakage audit reports for tabular ML datasets.
trialcheck
Platform-agnostic A/B experiment readout auditor with checks for sample ratio mismatch (SRM), peeking, minimum detectable effect (MDE), practical significance, Welch's t-test, guardrail metrics, and pre-period balance.
docingestqa
Pre-indexing QA auditor for RAG document ingestion pipelines.
metriclens
DataFrame-native metric movement decomposition for business metrics.