5 projects
cane-gpu-perf
GPU inference benchmarking with opinionated diagnostics
cane-robotics
Foundation Model Active Learning for autonomous robot object discovery
cane-ai
Open-source agentic infrastructure. Build, evaluate, fine-tune, and deploy AI agents.
cane-personality
Behavioral profiling benchmark for LLMs. Profile any model's personality, extract steering vectors, generate DPO training pairs.
cane-eval
Agent Reliability Layer. LLM-as-Judge eval, schema validation, latency tracking, and reliability scoring for AI agents.