4 projects
widget2code-bench-exp
Benchmark evaluation for widget code generation — 12 quality metrics across layout, legibility, perceptual, style, and geometry.
widget2code-bench
Benchmark evaluation for widget code generation — 12 quality metrics across layout, legibility, perceptual, style, and geometry.
widget-eval
Evaluation pipeline for widget generation — runs quality metrics and generates statistics reports.
provider-hub
Unified interface for accessing multiple LLM providers.