A package for simulating sketched ridgeless estimators, with the aim of improving generalization. It identifies the sketching size that minimizes out-of-sample prediction risk. The optimally sketched estimator has stable risk curves, without the peaks found in the full-sample estimator. SRLR offers a practical method for finding the optimal sketching size.
SRLR
Sketched Ridgeless Linear Regression
Description
This repository presents numerical simulations that analyze the empirical risks of the sketched ridgeless estimator, aiming to enhance generalization performance. The simulations focus on determining optimal sketching sizes that minimize out-of-sample prediction risks. The results reveal that the optimally sketched estimator exhibits stable risk curves, effectively eliminating the peaks observed in the full-sample estimator. Additionally, we introduce a practical procedure to empirically identify the optimal sketching size.
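Suppose we observe data vectors $(x_i, y_i)$ that follow a linear model $y_i = x_i^\top \beta + \varepsilon_i$, $i = 1, \dots, n$, where $y_i$ is a univariate response, $x_i$ is a $d$-dimensional predictor, $\beta$ denotes the vector of regression coefficients, and $\varepsilon_i$ is a random error. We consider the ridgeless least squares estimator $\hat{\beta} = (X^\top X)^{+} X^\top Y$, where $X$ is the $n \times d$ matrix with rows $x_i^\top$, $Y$ is the vector of responses, and $(\cdot)^{+}$ denotes the Moore-Penrose pseudoinverse.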
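To make the estimator concrete, here is a minimal NumPy sketch. It is not the package's API: the function names `ridgeless` and `sketched_ridgeless` are hypothetical, and the Gaussian sketching matrix is an assumption, since the description above does not specify which kind of sketch is used. The code relies on the identity $X^{+} = (X^\top X)^{+} X^\top$, so `np.linalg.pinv(X) @ Y` gives the same estimator while avoiding an explicit pseudoinverse of the rank-deficient Gram matrix.

```python
import numpy as np


def ridgeless(X, Y):
    """Minimum-norm least squares solution, equal to (X^T X)^+ X^T Y."""
    # pinv(X) equals (X^T X)^+ X^T; working with X directly is numerically safer.
    return np.linalg.pinv(X) @ Y


def sketched_ridgeless(X, Y, m, rng):
    """Ridgeless fit on sketched data (S X, S Y) with an m x n sketching matrix S.

    Assumption: S has i.i.d. N(0, 1/m) entries; other sketches could be swapped in.
    """
    n = X.shape[0]
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    return ridgeless(S @ X, S @ Y)


# Illustrative data from the linear model y_i = x_i^T beta + eps_i.
rng = np.random.default_rng(0)
n, d = 50, 20
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d) / np.sqrt(d)
Y = X @ beta + 0.1 * rng.standard_normal(n)

beta_full = ridgeless(X, Y)                            # uses all n samples
beta_sketch = sketched_ridgeless(X, Y, m=30, rng=rng)  # uses an m-row sketch
```

When $n > d$ and $X$ has full column rank, `ridgeless` coincides with ordinary least squares; when $n < d$ it returns the minimum-norm coefficient vector that interpolates the training data.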
With this package, the simulation results in the paper cited under Reference can be reproduced.
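As a rough illustration of the kind of experiment described above, the following sketch scans a few candidate sketching sizes and keeps the one with the lowest hold-out prediction error. It is an assumption-laden stand-in, not the procedure from the package or the paper: the helper `pick_sketch_size` is hypothetical, the sketch is plain uniform row subsampling, and the risk is estimated on a separate validation set.

```python
import numpy as np


def ridgeless(X, Y):
    """Minimum-norm least squares solution, equal to (X^T X)^+ X^T Y."""
    return np.linalg.pinv(X) @ Y


def pick_sketch_size(X_tr, Y_tr, X_val, Y_val, sizes, rng, n_rep=20):
    """Return the candidate size with the smallest average validation risk.

    Assumption: the sketch is uniform subsampling of m rows without replacement.
    """
    n = X_tr.shape[0]
    risks = []
    for m in sizes:
        errs = []
        for _ in range(n_rep):
            idx = rng.choice(n, size=m, replace=False)
            b = ridgeless(X_tr[idx], Y_tr[idx])
            errs.append(np.mean((Y_val - X_val @ b) ** 2))
        risks.append(np.mean(errs))  # average over random sketches
    risks = np.array(risks)
    return sizes[int(np.argmin(risks))], risks


# Illustrative setting with n close to d, where the full-sample fit is unstable.
rng = np.random.default_rng(0)
n, d, n_val = 100, 90, 200
beta = rng.standard_normal(d) / np.sqrt(d)
X_tr = rng.standard_normal((n, d))
Y_tr = X_tr @ beta + rng.standard_normal(n)
X_val = rng.standard_normal((n_val, d))
Y_val = X_val @ beta + rng.standard_normal(n_val)

sizes = [30, 50, 70, 100]  # m = 100 corresponds to using the full sample
best_m, risks = pick_sketch_size(X_tr, Y_tr, X_val, Y_val, sizes, rng)
print("estimated risks:", dict(zip(sizes, np.round(risks, 3))))
print("selected sketching size:", best_m)
```

Averaging over several random sketches per size reduces the noise in each risk estimate; the number of repetitions `n_rep` and the candidate grid are arbitrary choices made for this example.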
Examples
Please refer to tutorial.ipynb for a comprehensive example and step-by-step guide.
Reference
Chen, X., Zeng, Y., Yang, S. and Sun, Q. Sketched Ridgeless Linear Regression: The Role of Downsampling.