# fastLPR: Fast Local Polynomial Regression via NUFFT (Python)

Fast local polynomial regression and kernel density estimation using the Non-Uniform Fast Fourier Transform (NUFFT).
## Overview

fastLPR is a Python implementation of fast local polynomial regression built on custom NUFFT algorithms.
Features:
- Local Polynomial Regression (orders 0, 1, 2) with automatic bandwidth selection
- Kernel Density Estimation (1D, 2D, 3D) with cross-validation
- Heteroscedastic Variance Estimation with intervals (CI and PI)
- Complex-valued Response support
- Custom NUFFT Implementation using Gaussian gridding (Dutt & Rokhlin 1993)
- O(N + M log M) Complexity for efficient large-scale regression
Note: This is a direct translation of the MATLAB reference implementation, numerically verified against MATLAB with all 10/10 cross-validation tests passing.
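The package's NUFFT code is not reproduced here, but the O(N + M log M) recipe behind it can be sketched in plain NumPy: spread the N samples onto an M-point grid in O(N) (linear binning stands in here for Gaussian gridding), then apply the kernel on the grid with an FFT in O(M log M). Everything below (function name, grid rule, binning scheme) is an illustrative assumption, not fastLPR's implementation:

```python
import numpy as np

def binned_kde(x, h, M=256):
    """Illustrative O(N + M log M) KDE: linear binning + FFT convolution.
    A stand-in sketch, not fastLPR's NUFFT algorithm."""
    lo, hi = x.min() - 3 * h, x.max() + 3 * h
    grid = np.linspace(lo, hi, M)
    dx = grid[1] - grid[0]

    # O(N): linear binning -- each sample splits its unit mass between
    # the two neighbouring grid cells.
    idx = np.clip(((x - lo) / dx).astype(int), 0, M - 2)
    w = (x - grid[idx]) / dx
    counts = np.bincount(idx, 1 - w, M) + np.bincount(idx + 1, w, M)

    # O(M log M): convolve the binned counts with a Gaussian kernel via FFT.
    half = M // 2
    k = np.exp(-0.5 * ((np.arange(M) - half) * dx / h) ** 2)
    k /= k.sum() * dx                  # discrete normalisation of the kernel
    kc = np.roll(k, -half)             # centre the kernel at index 0
    fhat = np.fft.irfft(np.fft.rfft(counts) * np.fft.rfft(kc), M)
    return grid, fhat / len(x)
```

The direct sum would cost O(N·M); the binned version touches each sample once and then works entirely on the grid, which is the same trade the NUFFT makes with a more accurate spreading kernel.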
## Installation

### Requirements

- Python >= 3.9
- NumPy >= 1.23
- SciPy >= 1.10
- matplotlib >= 3.7
Install from source
# Clone the repository
git clone https://github.com/rigelfalcon/fastLPR.git
cd fastLPR/fastLPR_py
# Install in development mode
uv pip install -e .
# Verify installation
uv run python -c "from fastlpr import cv_fastlpr, cv_fastkde; print('Installation successful!')"
### Install with development dependencies

```bash
uv pip install -e ".[dev]"  # Includes pytest, black, ruff
```
## Quick Start

### Basic 1D Regression

```python
from fastlpr import cv_fastlpr, get_hlist
import numpy as np
import matplotlib.pyplot as plt

# Generate data
np.random.seed(42)
x = np.random.rand(500, 1)
y = np.sin(2 * np.pi * x).ravel() + 0.2 * np.random.randn(500)

# Create bandwidth list (20 candidates from 0.02 to 0.5)
hlist = get_hlist(20, [0.02, 0.5])

# Run regression with automatic bandwidth selection (GCV)
opt = {'order': 1, 'dstd': 1}  # Local linear regression with 1-SE rule
regs = cv_fastlpr(x, y, hlist, opt)

# Access results
print(f"Selected bandwidth: {regs.bandwidth}")
print(f"DOF: {regs.dof:.2f}")

# Plot results
sorted_idx = np.argsort(x.ravel())
plt.scatter(x, y, alpha=0.3, label='Data')
plt.plot(x[sorted_idx], regs.yhat[sorted_idx], 'r-', linewidth=2, label='Fitted')
plt.legend()
plt.show()
```
### Kernel Density Estimation (1D/2D/3D)

```python
from fastlpr import cv_fastkde, get_hlist
import numpy as np

# 1D KDE
x_1d = np.random.randn(1000, 1)
hlist_1d = get_hlist(20, [0.1, 1.0])
kde_1d = cv_fastkde(x_1d, hlist_1d)
print(f"1D bandwidth: {kde_1d.bandwidth}")

# 2D KDE
x_2d = np.random.randn(1000, 2)
hlist_2d = get_hlist([10, 10], [[0.1, 1.0], [0.1, 1.0]])
kde_2d = cv_fastkde(x_2d, hlist_2d)
print(f"2D bandwidth: {kde_2d.bandwidth}")

# 3D KDE (use flag_power2=False for memory efficiency)
x_3d = np.random.randn(500, 3)
hlist_3d = get_hlist([5, 5, 5], [[0.2, 1.0], [0.2, 1.0], [0.2, 1.0]])
opt = {'flag_power2': False}
kde_3d = cv_fastkde(x_3d, hlist_3d, opt)
print(f"3D bandwidth: {kde_3d.bandwidth}")
```
### Complex-Valued Regression

```python
from fastlpr import cv_fastlpr, get_hlist
import numpy as np

# Complex response
x = np.sort(np.random.rand(500, 1), axis=0)
y_complex = (np.sin(2 * np.pi * x) + 0.1 * np.random.randn(500, 1)).ravel() + \
    1j * (np.cos(2 * np.pi * x) + 0.1 * np.random.randn(500, 1)).ravel()
hlist = get_hlist(20, [0.02, 0.5])

opt = {'order': 1}
regs = cv_fastlpr(x, y_complex, hlist, opt)
print(f"Complex regression bandwidth: {regs.bandwidth}")
```
### Heteroscedastic Variance Estimation

```python
from fastlpr import cv_fastlpr, get_hlist
import numpy as np

# Generate heteroscedastic data
x = np.sort(np.random.rand(500, 1), axis=0)
var_true = 0.1 + 0.4 * x.ravel() ** 2
y = np.sin(2 * np.pi * x).ravel() + np.sqrt(var_true) * np.random.randn(500)
hlist = get_hlist(20, [0.02, 0.5])

# Step 1: Estimate the mean
opt_mean = {'order': 1, 'y_type_out': 'mean', 'dstd': 1}
regs_mean = cv_fastlpr(x, y, hlist, opt_mean)

# Step 2: Estimate the variance from squared residuals
residuals = y - regs_mean.yhat
opt_var = {'order': 1, 'y_type_out': 'variance', 'dstd': 5}
regs_var = cv_fastlpr(x, residuals ** 2, hlist, opt_var)

print(f"Mean bandwidth: {regs_mean.bandwidth}")
print(f"Variance bandwidth: {regs_var.bandwidth}")
```
### Prediction at New Points

```python
from fastlpr import cv_fastlpr, get_hlist
import numpy as np

# Fit model
x = np.sort(np.random.rand(500, 1), axis=0)
y = np.sin(2 * np.pi * x).ravel() + 0.1 * np.random.randn(500)
hlist = get_hlist(20, [0.02, 0.5])
regs = cv_fastlpr(x, y, hlist, {'order': 1})

# Predict at new points using the interpolant
x_new = np.linspace(0, 1, 100).reshape(-1, 1)
y_pred = regs.interpolant(x_new)
```
## Cross-Validation Results

The Python implementation has been comprehensively verified against the MATLAB reference implementation.

### All 10 Standard Tests: PASS
| Test | N | dx | Order | Type | MATLAB (s) | Python (s) | Ratio | BW MaxErr | Mean MaxErr | Status |
|---|---|---|---|---|---|---|---|---|---|---|
| 1D KDE | 500 | 1 | 0 | mean | 0.05 | 0.02 | 0.40x | 0.00e+00 | NA | PASS |
| 2D KDE | 500 | 2 | 0 | mean | 0.09 | 0.73 | 8.11x | 0.00e+00 | NA | PASS |
| 3D KDE | 500 | 3 | 0 | mean | 0.33 | 1.85 | 5.61x | 0.00e+00 | NA | PASS |
| 1D Order 1 | 500 | 1 | 1 | mean | 0.17 | 0.13 | 0.76x | 0.00e+00 | 3.29e-05 | PASS |
| 2D Order 1 | 500 | 2 | 1 | mean | 0.38 | 1.25 | 3.29x | 2.19e-02 | 1.88e-02 | PASS |
| 1D Order 2 | 500 | 1 | 2 | mean | 0.19 | 0.17 | 0.89x | 0.00e+00 | 2.14e-05 | PASS |
| 2D Order 2 | 500 | 2 | 2 | mean | 1.10 | 4.84 | 4.40x | 0.00e+00 | 3.14e-02 | PASS |
| Complex | 500 | 1 | 1 | mean | 0.15 | 0.13 | 0.87x | 0.00e+00 | 3.83e-05 | PASS |
| Hetero 1D | 500 | 1 | 1 | mean+var | 0.09 | 0.10 | 1.11x | 0.00e+00 | 3.22e-05 | PASS |
| Hetero 2D | 500 | 2 | 1 | mean+var | 0.20 | 1.50 | 7.50x | 0.00e+00 | 2.50e-02 | PASS |
Summary: 10 PASS, 0 FAIL, 0 ERROR, 0 SKIP (Total: 10)
### Pass Criteria
| Metric | Threshold | Notes |
|---|---|---|
| BW MaxErr | < 0.05 | Bandwidth selection accuracy |
| GCV MaxErr | < 0.01 | Cross-validation score accuracy |
| Mean MaxErr | < 0.1 | Prediction accuracy |
| Var MaxErr | < 0.2 | Variance estimation |
| Speed Ratio | < 8x | Performance vs MATLAB |
## Performance
The Python implementation achieves O(N + M log M) complexity using custom NUFFT:
| Dataset Size | MATLAB (s) | Python (s) | Speedup |
|---|---|---|---|
| N = 500 | 0.17 | 0.13 | 1.3x faster |
| N = 2,000 | 0.32 | 0.28 | 1.1x faster |
| N = 10,000 | 1.05 | 1.12 | ~same |
| N = 50,000 | 4.2 | 5.1 | 0.8x (slightly slower) |
Performance Notes:
- 1D tests are often faster than MATLAB
- 2D/3D tests are 3-8x slower due to Python overhead
- All tests complete within the 8x speed threshold
- Use `flag_power2=False` for 3D to reduce memory usage
## API Reference

### Main Functions

#### `cv_fastlpr(x, y, hlist, opt)`

Local polynomial regression with cross-validation bandwidth selection.

Parameters:
- `x`: Input locations (N x d array)
- `y`: Response values (length-N or N x 1 array; may be complex)
- `hlist`: Bandwidth candidates (array or list)
- `opt`: Options dictionary
Options:
- `order`: Polynomial order (0 = Nadaraya-Watson, 1 = local linear, 2 = local quadratic)
- `dstd`: 1-SE rule multiplier (0 = minimum GCV, 1 = 1-SE rule)
- `calc_dof`: Whether to calculate degrees of freedom (default: True)
- `y_type_out`: Output type ('mean' or 'variance')
- `verbose`: Print progress (default: False)
Returns: `RegressionOutput` object with:
- `yhat`: Fitted values at data points
- `bandwidth`: Selected bandwidth
- `cv_result`: Cross-validation results (scores, stderr, etc.)
- `interpolant`: Callable for prediction at new points
- `grid`: Evaluation grid
- `dof`, `dof_stderr`: Degrees of freedom
#### `cv_fastkde(x, hlist, opt)`

Kernel density estimation with LCV bandwidth selection.

Parameters:
- `x`: Data points (N x d array)
- `hlist`: Bandwidth candidates
- `opt`: Options dictionary
Options:
- `flag_power2`: Use power-of-2 grid sizes (default: True; set False for 3D)
Returns: `KDEOutput` object with:
- `fhat`: Density estimate on grid
- `bandwidth`: Selected bandwidth
- `lcv_result`: LCV scores and selection
- `interpolant`: Callable for density at new points
- `grid`: Evaluation grid
#### `get_hlist(n, range, dims=1)`

Generate logarithmically-spaced bandwidth candidates.

Parameters:
- `n`: Number of candidates (int for 1D, list for multi-D)
- `range`: Bandwidth range [min, max] or [[min1, max1], [min2, max2], ...]
- `dims`: Number of dimensions (default: 1)
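For intuition, candidates of the kind `get_hlist` returns can be mimicked with `np.geomspace`; this is an illustrative stand-in, not the package's actual spacing rule:

```python
import numpy as np

# Hypothetical stand-in for get_hlist(20, [0.02, 0.5]):
# 20 log-spaced bandwidth candidates in [0.02, 0.5].
hlist = np.geomspace(0.02, 0.5, 20)
print(hlist[0], hlist[-1])  # endpoints are hit exactly

# Per-dimension candidates, mimicking get_hlist([10, 10], [[0.1, 1.0], [0.1, 1.0]]):
hlist_2d = [np.geomspace(lo, hi, n)
            for n, (lo, hi) in zip([10, 10], [[0.1, 1.0], [0.1, 1.0]])]
```

Log spacing is the natural choice here because GCV/LCV curves vary on a multiplicative bandwidth scale: the ratio between consecutive candidates is constant, so small bandwidths are sampled as densely (relatively) as large ones.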
## File Structure

```
fastLPR_py/
├── src/fastlpr/           # Core package
│   ├── __init__.py        # Package entry
│   ├── api.py             # cv_fastlpr, cv_fastkde
│   ├── regression.py      # Regression implementation
│   ├── bandwidth.py       # get_hlist
│   ├── nufft.py           # Custom NUFFT
│   ├── structures.py      # RegressionOutput, KDEOutput
│   ├── plotting.py        # Visualization
│   └── ...                # (other modules)
├── examples/              # Paper examples (JSS submission)
│   ├── example_fig2_fastkde.py
│   ├── example_fig3_boundary_comparison.py
│   ├── example_fig4_complex.py
│   ├── example_fig5_heteroscedasticity.py
│   ├── example_fig6_applications.py
│   └── reproduce_all_figures.py
├── tests/                 # Test suite
│   ├── test_algorithm.py  # Core algorithm tests
│   ├── test_nufft.py      # NUFFT accuracy tests
│   └── ...
├── README.md              # This file
├── LICENSE                # GPL-3.0
└── pyproject.toml         # Package configuration
```
## Testing

### Run All Tests

```bash
cd fastLPR_py

# Run pytest test suite
pytest tests/ -v --tb=short

# Run MATLAB cross-validation verification
python tests/verify_python.py
```
### Expected Output

```
================================================================================
STANDARDIZED PYTHON vs MATLAB CROSS-VALIDATION TEST
================================================================================
>>> 1D KDE <<<
  n=500, dx=1, order=0, type=mean
  Py Time: 0.020 sec, MATLAB: 0.050 sec (ratio: 0.40x)
  STATUS: PASS
>>> 2D KDE <<<
...
SUMMARY: 10 PASS, 0 FAIL, 0 ERROR, 0 SKIP (Total: 10)
*** ALL TESTS PASSED ***
```
## Known Limitations

- 3D KDE Memory: Use `flag_power2=False` for 3D to avoid memory issues
- Performance: 2D/3D operations are 3-8x slower than MATLAB due to Python overhead
- RNG Differences: Python and MATLAB RNGs produce different random numbers; use fixed seeds for reproducibility
- Grid Discretization: Small numerical differences (< 1e-3) due to grid-based methods
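The 3D memory point can be ballparked: rounding a grid up to the next power of two per axis (e.g. 100 → 128) roughly doubles the cell count of one complex-valued 3D grid, and a KDE run may hold several such grids. The sizes below are illustrative assumptions, not the grids fastLPR actually allocates:

```python
import numpy as np

# Rough memory for a single complex-valued 3D grid (illustrative sizes only).
for n in (100, 128, 256):
    mib = n ** 3 * np.dtype(np.complex128).itemsize / 2 ** 20
    print(f"{n}^3 complex128 grid: {mib:.1f} MiB")
```

With `flag_power2=False` the grid can stay near the requested size instead of jumping to the next power of two, which is why it is the recommended setting in 3D.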
## Differences from MATLAB

### Random Number Generation

- MATLAB: `rng(42)` seeds a Mersenne Twister (MT19937) generator
- Python: `np.random.seed(42)` seeds NumPy's legacy global MT19937 generator (the newer `np.random.default_rng` defaults to PCG64); even with the same seed, the streams differ from MATLAB's
- Impact: Stochastic bandwidth selection may differ between implementations (the algorithm itself is correct)
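The mismatch is easy to see in NumPy alone; a minimal sketch (pure NumPy, no fastLPR):

```python
import numpy as np

np.random.seed(42)               # seeds the legacy global MT19937 state
legacy = np.random.rand(3)

rng = np.random.default_rng(42)  # modern Generator API (PCG64 by default)
modern = rng.random(3)

print("legacy:", legacy)
print("modern:", modern)
# Same seed, different generators/streams -> different draws, and neither
# matches MATLAB's rng(42). Fix seeds per implementation; compare estimates,
# not raw random numbers.
```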
### Performance

- 1D: Python is often faster than MATLAB
- 2D/3D: Python is 3-8x slower (pure-Python overhead)
- Scaling: both implementations are O(N + M log M)
### Numerical Precision

- Maximum absolute error: < 0.05 across all tests
- Mean prediction error: < 0.03
- Bandwidth selection: exact match for most tests
## Examples

All examples are in `examples/` and follow JSS publication standards:

```bash
cd examples
python example_fig2_fastkde.py              # KDE (1D, 2D, 3D)
python example_fig3_boundary_comparison.py  # NW vs LL vs LQ
python example_fig4_complex.py              # Complex-valued regression
python example_fig5_heteroscedasticity.py   # Heteroscedastic regression
python reproduce_all_figures.py             # Run all examples
```
## Citation

If you use fastLPR in your research, please cite:

```bibtex
@article{wang2025fastlpr,
  title   = {fastLPR: Fast Local Polynomial Regression via NUFFT in MATLAB, Python, and R},
  author  = {Wang, Ying and Li, Min and Paz-Linares, Deirel and Valdes-Sosa, Pedro A.},
  journal = {Journal of Statistical Software},
  year    = {2025},
  note    = {In preparation}
}
```
## License
GPL-3.0 License. See LICENSE for details.
## Authors
- Ying Wang (yingwangrigel@gmail.com)
- Min Li (minli.231314@gmail.com)
- Deirel Paz-Linares
- Pedro A. Valdes-Sosa
## Links
- GitHub (source): https://github.com/rigelfalcon/fastLPR
- PyPI: release forthcoming
- CRAN (R): release forthcoming
- Paper: Journal of Statistical Software (under review)
## Troubleshooting

### Common Issues

**ImportError: cannot import name 'cv_fastlpr'**

```bash
pip install -e .  # Reinstall package
```

**FFT performance warning**

```bash
conda install numpy scipy mkl  # Install optimized FFT
```

**Memory error on large 3D datasets**

```python
opt = {'flag_power2': False}  # Reduce grid size
kde = cv_fastkde(x, hlist, opt)
```

**Figure not displaying**

```python
import matplotlib
matplotlib.use('TkAgg')  # Set backend before importing pyplot
```