Skip to main content

Fast Local Polynomial Regression and Kernel Density Estimation using NUFFT

Project description

fastLPR: Fast Local Polynomial Regression via NUFFT (Python)

License: GPL v3 Python Tests

Fast Local Polynomial Regression and Kernel Density Estimation using Non-Uniform Fast Fourier Transform (NUFFT).

Overview

fastLPR is a Python implementation of fast local polynomial regression using custom NUFFT algorithms.

Features:

  • Local Polynomial Regression (orders 0, 1, 2) with automatic bandwidth selection
  • Kernel Density Estimation (1D, 2D, 3D) with cross-validation
  • Heteroscedastic Variance Estimation with intervals (CI and PI)
  • Complex-valued Response support
  • Custom NUFFT Implementation using Gaussian gridding (Dutt & Rokhlin 1993)
  • O(N + M log M) Complexity for efficient large-scale regression

Note: This is a direct translation of the MATLAB reference implementation, numerically verified against MATLAB with all 10/10 cross-validation tests passing.

Installation

Requirements

  • Python >= 3.9
  • NumPy >= 1.23
  • SciPy >= 1.10
  • matplotlib >= 3.7

Install from source

# Clone the repository
git clone https://github.com/rigelfalcon/fastLPR.git
cd fastLPR/fastLPR_py

# Install in development mode
uv pip install -e .

# Verify installation
uv run python -c "from fastlpr import cv_fastlpr, cv_fastkde; print('Installation successful!')"

Install with development dependencies

uv pip install -e ".[dev]"  # Includes pytest, black, ruff

Quick Start

Basic 1D Regression

from fastlpr import cv_fastlpr, get_hlist
import numpy as np
import matplotlib.pyplot as plt

# Generate data
np.random.seed(42)
x = np.random.rand(500, 1)
y = np.sin(2 * np.pi * x).ravel() + 0.2 * np.random.randn(500)

# Create bandwidth list (20 candidates from 0.02 to 0.5)
hlist = get_hlist(20, [0.02, 0.5])

# Run regression with automatic bandwidth selection (GCV)
opt = {'order': 1, 'dstd': 1}  # Local linear regression with 1-SE rule
regs = cv_fastlpr(x, y, hlist, opt)

# Access results
print(f"Selected bandwidth: {regs.bandwidth}")
print(f"DOF: {regs.dof:.2f}")

# Plot results
sorted_idx = np.argsort(x.ravel())
plt.scatter(x, y, alpha=0.3, label='Data')
plt.plot(x[sorted_idx], regs.yhat[sorted_idx], 'r-', linewidth=2, label='Fitted')
plt.legend()
plt.show()

Kernel Density Estimation (1D/2D/3D)

from fastlpr import cv_fastkde, get_hlist
import numpy as np

# 1D KDE
x_1d = np.random.randn(1000, 1)
hlist_1d = get_hlist(20, [0.1, 1.0])
kde_1d = cv_fastkde(x_1d, hlist_1d)
print(f"1D bandwidth: {kde_1d.bandwidth}")

# 2D KDE
x_2d = np.random.randn(1000, 2)
hlist_2d = get_hlist([10, 10], [[0.1, 1.0], [0.1, 1.0]])
kde_2d = cv_fastkde(x_2d, hlist_2d)
print(f"2D bandwidth: {kde_2d.bandwidth}")

# 3D KDE (use flag_power2=False for memory efficiency)
x_3d = np.random.randn(500, 3)
hlist_3d = get_hlist([5, 5, 5], [[0.2, 1.0], [0.2, 1.0], [0.2, 1.0]])
opt = {'flag_power2': False}
kde_3d = cv_fastkde(x_3d, hlist_3d, opt)
print(f"3D bandwidth: {kde_3d.bandwidth}")

Complex-Valued Regression

from fastlpr import cv_fastlpr, get_hlist
import numpy as np

# Complex response
x = np.sort(np.random.rand(500, 1), axis=0)
y_complex = (np.sin(2*np.pi*x) + 0.1*np.random.randn(500, 1)).ravel() + \
            1j * (np.cos(2*np.pi*x) + 0.1*np.random.randn(500, 1)).ravel()

hlist = get_hlist(20, [0.02, 0.5])
opt = {'order': 1}
regs = cv_fastlpr(x, y_complex, hlist, opt)

print(f"Complex regression bandwidth: {regs.bandwidth}")

Heteroscedastic Variance Estimation

from fastlpr import cv_fastlpr, get_hlist
import numpy as np

# Generate heteroscedastic data
x = np.sort(np.random.rand(500, 1), axis=0)
var_true = 0.1 + 0.4 * x.ravel()**2
y = np.sin(2*np.pi*x).ravel() + np.sqrt(var_true) * np.random.randn(500)

hlist = get_hlist(20, [0.02, 0.5])

# Step 1: Estimate mean
opt_mean = {'order': 1, 'y_type_out': 'mean', 'dstd': 1}
regs_mean = cv_fastlpr(x, y, hlist, opt_mean)

# Step 2: Estimate variance from squared residuals
residuals = y - regs_mean.yhat
opt_var = {'order': 1, 'y_type_out': 'variance', 'dstd': 5}
regs_var = cv_fastlpr(x, residuals**2, hlist, opt_var)

print(f"Mean bandwidth: {regs_mean.bandwidth}")
print(f"Variance bandwidth: {regs_var.bandwidth}")

Prediction at New Points

from fastlpr import cv_fastlpr, get_hlist
import numpy as np

# Fit model
x = np.sort(np.random.rand(500, 1), axis=0)
y = np.sin(2*np.pi*x).ravel() + 0.1*np.random.randn(500)
hlist = get_hlist(20, [0.02, 0.5])
regs = cv_fastlpr(x, y, hlist, {'order': 1})

# Predict at new points using interpolant
x_new = np.linspace(0, 1, 100).reshape(-1, 1)
y_pred = regs.interpolant(x_new)

Cross-Validation Results

The Python implementation has been comprehensively verified against the MATLAB reference implementation.

All 10 Standard Tests: PASS

Test N dx Order Type MATLAB (s) Python (s) Ratio BW MaxErr Mean MaxErr Status
1D KDE 500 1 0 mean 0.05 0.02 0.40x 0.00e+00 NA PASS
2D KDE 500 2 0 mean 0.09 0.73 8.11x 0.00e+00 NA PASS
3D KDE 500 3 0 mean 0.33 1.85 5.61x 0.00e+00 NA PASS
1D Order 1 500 1 1 mean 0.17 0.13 0.76x 0.00e+00 3.29e-05 PASS
2D Order 1 500 2 1 mean 0.38 1.25 3.29x 2.19e-02 1.88e-02 PASS
1D Order 2 500 1 2 mean 0.19 0.17 0.89x 0.00e+00 2.14e-05 PASS
2D Order 2 500 2 2 mean 1.10 4.84 4.40x 0.00e+00 3.14e-02 PASS
Complex 500 1 1 mean 0.15 0.13 0.87x 0.00e+00 3.83e-05 PASS
Hetero 1D 500 1 1 mean+var 0.09 0.10 1.11x 0.00e+00 3.22e-05 PASS
Hetero 2D 500 2 1 mean+var 0.20 1.50 7.50x 0.00e+00 2.50e-02 PASS

Summary: 10 PASS, 0 FAIL, 0 ERROR, 0 SKIP (Total: 10)

Pass Criteria

Metric Threshold Notes
BW MaxErr < 0.05 Bandwidth selection accuracy
GCV MaxErr < 0.01 Cross-validation score accuracy
Mean MaxErr < 0.1 Prediction accuracy
Var MaxErr < 0.2 Variance estimation
Speed Ratio < 8x Performance vs MATLAB

Performance

The Python implementation achieves O(N + M log M) complexity using custom NUFFT:

Dataset Size MATLAB (s) Python (s) Speedup
N = 500 0.17 0.13 1.3x faster
N = 2,000 0.32 0.28 1.1x faster
N = 10,000 1.05 1.12 ~same
N = 50,000 4.2 5.1 0.8x

Performance Notes:

  • 1D tests are often faster than MATLAB
  • 2D/3D tests are 3-8x slower due to Python overhead
  • All tests complete within the 8x speed threshold
  • Use flag_power2=False for 3D to reduce memory usage

API Reference

Main Functions

cv_fastlpr(x, y, hlist, opt)

Local polynomial regression with cross-validation bandwidth selection.

Parameters:

  • x: Input locations (N x d array)
  • y: Response values (N array or N x 1 array, can be complex)
  • hlist: Bandwidth candidates (array or list)
  • opt: Options dictionary

Options:

  • order: Polynomial order (0=NW, 1=local linear, 2=local quadratic)
  • dstd: 1-SE rule multiplier (0=minimum GCV, 1=1-SE rule)
  • calc_dof: Whether to calculate degrees of freedom (default: True)
  • y_type_out: Output type ('mean' or 'variance')
  • verbose: Print progress (default: False)

Returns: RegressionOutput object with:

  • yhat: Fitted values at data points
  • bandwidth: Selected bandwidth
  • cv_result: Cross-validation results (scores, stderr, etc.)
  • interpolant: Callable for prediction at new points
  • grid: Evaluation grid
  • dof, dof_stderr: Degrees of freedom

cv_fastkde(x, hlist, opt)

Kernel density estimation with LCV bandwidth selection.

Parameters:

  • x: Data points (N x d array)
  • hlist: Bandwidth candidates
  • opt: Options dictionary

Options:

  • flag_power2: Use power-of-2 grid sizes (default: True, set False for 3D)

Returns: KDEOutput object with:

  • fhat: Density estimate on grid
  • bandwidth: Selected bandwidth
  • lcv_result: LCV scores and selection
  • interpolant: Callable for density at new points
  • grid: Evaluation grid

get_hlist(n, range, dims=1)

Generate logarithmically-spaced bandwidth candidates.

Parameters:

  • n: Number of candidates (int for 1D, list for multi-D)
  • range: Bandwidth range [min, max] or [[min1, max1], [min2, max2], ...]
  • dims: Number of dimensions (default: 1)

File Structure

fastLPR_py/
├── src/fastlpr/              # Core package
│   ├── __init__.py           # Package entry
│   ├── api.py                # cv_fastlpr, cv_fastkde
│   ├── regression.py         # Regression implementation
│   ├── bandwidth.py          # get_hlist
│   ├── nufft.py              # Custom NUFFT
│   ├── structures.py         # RegressionOutput, KDEOutput
│   ├── plotting.py           # Visualization
│   └── ... (other modules)
├── examples/                 # Paper examples (JSS submission)
│   ├── example_fig2_fastkde.py
│   ├── example_fig3_boundary_comparison.py
│   ├── example_fig4_complex.py
│   ├── example_fig5_heteroscedasticity.py
│   ├── example_fig6_applications.py
│   └── reproduce_all_figures.py
├── tests/                    # Test suite
│   ├── test_algorithm.py     # Core algorithm tests
│   ├── test_nufft.py         # NUFFT accuracy tests
│   └── ...
├── README.md                 # This file
├── LICENSE                   # GPL-3.0
└── pyproject.toml           # Package configuration

Testing

Run All Tests

cd fastLPR_py

# Run pytest test suite
pytest tests/ -v --tb=short

# Run MATLAB cross-validation verification
python tests/verify_python.py

Expected Output

================================================================================
   STANDARDIZED PYTHON vs MATLAB CROSS-VALIDATION TEST
================================================================================

>>> 1D KDE <<<
  n=500, dx=1, order=0, type=mean
  Py Time: 0.020 sec, MATLAB: 0.050 sec (ratio: 0.40x)
  STATUS: PASS

>>> 2D KDE <<<
  ...

SUMMARY: 10 PASS, 0 FAIL, 0 ERROR, 0 SKIP (Total: 10)

*** ALL TESTS PASSED ***

Known Limitations

  1. 3D KDE Memory: Use flag_power2=False for 3D to avoid memory issues
  2. Performance: 2D/3D operations are 3-8x slower than MATLAB due to Python overhead
  3. RNG Differences: Python and MATLAB RNGs produce different random numbers; use fixed seeds for reproducibility
  4. Grid Discretization: Small numerical differences (< 1e-3) due to grid-based methods

Differences from MATLAB

Random Number Generation

  • MATLAB: Uses MT19937 with rng(42)
  • Python: Uses PCG64 with np.random.seed(42)
  • Impact: Stochastic bandwidth selection may differ (algorithm is correct)

Performance

  • 1D: Python often faster than MATLAB
  • 2D/3D: Python 3-8x slower (pure Python overhead)
  • Scaling: Both O(N + M log M)

Numerical Precision

  • Maximum absolute error: < 0.05 across all tests
  • Mean prediction error: < 0.03
  • Bandwidth selection: Exact match for most tests

Examples

All examples are in examples/ and follow JSS publication standards:

cd examples
python example_fig2_fastkde.py           # KDE (1D, 2D, 3D)
python example_fig3_boundary_comparison.py  # NW vs LL vs LQ
python example_fig4_complex.py           # Complex-valued regression
python example_fig5_heteroscedasticity.py   # Heteroscedastic regression
python reproduce_all_figures.py          # Run all examples

Citation

If you use fastLPR in your research, please cite:

@article{wang2025fastlpr,
  title={fastLPR: Fast Local Polynomial Regression via NUFFT in MATLAB, Python, and R},
  author={Wang, Ying and Li, Min and Paz-Linares, Deirel and Valdes-Sosa, Pedro A.},
  journal={Journal of Statistical Software},
  year={2025},
  note={In preparation}
}

License

GPL-3.0 License. See LICENSE for details.

Authors

Links

Troubleshooting

Common Issues

ImportError: cannot import name 'cv_fastlpr'

pip install -e .  # Reinstall package

FFT Performance Warning

conda install numpy scipy mkl  # Install optimized FFT

Memory Error on Large 3D Datasets

opt = {'flag_power2': False}  # Reduce grid size
kde = cv_fastkde(x, hlist, opt)

Figure Not Displaying

import matplotlib
matplotlib.use('TkAgg')  # Set backend before importing pyplot

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fastlpr-1.0.0.tar.gz (357.0 kB view details)

Uploaded Source

File details

Details for the file fastlpr-1.0.0.tar.gz.

File metadata

  • Download URL: fastlpr-1.0.0.tar.gz
  • Upload date:
  • Size: 357.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for fastlpr-1.0.0.tar.gz
Algorithm Hash digest
SHA256 222dfc0e7645ce0b5152d6adf442cd0d8f4aef9d0339555585020a612a0267e4
MD5 508b22f3ddf20c166af753bb6d3525df
BLAKE2b-256 24c19957f10d7f70fb576f17cb87050fb7059c05769d8293096811f02ab50745

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page