Implementation of PSplines in Python

PSplines - Penalized B-Spline Smoothing for Python

PSplines is a high-performance Python library for univariate penalized B-spline (P-spline) smoothing, implementing the methods described in Eilers & Marx (2021). It provides efficient sparse-matrix implementations with analytical uncertainty quantification, parametric bootstrap, and Bayesian inference capabilities.

Key Features

  • Fast Sparse Implementation: Uses SciPy sparse matrices and optimized solvers
  • Multiple Uncertainty Methods: Analytical (delta method), bootstrap, and Bayesian approaches
  • Flexible Configuration: Customizable basis functions, penalty orders, and constraints
  • Derivative Computation: Efficient computation of spline derivatives with uncertainty
  • Automatic Parameter Selection: Cross-validation, AIC, L-curve, and V-curve methods
  • Boundary Constraints: Support for derivative boundary conditions
  • Comprehensive Validation: Extensive input validation and error handling

Installation

Using pip

pip install psplines

Using uv (recommended for development)

git clone https://github.com/graysonbellamy/psplines.git
cd psplines
uv sync

From source

git clone https://github.com/graysonbellamy/psplines.git
cd psplines
pip install -e .

Quick Start

import numpy as np
import matplotlib.pyplot as plt
from psplines import PSpline

# Generate sample data
np.random.seed(42)
x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x) + 0.1 * np.random.randn(100)

# Create and fit P-spline
spline = PSpline(x, y, nseg=20, lambda_=1.0)
spline.fit()

# Make predictions with uncertainty
x_new = np.linspace(0, 2*np.pi, 200)
y_pred, se = spline.predict(x_new, return_se=True)

# Plot results
plt.figure(figsize=(10, 6))
plt.scatter(x, y, alpha=0.5, label='Data')
plt.plot(x_new, y_pred, 'r-', label='P-spline fit')
plt.fill_between(x_new, y_pred - 1.96*se, y_pred + 1.96*se, 
                 alpha=0.3, label='95% CI')
plt.legend()
plt.show()

Core API

PSpline Class

The main class for penalized B-spline fitting:

spline = PSpline(
    x,                    # Input points (array-like)
    y,                    # Response values (array-like)  
    nseg=20,              # Number of B-spline segments
    degree=3,             # B-spline degree
    lambda_=10.0,         # Smoothing parameter
    penalty_order=2,      # Order of difference penalty
    constraints=None      # Boundary constraints (dict)
)

Key Methods

Fitting

# Basic fitting
spline.fit()

# With custom domain
spline.fit(xl=0, xr=10)

Prediction

# Basic prediction  
y_pred = spline.predict(x_new)

# With analytical standard errors
y_pred, se = spline.predict(x_new, return_se=True)

# With bootstrap standard errors
y_pred, se = spline.predict(x_new, return_se=True, se_method="bootstrap", n_boot=1000)
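
To make the idea behind bootstrap standard errors concrete, here is a minimal NumPy-only sketch of a parametric bootstrap for a generic linear smoother. It does not call PSplines and is not necessarily what se_method="bootstrap" does internally. The smoother used here (identity basis with a second-order difference penalty, i.e. a Whittaker smoother) and the names H, lam, and n_boot are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0, 2 * np.pi, n)
y = np.sin(x) + 0.1 * rng.standard_normal(n)

# Hat matrix of a simple linear smoother (identity basis, second-order
# difference penalty). This is a stand-in for a P-spline hat matrix.
lam = 10.0
D = np.diff(np.eye(n), n=2, axis=0)
H = np.linalg.inv(np.eye(n) + lam * D.T @ D)

fitted = H @ y
ed = np.trace(H)  # effective dimension of the smoother
sigma = np.sqrt(np.sum((y - fitted) ** 2) / (n - ed))

# Parametric bootstrap: simulate responses around the fit, re-smooth each one
n_boot = 2000
y_star = fitted + sigma * rng.standard_normal((n_boot, n))
boot_fits = y_star @ H.T  # each row is one re-smoothed replicate
se_boot = boot_fits.std(axis=0, ddof=1)

# Closed-form counterpart for a linear smoother: Var(H y*) = sigma^2 H H^T
se_exact = sigma * np.sqrt(np.einsum("ij,ij->i", H, H))
```

With enough replicates, se_boot should approach se_exact, which is why an analytical method is usually the cheaper choice when the smoother is linear and the noise is assumed Gaussian.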

Derivatives

# First derivative
dy_dx = spline.derivative(x_new, deriv_order=1)

# Second derivative with uncertainty
d2y_dx2, se = spline.derivative(x_new, deriv_order=2, return_se=True)
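
For background, the derivative of a spline on equally spaced knots can be written with a basis of one degree lower and first differences of the coefficients divided by the knot spacing. The sketch below checks that identity numerically with NumPy only. The helper bspline_basis is a hypothetical illustration (built from differences of truncated power functions), not a function exported by PSplines, and the library may compute derivatives differently.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, degree):
    """Uniform B-spline basis from differences of truncated power functions."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-degree, nseg + degree + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    D = np.diff(np.eye(knots.size), n=degree + 1, axis=0)
    D /= math.factorial(degree) * dx ** degree
    return (-1) ** (degree + 1) * tp @ D.T

xl, xr, nseg, degree = 0.0, 1.0, 10, 3
dx = (xr - xl) / nseg
rng = np.random.default_rng(1)
alpha = rng.standard_normal(nseg + degree)  # arbitrary coefficients

x = np.linspace(0.05, 0.95, 50)

# Derivative: lower-degree basis times differenced coefficients over dx
deriv = bspline_basis(x, xl, xr, nseg, degree - 1) @ np.diff(alpha) / dx

# Central finite-difference check on the spline itself
def spline(t):
    return bspline_basis(t, xl, xr, nseg, degree) @ alpha

eps = 1e-5
fd = (spline(x + eps) - spline(x - eps)) / (2 * eps)
print("max abs difference:", np.max(np.abs(deriv - fd)))
```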

Bayesian Inference

# Standard Bayesian P-spline (single λ, §3.5)
trace = spline.bayes_fit(draws=2000, tune=1000)

# Adaptive Bayesian P-spline (per-difference λ_j, §8.8)
trace = spline.bayes_fit(draws=2000, tune=1000, adaptive=True)

# Get posterior credible intervals (works with either mode)
mean, lower, upper = spline.predict(x_new, se_method="bayes", hdi_prob=0.95)

Parameter Selection

PSplines provides several methods for automatic smoothing parameter selection:

from psplines.optimize import cross_validation, aic, l_curve

# Cross-validation (recommended)
best_lambda, cv_score = cross_validation(spline)

# Akaike Information Criterion
best_lambda, aic_score = aic(spline)

# L-curve method
best_lambda, curvature = l_curve(spline)

# Use optimal parameter
spline.lambda_ = best_lambda
spline.fit()
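
As a rough picture of what cross-validation based selection involves, the following NumPy-only sketch scans a grid of smoothing parameters and scores each one with the leave-one-out shortcut for linear smoothers, r_i / (1 - h_ii). It does not use psplines.optimize. The helpers bspline_basis and loo_cv_score and the grid of lambdas are assumptions made for illustration, and the library's own search strategy may differ.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, degree=3):
    """Uniform B-spline basis from differences of truncated power functions."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-degree, nseg + degree + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    D = np.diff(np.eye(knots.size), n=degree + 1, axis=0)
    D /= math.factorial(degree) * dx ** degree
    return (-1) ** (degree + 1) * tp @ D.T

def loo_cv_score(B, y, lam, penalty_order=2):
    """Root mean squared leave-one-out residual for one value of lambda."""
    D = np.diff(np.eye(B.shape[1]), n=penalty_order, axis=0)
    A = B.T @ B + lam * D.T @ D
    alpha = np.linalg.solve(A, B.T @ y)
    # Diagonal of the hat matrix: h_ii = b_i^T A^{-1} b_i
    h = np.einsum("ij,ji->i", B, np.linalg.solve(A, B.T))
    r = (y - B @ alpha) / (1.0 - h)
    return np.sqrt(np.mean(r ** 2))

rng = np.random.default_rng(42)
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
B = bspline_basis(x, x.min(), x.max(), nseg=20)

lambdas = np.logspace(-3, 3, 13)
scores = np.array([loo_cv_score(B, y, lam) for lam in lambdas])
best_lambda = lambdas[np.argmin(scores)]
print("best lambda on this grid:", best_lambda)
```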

Advanced Usage

Boundary Constraints

Enforce derivative constraints at boundaries:

# Zero first derivative (flat slope) at both boundaries
constraints = {
    "deriv": {
        "order": 1,
        "initial": 0,
        "final": 0
    }
}

spline = PSpline(x, y, constraints=constraints)
spline.fit()

Different Penalty Orders

  • penalty_order=1: Penalizes first differences of the coefficients (roughness penalty on slopes)
  • penalty_order=2: Penalizes second differences (roughness penalty on curvature)
  • penalty_order=3: Penalizes third differences (roughness penalty on the rate of change of curvature)
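
The effect of each order is easiest to see on the difference matrices themselves. The NumPy sketch below builds them with np.diff and shows which coefficient sequences go unpenalized: constants for order 1, straight lines for order 2, and quadratics for order 3. The variable names and the coefficient count n are illustrative assumptions, not part of the PSplines API.

```python
import numpy as np

n = 8  # number of B-spline coefficients (illustrative)

# Difference matrices of order 1, 2, 3; D_p has shape (n - p, n)
D1, D2, D3 = (np.diff(np.eye(n), n=p, axis=0) for p in (1, 2, 3))

j = np.arange(n, dtype=float)
sequences = {
    "constant": np.full(n, 3.0),
    "linear": 2.0 + 0.5 * j,
    "quadratic": 1.0 - j + 0.25 * j ** 2,
}

def penalty(D, alpha):
    """Roughness penalty ||D alpha||^2, without the lambda factor."""
    return float(np.sum((D @ alpha) ** 2))

for name, alpha in sequences.items():
    values = [penalty(D, alpha) for D in (D1, D2, D3)]
    print(name, values)
```

A practical consequence is that a very large lambda_ does not flatten the fit to zero; it pulls the coefficients toward a polynomial of degree penalty_order - 1.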

Custom Smoothing

# Very smooth fit
smooth_spline = PSpline(x, y, lambda_=1000)

# More flexible fit  
flexible_spline = PSpline(x, y, lambda_=0.1)

# High-degree spline
high_deg_spline = PSpline(x, y, degree=5, nseg=30)

Performance Tips

  1. Sparse Operations: The library automatically uses sparse matrices for efficiency
  2. Vectorized Predictions: Predict on multiple points simultaneously
  3. Choosing nseg: 10 to 50 segments generally work well; a much larger nseg combined with a small lambda_ can overfit
  4. Bootstrap Parallelization: Use n_jobs=-1 for parallel bootstrap computation

# Efficient batch prediction
y_pred = spline.predict(large_x_array)

# Parallel bootstrap
y_pred, se = spline.predict(x_new, return_se=True, se_method="bootstrap", 
                           n_boot=5000, n_jobs=-1)

Mathematical Background

PSplines combine B-spline basis functions with discrete difference penalties:

  • Basis: B-splines of degree d on nseg equally spaced segments
  • Penalty: Discrete differences of order p (typically 1, 2, or 3)
  • Objective: Minimize ||y - Bα||² + λ||D_p α||² where B is the basis matrix and D_p is the difference matrix
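
The objective above has a closed-form minimizer, α = (BᵀB + λ D_pᵀD_p)⁻¹ Bᵀy. The following dense NumPy sketch solves it directly on the Quick Start data. It is a teaching aid under stated assumptions, not the library's implementation (which, per the feature list, uses sparse matrices); bspline_basis and pspline_coefficients are hypothetical helper names.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, degree=3):
    """Uniform B-spline basis from differences of truncated power functions."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-degree, nseg + degree + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    D = np.diff(np.eye(knots.size), n=degree + 1, axis=0)
    D /= math.factorial(degree) * dx ** degree
    return (-1) ** (degree + 1) * tp @ D.T

def pspline_coefficients(x, y, nseg=20, degree=3, lam=1.0, penalty_order=2):
    """Minimize ||y - B a||^2 + lam * ||D_p a||^2 via the normal equations."""
    B = bspline_basis(x, x.min(), x.max(), nseg, degree)
    D = np.diff(np.eye(B.shape[1]), n=penalty_order, axis=0)
    alpha = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B, alpha

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

B, alpha = pspline_coefficients(x, y)
fitted = B @ alpha
rmse = np.sqrt(np.mean((fitted - np.sin(x)) ** 2))
print("basis shape:", B.shape, "RMSE against sin(x):", rmse)
```

The basis has nseg + degree columns, and each row sums to one inside the domain, so the coefficients live on the same scale as the data.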

The library implements efficient algorithms for:

  • Sparse matrix operations via SciPy
  • Analytical standard errors via the delta method
  • Effective degrees of freedom computation
  • Cross-validation and information criteria
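
As one illustration of the quantities listed above, the effective degrees of freedom of a penalized fit is the trace of the hat matrix, which can be computed from small coefficient-sized matrices as trace((BᵀB + λ DᵀD)⁻¹ BᵀB). The NumPy sketch below shows how it moves between the number of basis functions (λ near 0) and the penalty order (λ very large). The helper names are hypothetical and the code is a sketch, not the routine PSplines uses.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, degree=3):
    """Uniform B-spline basis from differences of truncated power functions."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-degree, nseg + degree + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    D = np.diff(np.eye(knots.size), n=degree + 1, axis=0)
    D /= math.factorial(degree) * dx ** degree
    return (-1) ** (degree + 1) * tp @ D.T

def effective_df(B, lam, penalty_order=2):
    """trace(H) computed without forming the n-by-n hat matrix."""
    D = np.diff(np.eye(B.shape[1]), n=penalty_order, axis=0)
    BtB = B.T @ B
    return float(np.trace(np.linalg.solve(BtB + lam * D.T @ D, BtB)))

x = np.linspace(0, 2 * np.pi, 100)
B = bspline_basis(x, x.min(), x.max(), nseg=20)

lambdas = np.array([1e-10, 1e-2, 1.0, 1e2, 1e4, 1e8])
eds = np.array([effective_df(B, lam) for lam in lambdas])
for lam, ed in zip(lambdas, eds):
    print(f"lambda={lam:g}  effective df={ed:.3f}")
```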

Requirements

  • Python >= 3.10
  • NumPy >= 1.21
  • SciPy >= 1.7
  • PyMC >= 5.0 (for Bayesian methods)
  • Matplotlib >= 3.4 (for plotting utilities)
  • PyTensor >= 2.0 (for Bayesian methods)
  • ArviZ >= 0.12 (for Bayesian diagnostics)
  • Joblib >= 1.0 (for parallel processing)

Examples

Complete examples are available in the examples/ directory:

  • Basic Usage: Core functionality demonstration
  • Parameter Selection: Automatic lambda optimization
  • Uncertainty Methods: Comparison of different uncertainty approaches
  • Real-World Application: Time series analysis workflow

Contributing

Contributions are welcome! Please see our Contributing Guide for details.

Development Setup

git clone https://github.com/graysonbellamy/psplines.git
cd psplines
uv sync --dev

Running Tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=psplines --cov-report=html

# Run specific test module  
uv run pytest tests/test_core.py -v

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use PSplines in your research, please cite:

@software{bellamy2024psplines,
  author = {Bellamy, Grayson},
  title = {PSplines: Penalized B-Spline Smoothing for Python},
  year = {2024},
  url = {https://github.com/graysonbellamy/psplines}
}

References

  • Eilers, P. H. C., & Marx, B. D. (2021). Practical Smoothing: The Joys of P-splines. Cambridge University Press.
  • de Boor, C. (2001). A Practical Guide to Splines. Springer-Verlag.

Changelog

Version 0.1.3

  • Fixed dead code bug in derivative method
  • Added comprehensive input validation
  • Optimized diagonal computation for uncertainty
  • Enhanced error messages and documentation
  • Added extensive test suite

Version 0.1.2

  • Initial release with core P-spline functionality
  • Basic fitting, prediction, and derivative computation
  • Bayesian inference capabilities

Questions or Issues? Please open an issue on GitHub.
