Skip to main content

Implementation of PSplines in Python

Project description

PSplines - Penalized B-Spline Smoothing for Python

Python Version License

PSplines is a high-performance Python library for univariate penalized B-spline (P-spline) smoothing, implementing the methods described in Eilers & Marx (2021). It provides efficient sparse-matrix implementations with analytical uncertainty quantification, parametric bootstrap, and Bayesian inference capabilities.

Key Features

  • Fast Sparse Implementation: Uses SciPy sparse matrices and optimized solvers
  • Multiple Uncertainty Methods: Analytical (delta method), bootstrap, and Bayesian approaches
  • Flexible Configuration: Customizable basis functions, penalty orders, and constraints
  • Derivative Computation: Efficient computation of spline derivatives with uncertainty
  • Automatic Parameter Selection: Cross-validation, AIC, L-curve, and V-curve methods
  • Boundary Constraints: Support for derivative boundary conditions
  • Comprehensive Validation: Extensive input validation and error handling

Installation

Using pip

pip install psplines

Using uv (recommended for development)

git clone https://github.com/graysonbellamy/psplines.git
cd psplines
uv sync

From source

git clone https://github.com/graysonbellamy/psplines.git
cd psplines
pip install -e .

Quick Start

import numpy as np
import matplotlib.pyplot as plt
from psplines import PSpline

# Generate sample data
np.random.seed(42)
x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x) + 0.1 * np.random.randn(100)

# Create and fit P-spline
spline = PSpline(x, y, nseg=20, lambda_=1.0)
spline.fit()

# Make predictions with uncertainty
x_new = np.linspace(0, 2*np.pi, 200)
y_pred, se = spline.predict(x_new, return_se=True)

# Plot results
plt.figure(figsize=(10, 6))
plt.scatter(x, y, alpha=0.5, label='Data')
plt.plot(x_new, y_pred, 'r-', label='P-spline fit')
plt.fill_between(x_new, y_pred - 1.96*se, y_pred + 1.96*se, 
                 alpha=0.3, label='95% CI')
plt.legend()
plt.show()

Core API

PSpline Class

The main class for penalized B-spline fitting:

spline = PSpline(
    x,                    # Input points (array-like)
    y,                    # Response values (array-like)  
    nseg=20,              # Number of B-spline segments
    degree=3,             # B-spline degree
    lambda_=10.0,         # Smoothing parameter
    penalty_order=2,      # Order of difference penalty
    constraints=None      # Boundary constraints (dict)
)

Key Methods

Fitting

# Basic fitting
spline.fit()

# With custom domain
spline.fit(xl=0, xr=10)

Prediction

# Basic prediction  
y_pred = spline.predict(x_new)

# With analytical standard errors
y_pred, se = spline.predict(x_new, return_se=True)

# With bootstrap standard errors
y_pred, se = spline.predict(x_new, return_se=True, se_method="bootstrap", n_boot=1000)

Derivatives

# First derivative
dy_dx = spline.derivative(x_new, deriv_order=1)

# Second derivative with uncertainty
d2y_dx2, se = spline.derivative(x_new, deriv_order=2, return_se=True)

Bayesian Inference

# Standard Bayesian P-spline (single λ, §3.5)
trace = spline.bayes_fit(draws=2000, tune=1000)

# Adaptive Bayesian P-spline (per-difference λ_j, §8.8)
trace = spline.bayes_fit(draws=2000, tune=1000, adaptive=True)

# Get posterior credible intervals (works with either mode)
mean, lower, upper = spline.predict(x_new, se_method="bayes", hdi_prob=0.95)

Parameter Selection

PSplines provides several methods for automatic smoothing parameter selection:

from psplines.optimize import cross_validation, aic, l_curve

# Cross-validation (recommended)
best_lambda, cv_score = cross_validation(spline)

# Akaike Information Criterion
best_lambda, aic_score = aic(spline)

# L-curve method
best_lambda, curvature = l_curve(spline)

# Use optimal parameter
spline.lambda_ = best_lambda
spline.fit()

Advanced Usage

Boundary Constraints

Enforce derivative constraints at boundaries:

# Zero first derivative at boundaries (natural spline)
constraints = {
    "deriv": {
        "order": 1,
        "initial": 0,
        "final": 0
    }
}

spline = PSpline(x, y, constraints=constraints)
spline.fit()

Different Penalty Orders

  • penalty_order=1: Penalizes differences (rough penalty on slopes)
  • penalty_order=2: Penalizes second differences (rough penalty on curvature)
  • penalty_order=3: Penalizes third differences (rough penalty on rate of curvature change)

Custom Smoothing

# Very smooth fit
smooth_spline = PSpline(x, y, lambda_=1000)

# More flexible fit  
flexible_spline = PSpline(x, y, lambda_=0.1)

# High-degree spline
high_deg_spline = PSpline(x, y, degree=5, nseg=30)

Performance Tips

  1. Sparse Operations: The library automatically uses sparse matrices for efficiency
  2. Vectorized Predictions: Predict on multiple points simultaneously
  3. Optimal nseg: Generally 10-50 segments work well; too many can cause overfitting
  4. Bootstrap Parallelization: Use n_jobs=-1 for parallel bootstrap computation
# Efficient batch prediction
y_pred = spline.predict(large_x_array)

# Parallel bootstrap
y_pred, se = spline.predict(x_new, return_se=True, se_method="bootstrap", 
                           n_boot=5000, n_jobs=-1)

Mathematical Background

PSplines combine B-spline basis functions with discrete difference penalties:

  • Basis: B-splines of degree d with nseg equally-spaced segments
  • Penalty: Discrete differences of order p (typically 1, 2, or 3)
  • Objective: Minimize ||y - Bα||² + λ||D_p α||² where B is the basis matrix and D_p is the difference matrix

The library implements efficient algorithms for:

  • Sparse matrix operations via SciPy
  • Analytical standard errors via the delta method
  • Effective degrees of freedom computation
  • Cross-validation and information criteria

Requirements

  • Python >= 3.10
  • NumPy >= 1.21
  • SciPy >= 1.7
  • PyMC >= 5.0 (for Bayesian methods)
  • Matplotlib >= 3.4 (for plotting utilities)
  • PyTensor >= 2.0 (for Bayesian methods)
  • ArviZ >= 0.12 (for Bayesian diagnostics)
  • Joblib >= 1.0 (for parallel processing)

Examples

Complete examples are available in the examples/ directory:

  • Basic Usage: Core functionality demonstration
  • Parameter Selection: Automatic lambda optimization
  • Uncertainty Methods: Comparison of different uncertainty approaches
  • Real-World Application: Time series analysis workflow

Contributing

Contributions are welcome! Please see our Contributing Guide for details.

Development Setup

git clone https://github.com/graysonbellamy/psplines.git
cd psplines
uv sync --dev

Running Tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=psplines --cov-report=html

# Run specific test module  
uv run pytest tests/test_core.py -v

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use PSplines in your research, please cite:

@software{bellamy2024psplines,
  author = {Bellamy, Grayson},
  title = {PSplines: Penalized B-Spline Smoothing for Python},
  year = {2024},
  url = {https://github.com/graysonbellamy/psplines}
}

References

  • Eilers, P. H. C., & Marx, B. D. (2021). Practical Smoothing: The Joys of P-splines. Cambridge University Press.
  • de Boor, C. (2001). A Practical Guide to Splines. Springer-Verlag.

Changelog

Version 0.1.3

  • Fixed dead code bug in derivative method
  • Added comprehensive input validation
  • Optimized diagonal computation for uncertainty
  • Enhanced error messages and documentation
  • Added extensive test suite

Version 0.1.2

  • Initial release with core P-spline functionality
  • Basic fitting, prediction, and derivative computation
  • Bayesian inference capabilities

Questions or Issues? Please open an issue on GitHub.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

psplines-0.2.1.tar.gz (329.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

psplines-0.2.1-py3-none-any.whl (37.3 kB view details)

Uploaded Python 3

File details

Details for the file psplines-0.2.1.tar.gz.

File metadata

  • Download URL: psplines-0.2.1.tar.gz
  • Upload date:
  • Size: 329.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for psplines-0.2.1.tar.gz
Algorithm Hash digest
SHA256 05d72bc0389a94f9289f12b4fa0e5852e07f170f446c4246cdf5b3a6a1a548b5
MD5 3a1714256743ddf8781c361d8e67480e
BLAKE2b-256 21658132a73ec23230abd19476278c142eb9ab2a39d0c8106cee178ad0ca31da

See more details on using hashes here.

File details

Details for the file psplines-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: psplines-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 37.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for psplines-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 28885823ea7cd38d4025267d3f8ebff060b738ebff360af346370bff8eaeee61
MD5 d05a8d9c2f71a54c6edf71734173ff78
BLAKE2b-256 a7ec7c45f22c39da16125c6a5199daa4a3f87e303c628ed989d0a896b1a832db

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page