
PyNumDiff

Python methods for numerical differentiation of noisy data, including multi-objective optimization routines for automated parameter selection.


Introduction

PyNumDiff is a Python package that implements various methods for computing numerical derivatives of noisy data, which can be a critical step in developing dynamic models or designing controllers. Seven families of methods are implemented in this repository:

  1. prefiltering followed by finite difference calculation
  2. iterated finite differencing
  3. polynomial fit methods
  4. basis function fit methods
  5. total variation regularization of a finite difference derivative
  6. generalized Kalman smoothing
  7. local approximation with linear model

For a full list, explore modules in the Sphinx documentation.
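To illustrate the idea behind the first family (prefiltering followed by finite differencing), here is a minimal numpy-only sketch — not PyNumDiff's implementation — that smooths with a Gaussian kernel and then takes central differences:

```python
import numpy as np

def gaussian_smooth_finite_difference(x, dt, sigma=0.1):
    """Smooth x with a Gaussian kernel (std sigma, in seconds),
    then apply central finite differences."""
    n = int(4 * sigma / dt)  # kernel half-width in samples
    k = np.exp(-0.5 * (np.arange(-n, n + 1) * dt / sigma) ** 2)
    k /= k.sum()                            # normalize to unit gain
    x_hat = np.convolve(x, k, mode='same')  # smoothed signal estimate
    dxdt_hat = np.gradient(x_hat, dt)       # central differences
    return x_hat, dxdt_hat

# noisy sine: the true derivative is cos(t)
rng = np.random.default_rng(0)
t = np.arange(0, 10, 0.01)
x = np.sin(t) + 0.05 * rng.standard_normal(t.size)
x_hat, dxdt_hat = gaussian_smooth_finite_difference(x, 0.01, sigma=0.2)
```

PyNumDiff's own versions of this family offer more options (e.g. Butterworth and median filters); this sketch only conveys the smooth-then-difference structure.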

Most of these methods have multiple parameters, so we take a principled approach and propose a multi-objective optimization framework for choosing parameters: minimize a loss function that balances the faithfulness and smoothness of the derivative estimate. For more details, refer to this paper.
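Schematically, such a loss combines a faithfulness term (how well the reintegrated derivative reproduces the data) with a smoothness term (the total variation of the derivative). The following is a simplified numpy sketch of that trade-off, not the exact loss from the paper:

```python
import numpy as np

def faithfulness_smoothness_loss(x, dxdt_hat, dt, tvgamma):
    """Schematic loss: reintegrate the derivative estimate and compare it
    to the data (faithfulness), plus a total-variation penalty (smoothness)."""
    x_rec = np.cumsum(dxdt_hat) * dt            # reintegrate the estimate
    x_rec += np.mean(x - x_rec)                 # free integration constant
    rmse = np.sqrt(np.mean((x_rec - x) ** 2))   # faithfulness term
    tv = np.sum(np.abs(np.diff(dxdt_hat)))      # smoothness term
    return rmse + tvgamma * tv

# an exact, perfectly smooth derivative gives (near-)zero loss
t = np.arange(0, 1, 0.01)
loss = faithfulness_smoothness_loss(t, np.ones_like(t), 0.01, tvgamma=1e-2)
```

Larger tvgamma weights smoothness more heavily relative to faithfulness.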

Installing

Dependencies are listed in pyproject.toml. They include the usual suspects like numpy and scipy, but also optionally cvxpy.

The code is compatible with Python >= 3.10. Install from PyPI with pip install pynumdiff, from source with pip install git+https://github.com/florisvb/PyNumDiff, or from a local download with pip install . in the repository root. Call pip install pynumdiff[advanced] to automatically install optional dependencies from the advanced list, like CVXPY.

Usage

For more details, read our Sphinx documentation. The basic pattern of all differentiation methods is:

somethingdiff(x, dt, **kwargs)

where x is data, dt is a step size, and various keyword arguments control the behavior. Some methods support variable step size, in which case the second parameter is renamed dt_or_t and can receive either a constant step size or an array of values to denote sample locations. Some methods support multidimensional data, in which case there is an axis argument to control the dimension differentiated along.
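As a stand-in illustrating these conventions (toydiff is hypothetical, not a PyNumDiff method), numpy's gradient already accepts either a scalar step or an array of sample times:

```python
import numpy as np

def toydiff(x, dt_or_t, axis=-1):
    """Toy method following the calling convention: accepts a scalar step
    size or an array of sample times, returns (x_hat, dxdt_hat)."""
    x = np.asarray(x, dtype=float)
    x_hat = x.copy()                               # this toy does no smoothing
    dxdt_hat = np.gradient(x, dt_or_t, axis=axis)  # scalar dt or array t
    return x_hat, dxdt_hat

t = np.array([0.0, 0.1, 0.3, 0.6, 1.0])  # nonuniform sample times
x_hat, dxdt_hat = toydiff(3.0 * t + 1.0, t)  # derivative of 3t + 1 is 3
```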

You can set the hyperparameters:

from pynumdiff.submodule import method

x_hat, dxdt_hat = method(x, dt, param1=val1, param2=val2, ...)     

Or you can find hyperparameter settings by calling the multi-objective optimization algorithm from the optimize module:

import numpy as np
from pynumdiff.optimize import optimize

# estimate cutoff_frequency by (a) counting the number of true peaks per second
# in the data or (b) looking at the power spectrum and choosing a cutoff
tvgamma = np.exp(-1.6*np.log(cutoff_frequency) - 0.71*np.log(dt) - 5.1) # see https://ieeexplore.ieee.org/abstract/document/9241009

params, val = optimize(somethingdiff, x, dt,
            tvgamma=tvgamma, # smoothness hyperparameter; unused if dxdt_truth is given
            dxdt_truth=None, # ground-truth derivative, if available
            search_space_updates={'param1': [vals], 'param2': [vals], ...})

print('Optimal parameters: ', params)
x_hat, dxdt_hat = somethingdiff(x, dt, **params)

If no search_space_updates is given, a default search space is used. See the top of optimize.py.

The following heuristic works well for choosing tvgamma, where cutoff_frequency is the highest frequency content of the signal in your data and dt is the timestep: tvgamma = np.exp(-1.6*np.log(cutoff_frequency) - 0.71*np.log(dt) - 5.1). Larger values of tvgamma produce smoother derivatives. The value of tvgamma is largely universal across methods, making it easy to compare results between methods. Be aware that the optimization is fairly computationally expensive.
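For example, with a 1 Hz cutoff and a 0.01 s timestep, the heuristic gives:

```python
import numpy as np

cutoff_frequency = 1.0  # Hz: highest real frequency content of the signal
dt = 0.01               # s: timestep
tvgamma = np.exp(-1.6*np.log(cutoff_frequency) - 0.71*np.log(dt) - 5.1)  # ~0.16
```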

Notebook examples

Much more extensive usage is demonstrated in Jupyter notebooks:

See the README in the notebooks/ folder for a full guide to all demos and experiments.

Repo Structure

  • .github/workflows contains the .yaml files that configure our GitHub Actions continuous integration (CI) runs.
  • docs/ contains the Makefiles and .rst files that govern how Sphinx builds the documentation, either locally by navigating to this folder and calling make html, or in the cloud via readthedocs.io.
  • notebooks/ contains Jupyter notebooks that demonstrate some usage of the library.
  • pynumdiff/ contains the source code. For a full list of modules and further navigation help, see the readme in this subfolder.
  • .coveragerc governs coverage runs, listing files and functions/lines that should be excluded, e.g. plotting code.
  • .editorconfig ensures tabs are displayed as 4 characters wide.
  • .gitignore ensures files generated by local pip installs, Jupyter notebook runs, caches from code runs, virtual environments, and more are not picked up by git and accidentally added to the repo.
  • .pylintrc configures pylint, a tool for autochecking code quality.
  • .readthedocs.yaml configures readthedocs and is necessary for documentation to get auto-rebuilt.
  • CITATION.cff is citation information for the Journal of Open Source Software (JOSS) paper associated with this project.
  • LICENSE.txt allows free usage of this project.
  • README.md is the text you're reading, hello.
  • pyproject.toml governs how this package is set up and installed, including dependencies.

Citation

See CITATION.cff file as well as the following references.

PyNumDiff python package:

@article{PyNumDiff2022,
  doi = {10.21105/joss.04078},
  url = {https://doi.org/10.21105/joss.04078},
  year = {2022},
  publisher = {The Open Journal},
  volume = {7},
  number = {71},
  pages = {4078},
  author = {Floris van Breugel and Yuying Liu and Bingni W. Brunton and J. Nathan Kutz},
  title = {PyNumDiff: A Python package for numerical differentiation of noisy time-series data},
  journal = {Journal of Open Source Software}
}

Optimization algorithm:

@article{ParamOptimizationDerivatives2020,
  doi = {10.1109/ACCESS.2020.3034077},
  author = {F. {van Breugel} and J. {Nathan Kutz} and B. W. {Brunton}},
  journal = {IEEE Access},
  title = {Numerical differentiation of noisy data: A unifying multi-objective optimization framework},
  year = {2020}
}

Running the tests

We use GitHub Actions for continuous integration testing.

Run tests locally by navigating to the repo in a terminal and calling

> pytest -s

Add the flag --plot to see plots of the methods against test functions. Add the flag --bounds to print $\log$ error bounds (useful when changing method behavior).

License

This project is released under the MIT License. It is 100% open source; feel free to use the code however you like.
