Tools for fast and robust univariate and multivariate kernel density estimation


fastKDE

Software Overview

fastKDE calculates a kernel density estimate of data of arbitrary dimension, rapidly and robustly, using recently developed KDE techniques. Its statistical skill matches that of state-of-the-science 'R' KDE packages, and it is about 10,000 times faster for bivariate data (with even larger speedups at higher dimensionality).

Please cite the following papers when using this method:

  • O’Brien, T. A., Kashinath, K., Cavanaugh, N. R., Collins, W. D. & O’Brien, J. P. A fast and objective multidimensional kernel density estimation method: fastKDE. Comput. Stat. Data Anal. 101, 148–160 (2016). http://dx.doi.org/10.1016/j.csda.2016.02.014
  • O’Brien, T. A., Collins, W. D., Rauscher, S. A. & Ringler, T. D. Reducing the computational cost of the ECF using a nuFFT: A fast and objective probability density estimation method. Comput. Stat. Data Anal. 79, 222–234 (2014). http://dx.doi.org/10.1016/j.csda.2014.06.002

Example usage:

For a standard PDF

""" Demonstrate the first README example. """
import numpy as np
import fastkde
import matplotlib.pyplot as plt

#Generate a dataset of two random variables (100,000 pairs of datapoints)
N = int(1e5)
x = 50*np.random.normal(size=N) + 0.1
y = 0.01*np.random.normal(size=N) - 300

#Do the self-consistent density estimate
PDF = fastkde.pdf(x, y, var_names = ['x', 'y'])

PDF.plot();

For a conditional PDF

The following code generates samples from a non-trivial joint distribution

#***************************
# Generate random samples
#***************************
# Stochastically sample from the function underlyingFunction() (a sigmoid):
# sample the abscissa values from a gamma distribution
# relate the ordinate values to the sampled abscissa values and add
# noise from a normal distribution

#Set the number of samples
numSamples = int(1e6)

#Define a sigmoid function
def underlyingFunction(x,x0=305,y0=200,yrange=4):
        return (yrange/2)*np.tanh(x-x0) + y0

xp1,xp2,xmid = 5,2,305  #Set gamma distribution parameters
yp1,yp2 = 0,12          #Set normal distribution parameters (mean and std)

#Generate random samples of X from the gamma distribution
x = -(np.random.gamma(xp1,xp2,int(numSamples))-xp1*xp2) + xmid
#Generate random samples of y from x and add normally distributed noise
y = underlyingFunction(x) + np.random.normal(loc=yp1,scale=yp2,size=numSamples)
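As a quick sanity check on the construction above, the sketch below reproduces only the x sampling step and confirms that the shifted, mirrored gamma samples are centred on xmid. It assumes a seeded NumPy generator and a smaller sample count purely for reproducibility and speed; neither is part of the original example.

```python
import numpy as np

# Seeded generator: an assumption made here for reproducibility only.
rng = np.random.default_rng(0)
num_samples = 100_000

xp1, xp2, xmid = 5, 2, 305  # gamma shape, gamma scale, centre of x

# A gamma(shape, scale) variable has mean shape*scale and std sqrt(shape)*scale.
# Subtracting the mean, flipping the sign and adding xmid gives a left-skewed
# variable with mean xmid that is bounded above by xmid + shape*scale.
x = -(rng.gamma(xp1, xp2, num_samples) - xp1 * xp2) + xmid

x_mean = x.mean()
x_std = x.std()
expected_std = np.sqrt(xp1) * xp2
upper_bound = xmid + xp1 * xp2

print(f"mean of x: {x_mean:.2f} (expected about {xmid})")
print(f"std of x:  {x_std:.2f} (expected about {expected_std:.2f})")
print(f"max of x:  {x.max():.2f} (cannot exceed {upper_bound})")
```

Because the samples of x sit around 305, they straddle the midpoint x0=305 of the sigmoid, so both plateaus of underlyingFunction() are represented in the data.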

Now that we have the x,y samples, the following code calculates the conditional PDF of y given x:

#***************************
# Calculate the conditional
#***************************
cPDF = fastkde.conditional(y, x, var_names = ['y', 'x'])
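For readers unfamiliar with the quantity being estimated, a conditional PDF is the joint PDF divided by the marginal of the conditioning variable, P(y|x) = P(x,y) / P(x). The sketch below illustrates that definition with a plain NumPy histogram on synthetic data. It is not how fastKDE computes the conditional; the bin counts, the seed and the toy linear relationship are all assumptions chosen for illustration.

```python
import numpy as np

# Toy data (an assumption for illustration): y depends linearly on x plus noise.
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=0.5, size=n)

# Histogram estimate of the joint density; rows index x bins, columns index y bins.
joint, x_edges, y_edges = np.histogram2d(x, y, bins=[40, 60], density=True)
dy = np.diff(y_edges)

# Marginal p(x): integrate the joint density over y.
marginal_x = (joint * dy[np.newaxis, :]).sum(axis=1)

# Conditional p(y|x) = p(x, y) / p(x), defined only where p(x) > 0.
populated = marginal_x > 0
conditional = np.zeros_like(joint)
conditional[populated] = joint[populated] / marginal_x[populated][:, np.newaxis]

# Each populated x slice of the conditional integrates to 1 over y.
slice_integrals = (conditional * dy[np.newaxis, :]).sum(axis=1)
print(np.allclose(slice_integrals[populated], 1.0))
```

The normalisation per x slice is what distinguishes a conditional from a joint PDF: sparsely sampled x values are displayed on the same footing as densely sampled ones.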

The following code plots the results:

#***************************
# Plot the conditional
#***************************
fig,axs = plt.subplots(1,2,figsize=(10,5), sharex=True, sharey=True)

#Plot a scatter plot of the incoming data
axs[0].plot(x,y,'k.',alpha=0.1)
axs[0].set_title('Original (x,y) data')
axs[0].set_xlabel('x')
axs[0].set_ylabel('y')

#Draw a contour plot of the conditional
cPDF.plot(ax = axs[1], add_colorbar = False)
#Overplot the original underlying relationship
axs[1].plot(cPDF.x,underlyingFunction(cPDF.x),linewidth=3,linestyle='--',alpha=0.5)
axs[1].set_title('P(y|x)')

plt.savefig('conditional_demo.png')
plt.show()

Image of conditional distribution demonstration

Kernel Density Estimate for Specific Points

To evaluate the KDE at specified points (not necessarily those used to generate the KDE):

""" Demonstrate using the pdf_at_points function. """
import numpy as np
import fastkde
train_x = 50*np.random.normal(size=100) + 0.1
train_y = 0.01*np.random.normal(size=100) - 300

test_x = 50*np.random.normal(size=100) + 0.1
test_y = 0.01*np.random.normal(size=100) - 300

test_points = list(zip(test_x, test_y))
test_point_pdf_values = fastkde.pdf_at_points(train_x, train_y, list_of_points = test_points)

Note that this method can be significantly slower than fastkde.pdf(). In the final stage, where the PDF estimate is transformed from spectral space into data space, fastkde.pdf() uses a fast Fourier transform, whereas fastkde.pdf_at_points() cannot.
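The cost difference can be illustrated with a generic one-dimensional Fourier series. This is a sketch of the general principle, not of fastKDE's internals: the coefficients are random and the helper evaluate_direct is hypothetical. On a regular grid, an inverse FFT evaluates all N values in O(N log N) operations; at arbitrary points, each value needs its own sum over all N modes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_modes = 64

# Random complex Fourier coefficients c_k (illustrative only).
coeffs = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)
modes = np.arange(n_modes)

# Regular grid t_j = j / N: f(t_j) = sum_k c_k exp(2*pi*i*k*j/N).
# np.fft.ifft includes a 1/N factor, so multiply it back out.
grid = np.arange(n_modes) / n_modes
values_fft = np.fft.ifft(coeffs) * n_modes


def evaluate_direct(points):
    """Evaluate f(t) = sum_k c_k exp(2*pi*i*k*t) by explicit summation.

    Costs O(len(points) * n_modes); works for any points, gridded or not.
    """
    phases = np.exp(2j * np.pi * np.outer(points, modes))
    return phases @ coeffs


# On the grid, the direct sum agrees with the FFT result.
values_direct_grid = evaluate_direct(grid)
print(np.allclose(values_fft, values_direct_grid))

# Off the grid, only the direct sum is available.
arbitrary_points = rng.uniform(size=5)
values_arbitrary = evaluate_direct(arbitrary_points)
```

When many evaluation points are needed and a regular grid is acceptable, the gridded estimate is therefore the cheaper route.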

How do I get set up?

python -m pip install fastkde

Copyright Information

See LICENSE.txt
