
Tools for fast and robust univariate and multivariate kernel density estimation

Project description

Software Overview

fastKDE calculates a kernel density estimate of arbitrarily dimensioned data, rapidly and robustly, using recently developed KDE techniques. Its statistical skill is as good as that of state-of-the-science ‘R’ KDE packages, and it is 10,000 times faster for bivariate data (with even larger improvements for higher dimensionality).

Please cite the following papers when using this method:

O’Brien, T. A., Kashinath, K., Cavanaugh, N. R., Collins, W. D. & O’Brien, J. P. A fast and objective multidimensional kernel density estimation method: fastKDE. Comput. Stat. Data Anal. 101, 148–160 (2016).

O’Brien, T. A., Collins, W. D., Rauscher, S. A. & Ringler, T. D. Reducing the computational cost of the ECF using a nuFFT: A fast and objective probability density estimation method. Comput. Stat. Data Anal. 79, 222–234 (2014).

Example usage:

For a standard PDF

import numpy as np
from fastkde import fastKDE
import pylab as PP

#Generate two random variables dataset (representing 200000 pairs of datapoints)
#N must be an integer, since it is passed as the size argument below
N = int(2e5)
var1 = 50*np.random.normal(size=N) + 0.1
var2 = 0.01*np.random.normal(size=N) - 300

#Do the self-consistent density estimate
myPDF,axes = fastKDE.pdf(var1,var2)

#Extract the axes from the axis list
v1,v2 = axes

#Plot contours of the PDF; these should be a set of concentric ellipses centered on
#(0.1, -300). Comparatively, the y axis range should be tiny and the x axis range
#should be large
PP.contour(v1,v2,myPDF)
PP.show()
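
A quick sanity check on a gridded estimate like the one above is that it integrates to roughly one over its axes. The sketch below is a minimal illustration using only NumPy. The helper name integrate_gridded_pdf is hypothetical (it is not part of fastKDE), and an analytic bivariate normal stands in for myPDF so the snippet runs without the package. It assumes the array is laid out as (len(v2), len(v1)), which is the layout PP.contour(v1,v2,myPDF) expects.

```python
import numpy as np

# Use np.trapezoid where it exists (NumPy >= 2.0), otherwise the older np.trapz.
_trapezoid = getattr(np, "trapezoid", None) or np.trapz


def integrate_gridded_pdf(pdf, x_axis, y_axis):
    """Integrate a 2-D gridded PDF with shape (len(y_axis), len(x_axis)).

    Hypothetical helper for illustration; not a fastKDE function.
    """
    marginal_y = _trapezoid(pdf, x=x_axis, axis=1)  # integrate out x
    return _trapezoid(marginal_y, x=y_axis)         # then integrate out y


# Stand-in for (myPDF, axes): an analytic bivariate normal with the same
# centers and scales as var1 and var2 in the example above.
x_axis = np.linspace(0.1 - 6 * 50, 0.1 + 6 * 50, 257)
y_axis = np.linspace(-300 - 6 * 0.01, -300 + 6 * 0.01, 257)
xx, yy = np.meshgrid(x_axis, y_axis)
stand_in_pdf = np.exp(
    -0.5 * (((xx - 0.1) / 50) ** 2 + ((yy + 300) / 0.01) ** 2)
) / (2 * np.pi * 50 * 0.01)

total = integrate_gridded_pdf(stand_in_pdf, x_axis, y_axis)
print("integral over the grid: {:.4f}".format(total))
```

With a real estimate, the same call would be integrate_gridded_pdf(myPDF, v1, v2). A value well below one would suggest that the grid does not cover the bulk of the density.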

For a conditional PDF

The following code generates samples from a non-trivial joint distribution

from fastkde import fastKDE
import pylab as PP
import numpy as np

#***************************
# Generate random samples
#***************************
# Stochastically sample from the function underlyingFunction() (a sigmoid):
# sample the abscissa values from a gamma distribution,
# relate the ordinate values to the sampled abscissa values, and add
# noise from a normal distribution

#Set the number of samples
numSamples = int(1e6)

#Define a sigmoid function
def underlyingFunction(x,x0=305,y0=200,yrange=4):
     return (yrange/2)*np.tanh(x-x0) + y0

xp1,xp2,xmid = 5,2,305  #Set gamma distribution parameters
yp1,yp2 = 0,12          #Set normal distribution parameters (mean and std)

#Generate random samples of X from the gamma distribution
x = -(np.random.gamma(xp1,xp2,int(numSamples))-xp1*xp2) + xmid
#Generate random samples of y from x and add normally distributed noise
y = underlyingFunction(x) + np.random.normal(loc=yp1,scale=yp2,size=numSamples)

Now that we have the x,y samples, the following code calculates the conditional

#***************************
# Calculate the conditional
#***************************
pOfYGivenX,axes = fastKDE.conditional(y,x)

The following plot shows the results:

#***************************
# Plot the conditional
#***************************
fig,axs = PP.subplots(1,2,figsize=(10,5))

#Plot a scatter plot of the incoming data
axs[0].plot(x,y,'k.',alpha=0.1)
axs[0].set_title('Original (x,y) data')

#Set axis labels
for i in (0,1):
    axs[i].set_xlabel('x')
    axs[i].set_ylabel('y')

#Draw a contour plot of the conditional
axs[1].contourf(axes[0],axes[1],pOfYGivenX,64)
#Overplot the original underlying relationship
axs[1].plot(axes[0],underlyingFunction(axes[0]),linewidth=3,linestyle='--',alpha=0.5)
axs[1].set_title('P(y|x)')

#Set axis limits to be the same
xlim = [np.amin(axes[0]),np.amax(axes[0])]
ylim = [np.amin(axes[1]),np.amax(axes[1])]
axs[1].set_xlim(xlim)
axs[1].set_ylim(ylim)
axs[0].set_xlim(xlim)
axs[0].set_ylim(ylim)

fig.tight_layout()

PP.savefig('conditional_demo.png')
PP.show()

[Figure: Conditional PDF]
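
For readers who want to see what the conditional represents, the sketch below works through the textbook definition P(y|x) = P(x,y) / P(x) on a grid. It illustrates the definition only and is not a description of how fastKDE.conditional is implemented. The helper name conditional_from_joint, the stand-in joint distribution (a standard normal in x with y scattered around tanh(x)), and the grid sizes are all assumptions chosen so the snippet runs with NumPy alone.

```python
import numpy as np


def conditional_from_joint(joint, y_axis):
    """Return (P(y|x), P(x)) from a joint PDF gridded as (len(y_axis), len(x_axis)).

    Hypothetical helper for illustration; not a fastKDE function.
    """
    dy = np.gradient(y_axis)                          # grid spacing along y
    marginal_x = (joint * dy[:, None]).sum(axis=0)    # P(x): integrate out y
    conditional = np.full_like(joint, np.nan)         # NaN where P(x) is zero
    ok = marginal_x > 0
    conditional[:, ok] = joint[:, ok] / marginal_x[ok]
    return conditional, marginal_x


# Stand-in joint PDF: x ~ Normal(0, 1) and y | x ~ Normal(tanh(x), 0.5)
x_axis = np.linspace(-3.0, 3.0, 121)
y_axis = np.linspace(-4.0, 4.0, 201)
xx, yy = np.meshgrid(x_axis, y_axis)
p_x = np.exp(-0.5 * xx ** 2) / np.sqrt(2 * np.pi)
p_y_given_x = np.exp(-0.5 * ((yy - np.tanh(xx)) / 0.5) ** 2) / (0.5 * np.sqrt(2 * np.pi))
joint = p_x * p_y_given_x

conditional, marginal_x = conditional_from_joint(joint, y_axis)

# The conditional mean E[y|x] should trace the underlying relationship tanh(x)
dy = np.gradient(y_axis)
conditional_mean = (conditional * (y_axis * dy)[:, None]).sum(axis=0)
print("max |E[y|x] - tanh(x)|:", float(np.abs(conditional_mean - np.tanh(x_axis)).max()))
```

Each column of the result is a PDF in y for one value of x, which is why a ridge of high density follows the underlying relationship even where samples in x are sparse.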

Kernel Density Estimate for Specific Points

To see the KDE values at specified points (not necessarily those that were used to generate the KDE):

import numpy as np
from fastkde import fastKDE

train_x = 50*np.random.normal(size=100) + 0.1
train_y = 0.01*np.random.normal(size=100) - 300

test_x = 50*np.random.normal(size=100) + 0.1
test_y = 0.01*np.random.normal(size=100) - 300

test_points = list(zip(test_x, test_y))
test_point_pdf_values = fastKDE.pdf_at_points(train_x, train_y, list_of_points = test_points)

Note that this method can be significantly slower than calls to fastKDE.pdf(): in the final stage, the PDF estimate is transformed from spectral space into data space, and fastKDE.pdf() can use a fast Fourier transform for that step, whereas evaluation at arbitrary points cannot.
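
The cost difference can be illustrated with a generic one-dimensional example. This is a sketch of the general idea, not fastKDE's code: the grid size, the Gaussian test density, and the function name evaluate_at_points are assumptions made for illustration. On a regular grid, the inverse transform is a single FFT. At arbitrary points, every point needs its own sum over all Fourier coefficients, so the work grows with (number of points) x (number of coefficients).

```python
import numpy as np

# A periodic test density sampled on a regular grid (illustrative values)
n, length = 64, 10.0
grid = np.arange(n) * length / n
density = np.exp(-0.5 * ((grid - 5.0) / 0.8) ** 2)
density /= density.sum() * (length / n)      # normalize to unit area

# Spectral representation of the gridded density
coeffs = np.fft.fft(density)
freqs = np.fft.fftfreq(n, d=length / n)      # cycles per unit of x


def evaluate_at_points(points):
    """Direct inverse Fourier sum at arbitrary points: O(len(points) * n)."""
    phases = np.exp(2j * np.pi * np.outer(points, freqs))
    return (phases @ coeffs).real / n


# On the grid, one inverse FFT recovers every value at once: O(n log n)
on_grid_fft = np.fft.ifft(coeffs).real

# The direct sum agrees on the grid and also works between grid points
on_grid_direct = evaluate_at_points(grid)
off_grid = evaluate_at_points(np.array([4.93, 5.0, 6.21]))

print("direct sum matches FFT on the grid:", np.allclose(on_grid_direct, on_grid_fft))
print("values at arbitrary points:", off_grid)
```

If the points of interest happen to lie on (or close to) a regular grid, evaluating the gridded PDF once and looking up or interpolating values may be cheaper than a point-by-point evaluation.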

How do I get set up?

A standard python build: python setup.py install

or

pip install fastkde

Download the source

Please contact Travis A. O’Brien TAOBrien@lbl.gov to obtain the latest version of the source.

Install pre-requisites

This code requires the following software:

  • Python >= 2.7.3

  • Numpy >= 1.7

  • scipy

  • cython
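
The following sketch checks an environment against the list above before building. The function name check_prerequisites is hypothetical, and the script itself needs Python 3.4 or later because it relies on importlib.util.find_spec. It only tests whether each package can be found and whether NumPy meets the stated minimum; since the list gives no minimum versions for scipy or cython, none are checked.

```python
import importlib.util
import sys


def check_prerequisites():
    """Report which of the listed prerequisites are available (illustrative)."""
    report = {"python": sys.version_info[:3] >= (2, 7, 3)}

    # Cython's importable module name is capitalized
    for module_name in ("numpy", "scipy", "Cython"):
        report[module_name] = importlib.util.find_spec(module_name) is not None

    if report["numpy"]:
        import numpy
        # Compare only the major and minor components, e.g. "1.26.4" -> (1, 26)
        major, minor = (int(part) for part in numpy.__version__.split(".")[:2])
        report["numpy>=1.7"] = (major, minor) >= (1, 7)

    return report


if __name__ == "__main__":
    for name, ok in sorted(check_prerequisites().items()):
        print("{:<12} {}".format(name, "ok" if ok else "MISSING"))
```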

