Tools for finding gaps and valleys in a data distribution, using a twice-differentiable density estimator with finite support.

FindTheGap

This package (also informally known as "Gappy") provides tools for geometric data analysis, targeted at finding gaps and valleys in data distributions. It provides a twice-differentiable density estimator (the Quartic Kernel Density Estimator) that relies on PyTorch for auto-differentiation, a routine to approximate critical points of the density, and various statistics to identify and trace "gaps" and valleys in the distribution.
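To make the idea concrete, here is a minimal, dependency-free sketch of a quartic (biweight) kernel density estimate in one dimension, together with its first and second derivatives. It is not the package's API: the function names (quartic_kernel, kde, and so on), the 1D restriction, and the toy data are all assumptions made for illustration, and the derivatives are written by hand here, whereas the package relies on PyTorch auto-differentiation.

```python
import math


def quartic_kernel(u):
    """Quartic (biweight) kernel, normalized in 1D; zero for |u| >= 1."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def quartic_kernel_d1(u):
    """First derivative of the kernel with respect to u."""
    return -15.0 / 4.0 * u * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def quartic_kernel_d2(u):
    """Second derivative of the kernel inside the support (|u| < 1)."""
    return -15.0 / 4.0 * (1.0 - 3.0 * u * u) if abs(u) < 1.0 else 0.0


def kde(x, data, h, kernel=quartic_kernel, order=0):
    """Density estimate at x (order=0) or its derivative of the given order.

    `kernel` must be the matching derivative of the quartic kernel;
    each derivative in x contributes one extra factor of 1/h.
    """
    scale = len(data) * h ** (order + 1)
    return sum(kernel((x - xi) / h) for xi in data) / scale


# Hypothetical toy data: two clusters with an empty region around x = 0.
data = [-1.0, -0.9, -1.1, 1.0, 0.9, 1.1]
h = 1.5  # bandwidth, chosen by hand for this example

density = kde(0.0, data, h)
slope = kde(0.0, data, h, quartic_kernel_d1, order=1)
curvature = kde(0.0, data, h, quartic_kernel_d2, order=2)

# A valley candidate is a critical point (slope near zero)
# where the density curves upward (positive second derivative).
is_valley = abs(slope) < 1e-9 and curvature > 0
```

Because the kernel has finite support, the estimate is exactly zero for any point farther than h from every sample, which is what lets truly empty regions register as zero density rather than as a small positive tail.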

This package can be installed through pip (https://pypi.org/project/findthegap/):

pip install findthegap 

See https://github.com/contardog/findthegap for a demo and use-case notebook in the 'examples' folder.

Notebook requirements: sklearn, matplotlib

The 'examples' folder contains a notebook showcasing how to use these tools on 2D data (available in the 'data' folder).

Disclaimer: this code is a work in progress and may change, especially for data with more than 2 dimensions.

Contributors: Gabriella Contardo (CCA at Simons Foundation), David W. Hogg (CCA/NYU/MPIA), Jason S.A. Hunt (CCA)

You can find more information about the methods in the paper "The emptiness inside: Finding gaps, valleys, and lacunae with geometric data analysis": https://arxiv.org/abs/2201.10674


Dependencies:

  • numpy >= 1.19.5

  • torch >= 1.10.1

  • scipy >= 1.5.4

Update version 0.0.5: fix path computation with gradient descent.

Update version 0.0.6: changed score_samples in quarticKDE for speed and memory.

Update version 0.0.7: fix bug in the changed score_samples.

Update version 0.0.8: Make the description reappear?!
