Distance-based Analysis of DAta-manifolds in Python

DADApy is a Python package for the characterisation of manifolds in high-dimensional spaces.

Homepage

For more details and tutorials, visit the homepage at: https://dadapy.readthedocs.io/

Quick Example

import numpy as np
from dadapy.data import Data

# generate a simple 3D Gaussian dataset
X = np.random.normal(0, 1, (1000, 3))

# initialise the "Data" class with the set of coordinates
data = Data(X)

# compute distances up to the 100th nearest neighbour
data.compute_distances(maxk=100)

# compute the intrinsic dimension using the Two-NN estimator
data.compute_id_2NN()

# compute the density using PAk, a point adaptive kNN estimator
data.compute_density_PAk()

# find the peaks of the density profile through the ADP algorithm
data.compute_clustering_ADP()
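
To give a feel for what the intrinsic dimension step computes, here is a minimal standard-library sketch of the Two-NN idea in its maximum-likelihood form, d = N / sum(log(r2 / r1)), where r1 and r2 are the distances from each point to its first and second nearest neighbours. The function name `two_nn_id` is hypothetical and this is not DADApy's implementation, which may differ in its details; it is only meant to illustrate the principle on the same kind of 3D Gaussian data.

```python
import math
import random


def two_nn_id(points):
    """Maximum-likelihood Two-NN estimate: d = N / sum(log(r2 / r1)).

    Illustrative sketch only (hypothetical helper, not part of DADApy).
    """
    log_ratios = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:  # skip exact duplicates, for which the ratio is undefined
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)


rng = random.Random(0)
# 500 points drawn from a 3D standard Gaussian, as in the example above
points = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
estimate = two_nn_id(points)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

The brute-force neighbour search costs O(N^2) distance evaluations, so this sketch is only suitable for small datasets.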

Currently implemented algorithms

  • Intrinsic dimension estimators

      • Two-NN estimator

        Facco et al., Scientific Reports (2017)

      • Gride estimator

        Denti et al., Scientific Reports (2022)

  • Density estimators

      • kNN estimator

      • k*NN estimator (kNN with adaptive choice of k)

      • PAk estimator

        Rodriguez et al., JCTC (2018)

  • Density peaks clustering methods

      • Density peaks clustering

        Rodriguez and Laio, Science (2014)

      • Advanced density peaks clustering

        d’Errico et al., Information Sciences (2021)

      • k-peak clustering

        Sormani, Rodriguez and Laio, JCTC (2020)

  • Manifold comparison tools

      • Neighbourhood overlap

        Doimo et al., NeurIPS (2020)

      • Information imbalance

        Glielmo et al., PNAS Nexus (2022)
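
As an illustration of the last item in the list, the information imbalance from a space A to a space B can be sketched as (2 / N) times the average rank, measured in B, of each point's nearest neighbour in A. The snippet below is a brute-force, standard-library sketch under that definition; the function name `information_imbalance` and the toy data are assumptions made for illustration and do not reflect DADApy's own API.

```python
import math
import random


def information_imbalance(space_a, space_b):
    """Delta(A -> B): (2 / N) * mean rank in B of each point's nearest neighbour in A.

    Illustrative sketch only (hypothetical helper, not part of DADApy).
    """
    n = len(space_a)
    rank_sum = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Nearest neighbour of point i according to the distance in space A.
        nn = min(others, key=lambda j: math.dist(space_a[i], space_a[j]))
        # Rank of that same neighbour according to the distance in space B (1 = closest).
        d_nn = math.dist(space_b[i], space_b[nn])
        rank_sum += 1 + sum(math.dist(space_b[i], space_b[j]) < d_nn for j in others)
    return 2.0 * rank_sum / n**2


rng = random.Random(0)
n_points = 200
xy = [[rng.random(), rng.random()] for _ in range(n_points)]  # full 2D coordinates
x_only = [[p[0]] for p in xy]  # first coordinate only

delta_full_to_x = information_imbalance(xy, x_only)
delta_x_to_full = information_imbalance(x_only, xy)
print(f"Delta(xy -> x) = {delta_full_to_x:.3f}")
print(f"Delta(x -> xy) = {delta_x_to_full:.3f}")
```

A value close to 0 means that neighbourhoods in A predict neighbourhoods in B well, while a value close to 1 means they carry essentially no information about B. In this toy case the full coordinates should predict the first coordinate better than the reverse.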

Installation

The package is compatible with Python >= 3.7 (tested on 3.7, 3.8 and 3.9). We currently support only Unix-based systems, including Linux and macOS. On Windows machines we suggest using the Windows Subsystem for Linux (WSL).

The package requires numpy, scipy and scikit-learn; matplotlib is additionally needed for the visualisations.

The package contains Cython-generated C extensions that are automatically compiled during install.

The latest release is available through pip:

pip install dadapy

To install the latest development version, clone the source code from GitHub and install it with pip as follows:

git clone https://github.com/sissa-data-science/DADApy.git
cd DADApy
pip install .
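
After installing, one way to check that the package and the dependencies named above are visible to the current interpreter is to query their installed versions. The sketch below assumes Python 3.8 or newer (for `importlib.metadata`) and uses the distribution names as they appear on PyPI; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as listed in the requirements above, plus dadapy itself.
for name in ("numpy", "scipy", "scikit-learn", "matplotlib", "dadapy"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```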

Citing DADApy

A description of the package is available in the Patterns article at https://doi.org/10.1016/j.patter.2022.100589.

Please consider citing it if you found this package useful for your research:

@article{dadapy,
    title = {DADApy: Distance-based analysis of data-manifolds in Python},
    journal = {Patterns},
    pages = {100589},
    year = {2022},
    issn = {2666-3899},
    doi = {https://doi.org/10.1016/j.patter.2022.100589},
    url = {https://www.sciencedirect.com/science/article/pii/S2666389922002070},
    author = {Aldo Glielmo and Iuri Macocco and Diego Doimo and Matteo Carli and Claudio Zeni and Romina Wild and Maria d’Errico and Alex Rodriguez and Alessandro Laio},
    }
