Distance-based Analysis of DAta-manifolds in python

DADApy is a Python package for the characterization of manifolds in high-dimensional spaces.

Homepage

For more details and tutorials, visit the homepage at: https://dadapy.readthedocs.io/

Quick Example

import numpy as np
from dadapy.data import Data

# generate a simple 3D Gaussian dataset
X = np.random.normal(0, 1, (1000, 3))

# initialize the "Data" class with the set of coordinates
data = Data(X)

# compute distances up to the 100th nearest neighbor
data.compute_distances(maxk=100)

# compute the intrinsic dimension using the 2NN estimator
# (named id_2nn to avoid shadowing the Python built-in id)
id_2nn, id_error, id_distance = data.compute_id_2NN()

# compute the intrinsic dimension up to the 64th nearest neighbor using Gride
id_list, id_error_list, id_distance_list = data.return_id_scaling_gride(range_max=64)

# compute the density using PAk, a point adaptive kNN estimator
log_den, log_den_error = data.compute_density_PAk()

# find the peaks of the density profile through the ADP algorithm
cluster_assignment = data.compute_clustering_ADP()

# compute the neighborhood overlap with another dataset
X2 = np.random.normal(0, 1, (1000, 5))
overlap_x2 = data.return_data_overlap(X2)

# compute the neighborhood overlap with a set of labels
labels = np.repeat(np.arange(10), 100)
overlap_labels = data.return_label_overlap(labels)
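
To give an intuition for what the 2NN step in the example estimates, here is a minimal NumPy-only sketch. It is not DADApy's implementation: it uses the maximum-likelihood form of the Two-NN model, in which the ratio mu = r2 / r1 between the second and first nearest-neighbor distances follows a Pareto law with exponent equal to the intrinsic dimension d, so that d is approximately N / sum(log(mu)). The function name `two_nn_mle` is hypothetical, and the brute-force distance computation is only suitable for small datasets.

```python
import numpy as np


def two_nn_mle(X):
    """Maximum-likelihood Two-NN estimate of the intrinsic dimension.

    Illustrative sketch only (hypothetical helper, not part of DADApy).
    Assumes no duplicated points, so that every r1 is strictly positive.
    """
    X = np.asarray(X, dtype=float)
    # Brute-force pairwise Euclidean distances, shape (N, N)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Exclude each point from its own neighbor list
    np.fill_diagonal(dists, np.inf)
    # After partitioning, columns 0 and 1 hold the two smallest distances
    two_smallest = np.partition(dists, 1, axis=1)[:, :2]
    r1, r2 = two_smallest[:, 0], two_smallest[:, 1]
    mu = r2 / r1
    # MLE of the Pareto exponent: d = N / sum(log(mu))
    return len(mu) / np.sum(np.log(mu))


rng = np.random.default_rng(0)
X = rng.normal(0, 1, (1000, 3))
print(f"estimated intrinsic dimension: {two_nn_mle(X):.2f}")
```

For the 3D Gaussian dataset above, the estimate should land close to 3. The library estimator returned by compute_id_2NN may differ in detail (for example in how the estimate and its error are obtained), so the two numbers need not match exactly.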

Currently implemented algorithms

  • Intrinsic dimension estimators

      • Two-NN estimator

        Facco et al., Scientific Reports (2017)

      • Gride estimator

        Denti et al., Scientific Reports (2022)

      • I3D estimator (for both continuous and discrete spaces)

        Macocco et al., Physical Review Letters (2023)

      • BID estimator

        Acevedo et al., Communications Physics (2025)

  • Density estimators

      • kNN estimator

      • k*NN estimator (kNN with an adaptive choice of k)

      • PAk estimator

        Rodriguez et al., JCTC (2018)

      • point-adaptive mean-shift gradient estimator

        Carli et al., arXiv (2024)

      • BMTI estimator

        Carli et al., arXiv (2024)

  • Density peaks clustering methods

      • Density peaks clustering

        Rodriguez and Laio, Science (2014)

      • Advanced density peaks clustering

        d’Errico et al., Information Sciences (2021)

      • k-peak clustering

        Sormani, Rodriguez and Laio, JCTC (2020)

  • Manifold comparison tools

      • Neighbourhood overlap

        Doimo et al., NeurIPS (2020)

      • Information imbalance

        Glielmo et al., PNAS Nexus (2022)

  • Feature selection and weighting tool

      • Differentiable Information Imbalance

        Wild et al., Nature Communications (2025)

  • Causal analysis tools

      • Imbalance Gain

        Del Tatto et al., PNAS (2024)

      • Community causal graph

        Allione et al., arXiv (2025)
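
As an illustration of one of the manifold comparison tools listed above, the following sketch computes a neighbourhood overlap between two representations of the same points: for each point, the fraction of its k nearest neighbours that are shared between the two spaces, averaged over all points. This is a simplified NumPy version written for clarity, not the code behind return_data_overlap; the helper names and the default k=10 are assumptions made for this example.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row of X (self excluded)."""
    X = np.asarray(X, dtype=float)
    # Brute-force pairwise Euclidean distances, shape (N, N)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    return np.argsort(dists, axis=1)[:, :k]


def neighbourhood_overlap(X1, X2, k=10):
    """Average fraction of k nearest neighbours shared by two representations.

    Hypothetical helper for illustration: row i of X1 and row i of X2
    must describe the same data point.
    """
    nn1 = knn_indices(X1, k)
    nn2 = knn_indices(X2, k)
    shared = [len(np.intersect1d(a, b)) for a, b in zip(nn1, nn2)]
    return float(np.mean(shared)) / k


rng = np.random.default_rng(0)
A = rng.normal(size=(300, 3))
B = rng.normal(size=(300, 5))  # independent of A
print("overlap(A, 2*A):", neighbourhood_overlap(A, 2 * A))
print("overlap(A, B):  ", neighbourhood_overlap(A, B))
```

A uniformly rescaled copy of the data preserves every neighbourhood, so the overlap is 1, while two independent datasets share neighbours only by chance, at a rate of roughly k / (N - 1).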

Installation

The package is compatible with Python 3.8 through 3.12. The methods of the DiffImbalance and CausalGraph classes require Python >= 3.9. Only Unix-based systems, including Linux and macOS, are currently supported; on Windows machines, we suggest using the Windows Subsystem for Linux (WSL).
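
The compatibility constraints above can be checked before installing with a few lines of standard-library Python. This is only a convenience sketch: the function name `check_environment` and the keys of the returned dictionary are made up for this example, and the version bounds simply restate the ones given in this section.

```python
import platform
import sys

# Minor versions of Python 3 listed as supported in this section
SUPPORTED_MINORS = {8, 9, 10, 11, 12}


def check_environment(version_info=sys.version_info, system=None):
    """Compare an interpreter version and OS name with the stated requirements."""
    if system is None:
        # "Linux" (also reported inside WSL), "Darwin" for macOS, "Windows"
        system = platform.system()
    major, minor = version_info[0], version_info[1]
    python_ok = major == 3 and minor in SUPPORTED_MINORS
    return {
        "python_supported": python_ok,
        # DiffImbalance and CausalGraph additionally need Python >= 3.9
        "diffimbalance_causalgraph_supported": python_ok and minor >= 9,
        "os_supported": system in ("Linux", "Darwin"),
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

Inside WSL the reported system name is "Linux", so the check passes there as well.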

The package requires numpy, scipy, scikit-learn, jax and jaxlib, plus matplotlib for the visualizations.

The package contains Cython-generated C extensions that are automatically compiled during installation.

The latest release is available through pip:

pip install dadapy

To install the latest development version, install it directly from the GitHub repository with pip:

pip install git+https://github.com/sissa-data-science/DADApy

Alternatively, if you would like to modify the implementation of some functions locally, you can clone the repository and install the package with:

git clone https://github.com/sissa-data-science/DADApy.git
cd DADApy
python setup.py build_ext --inplace
pip install .

The methods of the classes DiffImbalance and CausalGraph can be run on GPU, using a suitable installation of JAX on a GPU platform. The code has been tested using JAX v0.4.30 with CUDA 12, which can be installed with:

pip install --upgrade "jax[cuda12_pip]==0.4.30" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

For more information on installing the JAX library on GPUs, see the official JAX repository.
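
After installing JAX, it can be useful to confirm which backend it actually picked up before running the DiffImbalance or CausalGraph methods. The sketch below relies only on the public JAX calls jax.default_backend() and jax.devices(); the wrapper function `describe_jax_backend` is a hypothetical helper that returns None when JAX is not installed.

```python
def describe_jax_backend():
    """Report the default JAX backend and visible devices, or None without JAX."""
    try:
        import jax
    except ImportError:
        return None
    return {
        # Typically "cpu", "gpu" or "tpu"
        "backend": jax.default_backend(),
        "devices": [str(device) for device in jax.devices()],
    }


if __name__ == "__main__":
    info = describe_jax_backend()
    if info is None:
        print("JAX is not installed")
    else:
        print(f"backend: {info['backend']}")
        print(f"devices: {info['devices']}")
```

If the reported backend is "cpu" on a machine with a GPU, the installed jaxlib build most likely lacks CUDA support.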

Citing DADApy

A description of the package is available in the Patterns article referenced below.

Please consider citing it if you found this package useful for your research:

@article{dadapy,
    title = {DADApy: Distance-based analysis of data-manifolds in Python},
    journal = {Patterns},
    pages = {100589},
    year = {2022},
    issn = {2666-3899},
    doi = {10.1016/j.patter.2022.100589},
    url = {https://www.sciencedirect.com/science/article/pii/S2666389922002070},
    author = {Aldo Glielmo and Iuri Macocco and Diego Doimo and Matteo Carli and Claudio Zeni and Romina Wild and Maria d’Errico and Alex Rodriguez and Alessandro Laio},
}
