Fast Density Clustering in low-dimension

Fast density clustering (fdc)

Python package for clustering low-dimensional data using kernel density maps and a density graph. Examples for Gaussian mixtures and some benchmarks are provided. Our algorithm solves multiscale problems (multiple variances/densities and population sizes) and works for non-convex clusters. It uses cross-validation and is regularized by two main global parameters: a neighborhood size and a noise threshold. The latter detects spurious cluster centers, while the former guarantees that only local information is used to infer cluster centers.
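To make the role of the neighborhood size concrete, here is a minimal sketch of how candidate cluster centers could be picked as local density maxima. It is an illustration under stated assumptions, not the package's implementation: the function `candidate_centers` and the parameters `nh_size` and `bandwidth` are hypothetical names, the density is a simple Gaussian kernel sum truncated to each point's neighborhood, and the noise-threshold step is left out.

```python
import numpy as np
from scipy.spatial import cKDTree


def candidate_centers(X, nh_size=20, bandwidth=0.3):
    """Return indices of local density maxima and the log-density of every point.

    Hypothetical helper for illustration only; not part of the fdc API.
    """
    tree = cKDTree(X)
    # Distances and indices of the nh_size nearest neighbors.
    # The first neighbor of each point is the point itself (distance 0).
    dist, idx = tree.query(X, k=nh_size)
    # Unnormalized Gaussian kernel density, truncated to the neighborhood.
    log_rho = np.log(np.exp(-0.5 * (dist / bandwidth) ** 2).sum(axis=1))
    # A point is a candidate center if none of its neighbors is denser,
    # so only local information enters the decision.
    is_peak = log_rho >= log_rho[idx].max(axis=1)
    return np.flatnonzero(is_peak), log_rho


# Two well-separated Gaussian blobs as toy data.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(10, 1, (200, 2))])
centers_demo, log_rho_demo = candidate_centers(X_demo)
print("number of candidate centers:", centers_demo.size)
```

A small neighborhood typically yields many candidate centers, some of them spurious density fluctuations. That is the situation the noise threshold is meant to address.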

The underlying code is based on fast KD-trees for nearest-neighbor searches. For low-dimensional spaces, the algorithm has O(n log n) time complexity, where n is the size of the dataset. It also has a memory complexity of O(n).
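As a rough illustration of why KD-trees help, the sketch below compares a KD-tree neighbor search with a brute-force search that builds the full n × n distance matrix. It uses `scipy.spatial.cKDTree`, which is an assumption about tooling rather than a statement about what the package uses internally. The helper names `knn_kdtree` and `knn_bruteforce` are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree


def knn_kdtree(X, k):
    """k nearest neighbors of every point via a KD-tree (self excluded)."""
    tree = cKDTree(X)
    # Query k + 1 neighbors because each point is its own nearest neighbor.
    dist, idx = tree.query(X, k=k + 1)
    return dist[:, 1:], idx[:, 1:]


def knn_bruteforce(X, k):
    """Same result from the full distance matrix: O(n^2) time and memory."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]
    return np.take_along_axis(d, idx, axis=1), idx


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))
d_tree, _ = knn_kdtree(X_demo, k=5)
d_brute, _ = knn_bruteforce(X_demo, k=5)
print("distances agree:", np.allclose(d_tree, d_brute))
```

The brute-force version stores every pairwise distance, while the tree-based version only keeps the tree and k neighbors per point.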

Installing

I suggest you install the code using pip from an Anaconda Python 3 environment. From that environment:

pip install fdc

That's it! You can now import the package fdc from your Python scripts. Check out the examples in the example directory and see if you can run the scripts provided.

Examples and comparison with other methods

Check out the example for Gaussian mixtures (example.py). You should be able to run it directly. It should produce a plot of the clustering result. [Figure: output of example.py]
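For readers who want test data of this kind without running example.py, the following sketch draws a Gaussian mixture whose components differ in variance and population size. It does not reproduce example.py. The function `sample_gaussian_mixture` and all the values below are assumptions chosen for illustration.

```python
import numpy as np


def sample_gaussian_mixture(sizes, means, stds, seed=0):
    """Draw isotropic Gaussian components with different sizes and spreads.

    Hypothetical helper for illustration only; not part of the fdc API.
    """
    rng = np.random.default_rng(seed)
    X = np.vstack([
        rng.normal(loc=m, scale=s, size=(n, len(m)))
        for n, m, s in zip(sizes, means, stds)
    ])
    # Ground-truth component label of every sample.
    y = np.repeat(np.arange(len(sizes)), sizes)
    return X, y


# A large broad component, a medium one, and a small tight one.
X_demo, y_demo = sample_gaussian_mixture(
    sizes=[1000, 300, 50],
    means=[(0.0, 0.0), (6.0, 0.0), (3.0, 5.0)],
    stds=[1.5, 0.5, 0.2],
)
print(X_demo.shape, np.bincount(y_demo))
```

Mixing a broad, populous component with a small, tight one gives a simple multiscale test case.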

In another example (example2.py), the algorithm is benchmarked against some sklearn datasets (note that the same parameters are used across all datasets). The results can be compared with those of other clustering methods readily accessible from sklearn.
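A comparison baseline of this kind can be sketched with sklearn alone. The code below is not example2.py: the choice of datasets, the two baseline methods (KMeans and DBSCAN), their parameter values, and the scoring with normalized mutual information are all assumptions made for illustration. Note that KMeans is given the true number of clusters per dataset, whereas the DBSCAN parameters stay fixed across datasets.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs, make_circles, make_moons
from sklearn.metrics import normalized_mutual_info_score
from sklearn.preprocessing import StandardScaler


def make_datasets(n_samples=500, seed=0):
    """Toy 2D datasets, each returned as (X, y_true)."""
    return {
        "circles": make_circles(n_samples=n_samples, factor=0.5, noise=0.05,
                                random_state=seed),
        "moons": make_moons(n_samples=n_samples, noise=0.05, random_state=seed),
        "blobs": make_blobs(n_samples=n_samples, centers=3, random_state=seed),
    }


def benchmark(datasets):
    """Score each baseline on each dataset with normalized mutual information."""
    scores = {}
    for name, (X, y_true) in datasets.items():
        X = StandardScaler().fit_transform(X)
        methods = {
            # KMeans needs the number of clusters; it is taken from the labels.
            "KMeans": KMeans(n_clusters=len(np.unique(y_true)), n_init=10,
                             random_state=0),
            # Fixed DBSCAN parameters for every dataset.
            "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
        }
        for method_name, model in methods.items():
            y_pred = model.fit_predict(X)
            scores[(name, method_name)] = normalized_mutual_info_score(y_true, y_pred)
    return scores


for key, value in benchmark(make_datasets()).items():
    print(key, round(value, 3))
```

Scores from a density-based method such as fdc could be added to the same dictionary to compare methods dataset by dataset.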

Citation

If you use this code in a scientific publication, I would appreciate a citation of or reference to this repository. Also, for further references on clustering and machine learning, check out our machine learning review:

@article{mehta2018high,
  title={A high-bias, low-variance introduction to Machine Learning for physicists},
  author={Mehta, Pankaj and Bukov, Marin and Wang, Ching-Hao and Day, Alexandre GR and Richardson, Clint and Fisher, Charles K and Schwab, David J},
  journal={arXiv preprint arXiv:1803.08823},
  year={2018}
}
