Fast Density Clustering in low-dimension
Fast density clustering (fdc)
Python package for clustering low-dimensional data using kernel density maps and a density graph. Examples for Gaussian mixtures and some benchmarks are provided. Our algorithm solves multiscale problems (multiple variances/densities and population sizes) and works for non-convex clusters. It uses cross-validation and is regularized by two main global parameters: a neighborhood size and a noise threshold measure. The latter detects spurious cluster centers, while the former guarantees that only local information is used to infer cluster centers.
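To make the role of the neighborhood size concrete, here is a minimal sketch of the general idea of picking cluster centers as local maxima of a kernel density estimate. It is not the fdc implementation: the function names (knn_indices, gaussian_kde, local_density_maxima), the fixed bandwidth and the brute-force neighbor search are all assumptions made for illustration, and the noise threshold step is left out entirely.

```python
import math


def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (brute force, for clarity)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def gaussian_kde(points, bandwidth):
    """Unnormalized Gaussian kernel density estimate evaluated at every point."""
    return [
        sum(math.exp(-math.dist(p, q) ** 2 / (2 * bandwidth ** 2)) for q in points)
        for p in points
    ]


def local_density_maxima(points, bandwidth, k):
    """Return indices of points that are denser than all k of their neighbors."""
    density = gaussian_kde(points, bandwidth)
    centers = [
        i
        for i in range(len(points))
        if all(density[i] > density[j] for j in knn_indices(points, i, k))
    ]
    return centers, density


def hexagon(cx, cy):
    """A center point followed by six points on a unit circle around it."""
    ring = [
        (cx + math.cos(math.pi * t / 3), cy + math.sin(math.pi * t / 3))
        for t in range(6)
    ]
    return [(cx, cy)] + ring


# Two well-separated toy clusters; the hypothetical parameters are chosen by hand.
points = hexagon(0.0, 0.0) + hexagon(10.0, 0.0)
centers, density = local_density_maxima(points, bandwidth=1.0, k=6)
print(centers)  # indices of the two hexagon centers: [0, 7]
```

Because each point is only compared with its k nearest neighbors, a candidate center is decided from local information alone, which is the property the neighborhood size is meant to guarantee. On noisy data this comparison also produces spurious maxima, which is where a noise threshold becomes necessary.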
The underlying code is based on fast KD-trees for nearest-neighbor searches. For low-dimensional spaces, the algorithm has a time complexity of O(n log n), where n is the size of the dataset. It also has a memory complexity of O(n).
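For readers unfamiliar with KD-trees, the following standard-library sketch shows how such a tree is built and queried. It only illustrates the data structure; it is not the tree used by fdc, and the names build_kdtree and nearest are hypothetical. Re-sorting at every level makes this particular build O(n log² n), whereas optimized libraries use faster median selection.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively build a KD-tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the coordinates
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # the median point splits the space
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, target, depth=0, best=None):
    """Return (distance, point) for the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    d = math.dist(point, target)
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % len(target)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


sample = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(sample)
distance, closest = nearest(tree, (9, 2))
print(closest)  # (8, 1)
```

The pruning test in nearest is what keeps a typical query close to O(log n) in low dimensions: whole subtrees are skipped whenever the splitting plane is farther away than the best match found so far. The tree stores each point exactly once, so memory grows linearly with n.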
Installing
I suggest you install the code using pip from an Anaconda Python 3 environment. From that environment:
pip install fdc
That's it! You can now import the package fdc from your Python scripts. Check out the examples in the file example and see if you can run the scripts provided.
Examples and comparison with other methods
Check out the example for Gaussian mixtures (example.py). You should be able to run it directly. Running it should produce a plot of the clustering result.
In another example (example2.py), the algorithm is benchmarked against several sklearn datasets, using the same parameters across all datasets. The results can be compared with other clustering methods easily accessible from sklearn.
Citation
If you use this code in a scientific publication, I would appreciate a citation of or reference to this repository. For further references on clustering and machine learning, check out our machine learning review:
@article{mehta2018high,
title={A high-bias, low-variance introduction to Machine Learning for physicists},
author={Mehta, Pankaj and Bukov, Marin and Wang, Ching-Hao and Day, Alexandre GR and Richardson, Clint and Fisher, Charles K and Schwab, David J},
journal={arXiv preprint arXiv:1803.08823},
year={2018}
}