dbscan1d is a package for DBSCAN on 1D arrays
DBSCAN1D
dbscan1d is a 1D implementation of the DBSCAN algorithm. It was created to efficiently perform clustering on large 1D arrays.
scikit-learn's DBSCAN implementation does not have a special case for 1D input, where calculating the full distance matrix is wasteful. It is much better to sort the input array once and use efficient bisection to find each point's closest neighbors. In the simple profile script included with the package, DBSCAN1D is much faster than scikit-learn's implementation in every case.
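To illustrate the sort-and-bisect idea, here is a minimal sketch using only the Python standard library. It is not dbscan1d's actual code: the helper names `neighbor_counts_1d` and `core_mask_1d` are hypothetical, and the sketch assumes the scikit-learn convention that a point's neighborhood includes the point itself and every point at distance at most eps.

```python
from bisect import bisect_left, bisect_right


def neighbor_counts_1d(values, eps):
    """For each value, count the points (itself included) within eps of it."""
    ordered = sorted(values)  # one O(n log n) sort replaces the distance matrix
    counts = []
    for v in values:
        lo = bisect_left(ordered, v - eps)   # first index with value >= v - eps
        hi = bisect_right(ordered, v + eps)  # first index with value > v + eps
        counts.append(hi - lo)
    return counts


def core_mask_1d(values, eps, min_samples):
    """Flag points with at least min_samples neighbors (assumed convention)."""
    return [c >= min_samples for c in neighbor_counts_1d(values, eps)]


values = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
print(neighbor_counts_1d(values, eps=0.25))           # [3, 3, 3, 2, 2, 1]
print(core_mask_1d(values, eps=0.25, min_samples=3))  # first three are core
```

Each lookup costs O(log n), so counting neighbors for all points takes O(n log n) time and O(n) memory, compared with the O(n^2) entries of a full distance matrix.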
Installation
Simply use pip to install dbscan1d:
pip install dbscan1d
It only requires numpy.
Quickstart
dbscan1d is designed to be interchangeable with sklearn's implementation in almost all cases. The exception is that the weights parameter is not yet supported.
from sklearn.datasets import make_blobs
from dbscan1d.core import DBSCAN1D
# make blobs to test clustering on
X = make_blobs(1_000_000, centers=2, n_features=1)[0]
# init dbscan object
dbs = DBSCAN1D(eps=.5, min_samples=4)
labels = dbs.fit_predict(X)
# show core point indices
dbs.core_sample_indices_
# get values of core points
dbs.components_