
dbscan1d is a package for DBSCAN on 1D arrays

Project description


dbscan1d is a 1D implementation of the DBSCAN algorithm. It was created to efficiently perform clustering on large 1D arrays.

scikit-learn's DBSCAN implementation does not have a special case for 1D, where calculating the full distance matrix is wasteful. It is much better to simply sort the input array and perform efficient bisects to find the closest points. The simple profile script included with the package compares the two implementations; in every case DBSCAN1D is much faster than scikit-learn's.
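To illustrate the sort-and-bisect idea, here is a minimal sketch in pure Python. It is not the package's actual code: the function names count_neighbors_1d and core_mask_1d are hypothetical, and the sketch assumes the usual DBSCAN convention that a point's neighborhood includes the point itself and every point at distance less than or equal to eps.

```python
from bisect import bisect_left, bisect_right


def count_neighbors_1d(values, eps):
    """For each value, count the points within eps of it (itself included).

    Hypothetical helper for illustration; not part of dbscan1d's API.
    """
    # Sort once, remembering where each sorted value came from.
    order = sorted(range(len(values)), key=values.__getitem__)
    sorted_vals = [values[i] for i in order]

    counts = [0] * len(values)
    for rank, original_index in enumerate(order):
        v = sorted_vals[rank]
        # Two binary searches bound the window [v - eps, v + eps].
        lo = bisect_left(sorted_vals, v - eps)
        hi = bisect_right(sorted_vals, v + eps)
        counts[original_index] = hi - lo
    return counts


def core_mask_1d(values, eps, min_samples):
    """Mark points whose eps-neighborhood holds at least min_samples points."""
    return [c >= min_samples for c in count_neighbors_1d(values, eps)]


values = [10.0, 0.25, 5.0, 0.0, 10.25, 0.5]
print(count_neighbors_1d(values, eps=0.5))
print(core_mask_1d(values, eps=0.5, min_samples=3))
```

The sort costs O(n log n) and each point needs only two O(log n) searches, so no n-by-n distance matrix is ever built.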



Simply use pip to install dbscan1d:

pip install dbscan1d

It only requires numpy.


dbscan1d is designed to be interchangeable with sklearn's implementation in almost all cases. The exception is that the weights parameter is not yet supported.

from sklearn.datasets import make_blobs

from dbscan1d.core import DBSCAN1D

# make blobs to test clustering
X = make_blobs(1_000_000, centers=2, n_features=1)[0]

# init dbscan object
dbs = DBSCAN1D(eps=.5, min_samples=4)

# get labels for each point
labels = dbs.fit_predict(X)

# show core point indices
dbs.core_sample_indices_

# get values of core points
dbs.components_
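
Once labels are available, a quick summary is often the next step. The sketch below uses a hypothetical helper, summarize_labels, and assumes the sklearn convention that noise points are labeled -1; since dbscan1d is designed to be interchangeable with sklearn, that convention is expected to hold here as well, but treat it as an assumption.

```python
from collections import Counter


def summarize_labels(labels):
    """Summarize a DBSCAN label array.

    Hypothetical helper for illustration; assumes noise is labeled -1.
    """
    # int() normalizes numpy integer scalars to plain Python ints.
    counts = Counter(int(label) for label in labels)
    n_noise = counts.pop(-1, 0)
    return {
        "n_clusters": len(counts),
        "n_noise": n_noise,
        "sizes": dict(counts),
    }


# Works on any iterable of integer labels, e.g. the output of fit_predict.
print(summarize_labels([0, 0, 1, -1, 1, 1, -1]))
```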

Files for dbscan1d, version 0.1.4

Filename                           Size     File type   Python version
dbscan1d-0.1.4-py3-none-any.whl    7.0 kB   Wheel       py3
dbscan1d-0.1.4.tar.gz              4.7 kB   Source      None
