DBSOD: Density-Based Spatial Outlier Detection

Official implementation of "DBSOD: Density-Based Spatial Outlier Detection". A preprint of the paper is coming soon.

Algorithm

While DBSCAN is a widely used clustering algorithm, it only provides a binary label for outliers and does not assign a continuous outlierness score. To address this limitation, we propose DBSOD, a density-based spatial outlier detection method inspired by DBSCAN. The algorithm estimates the consistency with which a data point is identified as an outlier across a range of neighborhood sizes:

[Figure: DBSOD algorithm]

The algorithm systematically varies the neighborhood size parameter $\epsilon$, evaluating outlierness across multiple density assumptions. By aggregating binary outlier classifications across these scales, it produces a normalized outlierness score for each point, reflecting how consistently the point is identified as an outlier.
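The aggregation described above can be illustrated with a small NumPy sketch. This is not the package's implementation: the function name `outlierness_scores` is hypothetical, and the sketch assumes Euclidean distance, a closed neighborhood (distance `<= eps`), and that a point does not count as its own neighbor. For each `eps`, a point is treated as an outlier if it is neither a core point nor within `eps` of a core point, and the score is the fraction of `eps` values for which that holds.

```python
import numpy as np


def outlierness_scores(X, eps_space, min_pts):
    """Fraction of eps values for which each point is a DBSCAN-style outlier."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (N, N).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    outlier_votes = np.zeros(len(X))
    for eps in eps_space:
        within = dist <= eps
        # Assumption: a point is not counted as its own neighbor.
        np.fill_diagonal(within, False)
        is_core = within.sum(axis=1) >= min_pts
        # Border points have at least one core point in their neighborhood.
        near_core = (within & is_core[None, :]).any(axis=1)
        outlier_votes += ~(is_core | near_core)
    return outlier_votes / len(eps_space)


data = np.array([
    [0.35, 0.18],
    [0.60, 0.16],
    [0.40, 0.18],
    [0.40, 0.30],
    [0.30, 0.70],
])
scores = outlierness_scores(data, eps_space=[0.15, 0.22], min_pts=2)
print(scores)  # scores 0, 0.5, 0, 0, 1 for the five points
```

Under these assumptions the sketch reproduces the scores reported in the Usage section for the same data. Because the score is an average of binary votes, it can only take values that are multiples of 1 / len(eps_space), so a denser `eps_space` gives a finer-grained score.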

It is also possible to estimate an outlierness score for unseen data (novelty detection). Here, each new data point is treated as a non-core candidate for expanding a cluster obtained from the training data. The algorithm then estimates how consistently, across the range of neighborhood sizes, the new point fails to expand any cluster.
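The novelty-detection rule can be sketched in the same spirit. Again, this is an illustration rather than the package's code: `novelty_scores` is a hypothetical name, and the sketch assumes Euclidean distance, a closed neighborhood, and that a training point is not its own neighbor. A new point "expands a cluster" at a given `eps` if it lies within `eps` of at least one core point of the training data. Its score is the fraction of `eps` values for which it does not.

```python
import numpy as np


def novelty_scores(X_train, X_new, eps_space, min_pts):
    """Fraction of eps values for which a new point is not reachable from any core point."""
    X_train = np.asarray(X_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    # Distances among training points, shape (N, N).
    train_dist = np.linalg.norm(X_train[:, None, :] - X_train[None, :, :], axis=-1)
    # Assumption: a training point is not counted as its own neighbor.
    np.fill_diagonal(train_dist, np.inf)
    # Distances from new points to training points, shape (M, N).
    new_dist = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=-1)
    votes = np.zeros(len(X_new))
    for eps in eps_space:
        is_core = (train_dist <= eps).sum(axis=1) >= min_pts
        # A new point joins a cluster if some core training point is within eps.
        reachable = (new_dist[:, is_core] <= eps).any(axis=1)
        votes += ~reachable
    return votes / len(eps_space)


train = np.array([
    [0.35, 0.18],
    [0.60, 0.16],
    [0.40, 0.18],
    [0.40, 0.30],
    [0.30, 0.70],
])
gx, gy = np.meshgrid(np.linspace(0.2, 0.5, 3), np.linspace(0.2, 0.5, 3))
grid = np.column_stack([gx.ravel(), gy.ravel()])
novelty = novelty_scores(train, grid, eps_space=[0.15, 0.22], min_pts=2)
print(novelty)
```

Note that the new points never change which training points are core, so the training-side work depends only on the training data and `eps_space`.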

Installation

Note: the package was developed for Linux (manylinux, x86_64) machines.

You can install the package using pip:

pip install dbsod

Alternatively (for instance, if you want to contribute), you can clone this repository, build dbsod, and install it into .venv in editable mode:

git clone https://github.com/Kowd-PauUh/dbsod.git
cd dbsod
make install_g++
make venv

Usage

Take this dataset as an example:

import numpy as np

DATA = np.array([
    [0.35, 0.18],
    [0.60, 0.16],
    [0.40, 0.18],
    [0.40, 0.30],
    [0.30, 0.70],
])

We can use DBSOD to calculate an outlierness score for each point:

from dbsod import dbsod, DBSOD

EPS_SPACE = [0.15, 0.22]  # `eps` parameters used for calculating normalized outlierness score
MIN_PTS = 2               # minimum number of neighbors for the data point to become "core" point

# Initialize model
model = DBSOD(
    eps_space=EPS_SPACE,
    min_pts=MIN_PTS,
)

# Fit the model and get outlierness scores of the training data points.
# This can also be achieved using:
#     `model.fit_predict(X=DATA)` or `dbsod(X=DATA, eps_space=EPS_SPACE, min_pts=MIN_PTS)`
outlierness_scores = model.fit(X=DATA).outlierness_score
print(outlierness_scores)

The output will be: array([0. , 0.5, 0. , 0. , 1. ]).

[Figure: visualization of this example]

Once the model is fitted, we can estimate outlierness scores for new data points:

# prepare a 3x3 grid of points
x, y = np.meshgrid(np.linspace(0.2, 0.5, 3), np.linspace(0.2, 0.5, 3))
points = np.column_stack([x.ravel(), y.ravel()])

# estimate outlierness score for each new point
predictions = model.predict(points)
print(predictions)

The output of this is: array([0.5, 0. , 0. , 0.5, 0. , 0. , 1. , 0.5, 1. ]).

Notice how we can use DBSOD to estimate outlierness of the points in a grid with an arbitrary resolution. This effectively allows us to create outlierness heatmaps.
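As a rough sketch of how such a heatmap grid could be assembled, the helper below evaluates any scoring callable on a regular 2D grid and reshapes the result into an image-like array. The helper name `outlierness_grid` and the stand-in scorer are hypothetical and exist only so the sketch runs on its own. With a fitted model, the `model.predict` method shown above would be passed instead.

```python
import numpy as np


def outlierness_grid(predict, x_range, y_range, resolution):
    """Evaluate `predict` on a resolution x resolution grid over the given ranges."""
    xs = np.linspace(*x_range, resolution)
    ys = np.linspace(*y_range, resolution)
    xx, yy = np.meshgrid(xs, ys)
    points = np.column_stack([xx.ravel(), yy.ravel()])
    scores = np.asarray(predict(points), dtype=float)
    # Rows correspond to y values, columns to x values.
    return xs, ys, scores.reshape(resolution, resolution)


# Stand-in scorer (hypothetical) so the sketch runs without a fitted model;
# with a fitted DBSOD instance, pass `model.predict` instead.
def distance_from_origin(points):
    return np.clip(np.linalg.norm(points, axis=1), 0.0, 1.0)


xs, ys, heat = outlierness_grid(distance_from_origin, (0.0, 1.0), (0.0, 1.0), resolution=50)
print(heat.shape)  # (50, 50)
```

The returned array can then be drawn with any image plotting routine, for example matplotlib's `imshow` with `origin="lower"` and an `extent` built from the x and y ranges.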

On real-world data (see the linked examples), the result of applying DBSOD looks like this:

[Figure: DBSOD applied to real-world data]

Time and space complexity

| Method | Time complexity (worst case) | Space complexity (worst case) |
|---|---|---|
| .fit | $O(N^2 \cdot (d + \log N))$ | $O(N^2)$ |
| .predict | $O(N \cdot M \cdot d)$ | $O(N + M)$ |

These bounds assume that len(eps_space) $\ll N$, where:
  $N$ – number of points used to fit the algorithm;
  $M$ – number of points used to predict scores for;
  $d$ – point dimensionality.
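To get a feel for what quadratic worst-case space means in practice, the following back-of-the-envelope sketch estimates memory use for `.fit`. It assumes, purely for illustration, that the worst case corresponds to storing one 8-byte value per pair of training points. The actual implementation may store less, and the helper name `fit_memory_gib` is hypothetical.

```python
def fit_memory_gib(n_points, bytes_per_entry=8):
    """Rough worst-case memory in GiB, assuming one entry per pair of points."""
    return n_points ** 2 * bytes_per_entry / 2 ** 30


for n in (1_000, 10_000, 100_000):
    print(f"N = {n:>7,}: ~{fit_memory_gib(n):.3f} GiB")
```

Under this assumption, growing the training set tenfold increases the worst-case memory a hundredfold, which is worth keeping in mind when choosing how many points to fit on.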

Citation

@software{danylenko2025dbsod, 
  author = {Danylenko, Ivan},
  doi = {10.5281/zenodo.19557644},
  month = apr,
  title = {{DBSOD: Density-Based Spatial Outlier Detection}},
  url = {https://github.com/Kowd-PauUh/dbsod},
  version = {0.1.0},
  year = {2026}
}
