Bayesian Histogram-based Anomaly Detection

Bayesian Histogram-based Anomaly Detection (BHAD)

Python implementation of the BHAD algorithm as presented in Vosseler, A. (2023): BHAD: Explainable anomaly detection using Bayesian histograms. The bhad package follows Scikit-learn's standard API for outlier detection.

Installation

pip install bhad

Usage

1.) Preprocess the input data: discretize continuous features and, optionally, conduct Bayesian model selection.

2.) Train the model on the discretized data.
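To give a feel for what these two steps do, here is a minimal standard-library sketch. It is not the bhad implementation: it assumes equal-width bins and a simple additive (Dirichlet-style) smoothing of the bin frequencies, and the helper names `equal_width_bins`, `smoothed_log_probs` and `histogram_scores` are hypothetical. Observations that fall into rarely populated bins receive low scores.

```python
import math
from collections import Counter

def equal_width_bins(values, nbins):
    """Map each numeric value to a bin index in 0..nbins-1 (equal-width bins)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins or 1.0   # fall back to 1.0 for constant features
    return [min(int((v - lo) / width), nbins - 1) for v in values]

def smoothed_log_probs(bins, nbins, alpha=1.0):
    """Log relative frequency of each bin, with additive smoothing alpha."""
    counts = Counter(bins)
    total = len(bins) + alpha * nbins
    return {b: math.log((counts.get(b, 0) + alpha) / total) for b in range(nbins)}

def histogram_scores(columns, nbins=5, alpha=1.0):
    """Sum the per-feature log probabilities; lower means rarer."""
    scores = [0.0] * len(columns[0])
    for col in columns:
        bins = equal_width_bins(col, nbins)          # step 1: discretize
        logp = smoothed_log_probs(bins, nbins, alpha)  # step 2: fit histogram
        for i, b in enumerate(bins):
            scores[i] += logp[b]
    return scores

# Toy data: two features; the last observation lies far from the rest.
x1 = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 9.0]
x2 = [5.0, 5.2, 4.9, 5.1, 5.0, 5.3, 4.8, 0.5]
scores = histogram_scores([x1, x2])
most_anomalous = min(range(len(scores)), key=scores.__getitem__)
print(most_anomalous)  # index of the observation with the lowest score
```

In this toy setup the last observation is the only one in a sparsely populated bin for both features, so it gets the lowest score. The package itself handles categorical features and the choice of the number of bins, which this sketch leaves out.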

For convenience, these two steps can optionally be wrapped in a scikit-learn pipeline:

from bhad.model import BHAD
from bhad.utils import Discretize
from sklearn.pipeline import Pipeline

num_cols = [...]   # names of numeric features (placeholder, fill in)
cat_cols = [...]   # names of categorical features (placeholder, fill in)

pipe = Pipeline(steps=[
    ('discrete', Discretize(nbins=None)),   # step 1: discretize continuous features
    ('model', BHAD(contamination=0.01, num_features=num_cols, cat_features=cat_cols))   # step 2: train on discrete data
])

For a given dataset, get the binary model decisions:

y_pred = pipe.fit_predict(X=dataset)
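The following sketch illustrates how a contamination rate can turn anomaly scores into binary decisions. It is a toy illustration, not the bhad code: it assumes the scikit-learn convention of labelling outliers as -1 and inliers as +1 (the package is only stated to follow scikit-learn's outlier detection API), and `decisions_from_scores` is a hypothetical helper.

```python
def decisions_from_scores(scores, contamination=0.01):
    """Label the lowest-scoring fraction as outliers (-1), the rest as inliers (+1)."""
    # Sketch-specific choice: always flag at least one observation.
    n_outliers = max(1, round(contamination * len(scores)))
    threshold = sorted(scores)[n_outliers - 1]
    # Ties at the threshold are all flagged, so the count can exceed n_outliers.
    return [-1 if s <= threshold else 1 for s in scores]

# Toy scores: lower means more anomalous.
toy_scores = [-0.9, -1.1, -1.0, -7.5, -0.8, -1.2, -0.95, -1.05, -6.9, -1.0]
y_toy = decisions_from_scores(toy_scores, contamination=0.2)
n_flagged = sum(1 for y in y_toy if y == -1)
print(y_toy, n_flagged)
```

With contamination = 0.2 and ten observations, the two lowest scores are flagged. With the value 0.01 used in the pipeline above, roughly 1% of the observations would be flagged under the same assumption.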

Get a global model explanation as well as explanations for individual observations:

from bhad.explainer import Explainer

local_expl = Explainer(pipe.named_steps['model'], pipe.named_steps['discrete']).fit()

local_expl.get_explanation(nof_feat_expl = 5, append = False)   # individual explanations

local_expl.global_feat_imp                                      # global explanation

A detailed toy example using synthetic data for anomaly detection can be found here, and an example using the Titanic dataset illustrating model explainability can be found here.
