Bayesian Histogram-based Anomaly Detection

Bayesian Histogram Anomaly Detection (BHAD)

Requires Python 3.12+. License: MIT.

A Python implementation of the Bayesian Histogram-based Anomaly Detection (BHAD) algorithm for unsupervised anomaly detection with explainability features.

Overview

BHAD is an explainable anomaly detection method that leverages Bayesian inference and histogram-based modeling to identify outliers in high-dimensional datasets. The algorithm provides both global and local explainability due to its linear structure, making it particularly valuable for applications requiring interpretable results.
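To make the idea of a histogram-based score with a linear structure concrete, here is a simplified sketch. It is an illustration of the general approach only, not the package's implementation: the function names, the equal-width bins, and the symmetric Dirichlet pseudo-count `alpha` are all assumptions made for this example.

```python
import numpy as np


def fit_histograms(X, nbins=10, alpha=1.0):
    """Estimate per-feature bin edges and smoothed log bin probabilities.

    alpha is a symmetric Dirichlet pseudo-count added to every bin, so the
    probabilities are posterior means and empty bins never get probability 0.
    (Illustrative assumption, not the package's actual prior.)
    """
    edges, log_probs = [], []
    for j in range(X.shape[1]):
        counts, e = np.histogram(X[:, j], bins=nbins)
        probs = (counts + alpha) / (counts.sum() + alpha * nbins)
        edges.append(e)
        log_probs.append(np.log(probs))
    return edges, log_probs


def score_contributions(X, edges, log_probs):
    """Return an (n_samples, n_features) array of -log p(bin) terms."""
    contrib = np.empty(X.shape, dtype=float)
    for j, (e, lp) in enumerate(zip(edges, log_probs)):
        # Use interior edges only, so out-of-range values land in the
        # first or last bin instead of raising an index error.
        idx = np.digitize(X[:, j], e[1:-1])
        contrib[:, j] = -lp[idx]
    return contrib


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
X[0] = [8.0, 0.0, 0.0]  # plant one extreme value in feature 0

edges, log_probs = fit_histograms(X)
contrib = score_contributions(X, edges, log_probs)
scores = contrib.sum(axis=1)  # the score is a plain sum over features

# Local explanation: which feature contributes most to row 0's score?
top_feature = int(np.argmax(contrib[0]))
print(top_feature, scores[0])
```

Because the score is a sum of per-feature terms, each row's score decomposes into feature contributions (a local explanation), and averaging those contributions over all rows gives a rough global picture.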

Key Features

  • Explainable AI: Provides both global and local explanations for anomaly predictions
  • Bayesian Approach: Uses Bayesian inference for robust uncertainty quantification
  • High-Dimensional Data: Handles high-dimensional datasets effectively
  • Unsupervised Learning: No labeled data required for training
  • Linear Structure: Interpretable model architecture

Installation

Using uv

Install package via uv:

uv venv --python 3.12
uv add bhad

Using pip

python3 -m venv .venv
source .venv/bin/activate
pip install bhad

Quick Start

import numpy as np
import pandas as pd
from bhad.model import BHAD

# Simulated data for illustration; replace with your own DataFrame
X = pd.DataFrame(np.random.randn(1000, 10),
                 columns=[f'feature_{i}' for i in range(10)])

# Create BHAD model with integrated discretization
model = BHAD(contamination=0.01, nbins=None, verbose=False)

# Fit the model and predict anomalies
anomaly_labels = model.fit_predict(X)        # Returns -1 for outliers, 1 for inliers
anomaly_scores = model.decision_function(X)
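A common next step is to inspect the rows that were flagged. The sketch below relies only on the label convention stated in the Quick Start (-1 for outliers, 1 for inliers) and uses pandas alone. The helper `flagged_rows` and the stand-in arrays are hypothetical, so the snippet runs without a fitted model; it makes no assumption about the direction of the scores.

```python
import numpy as np
import pandas as pd


def flagged_rows(X, labels, scores):
    """Return the rows labelled -1 (outliers) with label and score attached."""
    out = X.copy()
    out["anomaly_label"] = np.asarray(labels)
    out["anomaly_score"] = np.asarray(scores)
    return out[out["anomaly_label"] == -1]


# Stand-in values for illustration. In practice, use
# labels = model.fit_predict(X) and scores = model.decision_function(X).
X = pd.DataFrame({"feature_0": [0.1, 5.2, -0.3],
                  "feature_1": [1.0, -4.8, 0.2]})
labels = np.array([1, -1, 1])
scores = np.array([0.4, 3.1, 0.5])

outliers = flagged_rows(X, labels, scores)
print(f"{len(outliers)} of {len(X)} rows flagged")
print(outliers)
```

Comparing the flagged fraction with the `contamination` value passed to the model is a quick sanity check, assuming that parameter sets the expected share of outliers, as it does in scikit-learn style detectors.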

Documentation

For detailed usage examples, API reference, and tutorials, visit our documentation.

Examples

The package includes Jupyter notebooks with practical examples:

  • Toy_Example.ipynb: Simulated data demonstration
  • Titanic_Example.ipynb: Real-world dataset application

Research & Publications

This implementation is based on the following research papers:

  1. Vosseler, A. (2022): Unsupervised Insurance Fraud Prediction Based on Anomaly Detector Ensembles

  2. Vosseler, A. (2023): BHAD: Explainable anomaly detection using Bayesian histograms

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Alexander Vosseler

Citation

If you use BHAD in your research, please cite:

@article{vosseler2022unsupervised,
  title={Unsupervised Insurance Fraud Prediction Based on Anomaly Detector Ensembles},
  author={Vosseler, Alexander},
  journal={Risks},
  volume={10},
  number={7},
  year={2022},
  month={June}
}
