A package for ADESS

Project description

Introduction

This package contains the code for my master's thesis, "Anomaly Detection Using an Ensemble with Simple Sub-models" (2024). The thesis explores how effectively an ensemble of simple sub-models, such as linear regression, detects anomalies.
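To make the idea concrete, here is a minimal sketch of one plausible reading of "an ensemble of simple sub-models". It assumes each sub-model is an ordinary least-squares regression that predicts one randomly chosen feature from a random subset of the others, and that a row's anomaly score is its mean squared prediction error over all sub-models. The function name, its parameters, and the scoring rule are illustrative assumptions, not the package's actual implementation.

```python
import numpy as np


def ensemble_anomaly_scores(X_train, X_test, n_submodels=50, feat_fraction=0.5, seed=0):
    """Hypothetical sketch: mean squared residual of many small linear regressions."""
    rng = np.random.default_rng(seed)
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    n_features = X_train.shape[1]
    if n_features < 2:
        raise ValueError("need at least two features")

    # Number of input features per sub-model (at least 1, never the target itself).
    n_inputs = max(1, min(n_features - 1, int(round(feat_fraction * n_features))))

    scores = np.zeros(X_test.shape[0])
    for _ in range(n_submodels):
        # Feature bagging: one random target column, a random subset of the rest as inputs.
        cols = rng.permutation(n_features)
        target, inputs = cols[0], cols[1:1 + n_inputs]

        # Ordinary least squares with an intercept column, fitted on training data only.
        A_train = np.column_stack([X_train[:, inputs], np.ones(len(X_train))])
        coef, *_ = np.linalg.lstsq(A_train, X_train[:, target], rcond=None)

        # Rows that break the relationship learned from training data get large residuals.
        A_test = np.column_stack([X_test[:, inputs], np.ones(len(X_test))])
        residual = X_test[:, target] - A_test @ coef
        scores += residual ** 2

    return scores / n_submodels
```

Under these assumptions, a test row that violates relationships between features that hold in the training data receives a higher score than a row that follows them.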

Installation

Install the package from PyPI:

pip install adess

Parameters

Run adess with --help to list the parameters:

adess --help
usage: adess [-h] --train TRAIN --test TEST [--feat_sel_percent FEAT_SEL_PERCENT] [--max_feats MAX_FEATS] [--order ORDER] [--computation_budget COMPUTATION_BUDGET] [--no_submodels NO_SUBMODELS] [--prep PREP] [--extract EXTRACT]
             [--submodel SUBMODEL]

ADESS: Anomaly Detection using Ensemble of simple sub=models

options:
  -h, --help            show this help message and exit
  --train TRAIN         Training data (default: None)
  --test TEST           Testing data (default: None)
  --feat_sel_percent FEAT_SEL_PERCENT
                        Feature selection percentage (default: 0.2)
  --max_feats MAX_FEATS
                        Maximum number of features (default: 50)
  --order ORDER         Degree of polynomials for feature bagging (default: 2)
  --computation_budget COMPUTATION_BUDGET
                        Computation budget in seconds (default: 600)
  --no_submodels NO_SUBMODELS
                        Count of submodels in the ensemble (default: 500)
  --prep PREP           List of preprocessing options (choose one or many): [skel,canny,clahe,blur,augment,gray,norm,std,none] (default: ['norm'])
  --extract EXTRACT     Feature selection option (choose one): [rbm,tsne,pca,ica,nmf,ae,none] (default: pca)
  --submodel SUBMODEL   Submodel type option (choose one): [lin,lasso,elastic,svm] (default: lin)
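For reference, the options and defaults listed above can be written down as an argparse parser. This is only a sketch reconstructed from the help text: the argument types, the use of nargs="+" for --prep, and the choices lists are assumptions, and the package's real parser may differ.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented adess options and defaults."""
    parser = argparse.ArgumentParser(
        prog="adess",
        description="ADESS: Anomaly Detection using Ensemble of simple sub-models",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("--train", required=True, help="Training data")
    parser.add_argument("--test", required=True, help="Testing data")
    parser.add_argument("--feat_sel_percent", type=float, default=0.2,
                        help="Feature selection percentage")
    parser.add_argument("--max_feats", type=int, default=50,
                        help="Maximum number of features")
    parser.add_argument("--order", type=int, default=2,
                        help="Degree of polynomials for feature bagging")
    parser.add_argument("--computation_budget", type=int, default=600,
                        help="Computation budget in seconds")
    parser.add_argument("--no_submodels", type=int, default=500,
                        help="Count of submodels in the ensemble")
    # Assumption: several preprocessing options are passed space-separated.
    parser.add_argument("--prep", nargs="+", default=["norm"],
                        choices=["skel", "canny", "clahe", "blur", "augment",
                                 "gray", "norm", "std", "none"],
                        help="List of preprocessing options")
    parser.add_argument("--extract", default="pca",
                        choices=["rbm", "tsne", "pca", "ica", "nmf", "ae", "none"],
                        help="Feature selection option")
    parser.add_argument("--submodel", default="lin",
                        choices=["lin", "lasso", "elastic", "svm"],
                        help="Submodel type option")
    return parser
```

Only --train and --test are required; every other option falls back to the default shown in the help output.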

CLI

adess --train path/to/train --test path/to/test

Example:

adess --train train.npy --test test.npy

Output:

X_train.shape = (353, 10), X_test.shape = (89, 10), 'feat_sel_percent = 0.2', 'max_feats = 50', 'order = 2', 'computation_budget = 600', 'no_submodels = 500', 'prep = norm', 'extract = pca', 'submodel = lin'
100%|█████████████████████████████████████████| 500/500 [00:00<00:00, 1094.41it/s]
Mean of Predicted Y = 8.005149077019986e-32, Count of submodel executed = 500
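The example above reads train.npy and test.npy. A minimal sketch for producing two such files with NumPy follows. It assumes the CLI expects 2-D arrays of shape (samples, features) saved with np.save, which is inferred from the file names and the shapes printed in the output. The helper name and the random data are hypothetical placeholders for your own dataset.

```python
from pathlib import Path

import numpy as np


def write_example_arrays(out_dir, n_train=353, n_test=89, n_features=10, seed=0):
    """Hypothetical helper: save random train/test matrices as .npy files."""
    rng = np.random.default_rng(seed)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Placeholder data; replace with your own (samples, features) matrix.
    X = rng.normal(size=(n_train + n_test, n_features))

    train_path = out_dir / "train.npy"
    test_path = out_dir / "test.npy"
    np.save(train_path, X[:n_train])   # first n_train rows
    np.save(test_path, X[n_train:])    # remaining n_test rows
    return train_path, test_path
```

The default sizes match the shapes shown in the output above, (353, 10) and (89, 10).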

Python

  1. Load the scikit-learn diabetes dataset as example data.
  2. Split it into training and test sets and pass both to the adess() function. Predictions of 'y' are made on X_test.
  3. adess() returns the mean predicted y and the number of sub-models executed (500 by default).

>>> from adess.adess import adess
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.model_selection import train_test_split
>>> X, y = load_diabetes(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
>>> adess(X_train, X_test)
100%|█████████████████████████████████████████| 500/500 [00:00<00:00, 1162.27it/s]
(3.2565381430216842e-31, 500)

Results from the thesis experiments

The AUROCs of the runs reported in the thesis are stored in this Google Sheet
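For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen anomaly receives a higher anomaly score than a randomly chosen normal sample, with ties counted as one half. The sketch below computes it directly from that definition. The function name is hypothetical and the code is not taken from the thesis or the package.

```python
def auroc(labels, scores):
    """AUROC by pairwise comparison: labels are 1 for anomalies, 0 for normal samples."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one anomaly and one normal sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # anomaly ranked above the normal sample
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; rank-based implementations are preferable for large test sets.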


File details

Details for the file adess-1.0.0.tar.gz.

File metadata

  • Download URL: adess-1.0.0.tar.gz
  • Upload date:
  • Size: 1.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for adess-1.0.0.tar.gz
Algorithm Hash digest
SHA256 acff65b908d516adbedd1db477de9c94f9820b8756d0d2f5eab8895ee0c02dab
MD5 642cf7812a41322c08d2cb6900e1cc08
BLAKE2b-256 b49ab20d8cf74bbcfdef9c97b74d6614b5aa8bc8e3cb164dc152183493e8b918

File details

Details for the file adess-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: adess-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 13.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for adess-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 437e1f075526c9e7073ac8304ea0545e4f7747370a1e2098688c5a82014ccfd7
MD5 34dcecd1f0f37e98517201117cd60af5
BLAKE2b-256 5f0560aaef819bf3f41c398c1e7c3362b4b85047e8cfaf227763850c2cf93178
