A package for ADESS
Introduction
This is the code for my master's thesis, Anomaly Detection Using an Ensemble with Simple Sub-models (2024). The algorithm explores how effectively an ensemble of simple sub-models, such as linear regression, detects anomalies.
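To make the idea concrete, here is a minimal sketch of an ensemble of simple sub-models used for anomaly scoring. It is not the thesis algorithm. The pairing of one random predictor feature with one random target feature, the residual-based score, the toy data, and the names fit_line, fit_ensemble, and anomaly_score are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b for a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


def fit_ensemble(train, n_submodels=50, seed=0):
    """Hypothetical ensemble: each sub-model predicts one random feature from another."""
    rng = random.Random(seed)
    n_feats = len(train[0])
    models = []
    for _ in range(n_submodels):
        src, dst = rng.sample(range(n_feats), 2)
        xs = [row[src] for row in train]
        ys = [row[dst] for row in train]
        a, b = fit_line(xs, ys)
        # Mean squared training residual, used to normalise residuals at scoring time.
        scale = mean((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) or 1.0
        models.append((src, dst, a, b, scale))
    return models


def anomaly_score(models, row):
    """Mean normalised squared residual over all sub-models (higher = more anomalous)."""
    return mean((row[d] - (a * row[s] + b)) ** 2 / scale for s, d, a, b, scale in models)


# Toy training data: three features that are noisy linear functions of t.
noise = random.Random(1)
train = []
for i in range(100):
    t = i / 100
    train.append([t, 2 * t + noise.gauss(0, 0.05), -t + noise.gauss(0, 0.05)])

models = fit_ensemble(train)
normal = [0.5, 1.0, -0.5]    # follows the training relationships
anomaly = [0.5, -3.0, 2.0]   # breaks them
print(anomaly_score(models, normal), anomaly_score(models, anomaly))
```

A point that follows the relationships seen in training gets small residuals from every sub-model, while a point that breaks them gets large residuals from most of them, so averaging over many cheap sub-models yields a usable anomaly score.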
Installation
Install the package from PyPI:
pip install adess
Parameters
Run adess with --help to list the parameters:
adess --help
usage: adess [-h] --train TRAIN --test TEST [--feat_sel_percent FEAT_SEL_PERCENT] [--max_feats MAX_FEATS] [--order ORDER]
             [--computation_budget COMPUTATION_BUDGET] [--no_submodels NO_SUBMODELS] [--prep PREP] [--extract EXTRACT]
             [--submodel SUBMODEL]

ADESS: Anomaly Detection using Ensemble of simple sub=models

options:
  -h, --help            show this help message and exit
  --train TRAIN         Training data (default: None)
  --test TEST           Testing data (default: None)
  --feat_sel_percent FEAT_SEL_PERCENT
                        Feature selection percentage (default: 0.2)
  --max_feats MAX_FEATS
                        Maximum number of features (default: 50)
  --order ORDER         Degree of polynomials for feature bagging (default: 2)
  --computation_budget COMPUTATION_BUDGET
                        Computation budget in seconds (default: 600)
  --no_submodels NO_SUBMODELS
                        Count of submodels in the ensemble (default: 500)
  --prep PREP           List of preprocessing options (choose one or many): [skel,canny,clahe,blur,augment,gray,norm,std,none] (default: ['norm'])
  --extract EXTRACT     Feature selection option (choose one): [rbm,tsne,pca,ica,nmf,ae,none] (default: pca)
  --submodel SUBMODEL   Submodel type option (choose one): [lin,lasso,elastic,svm] (default: lin)
CLI
adess --train path/to/train --test path/to/test
Example:
adess --train train.npy --test test.npy
Output:
X_train.shape = (353, 10), X_test.shape = (89, 10), 'feat_sel_percent = 0.2', 'max_feats = 50', 'order = 2', 'computation_budget = 600', 'no_submodels = 500', 'prep = norm', 'extract = pca', 'submodel = lin'
100%|█████████████████████████████████████████| 500/500 [00:00<00:00, 1094.41it/s]
Mean of Predicted Y = 8.005149077019986e-32, Count of submodel executed = 500
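The example above assumes that train.npy and test.npy already exist. Judging by the .npy extension and the printed shapes, each file appears to hold a two-dimensional feature matrix without labels. That reading is an assumption, as is the random stand-in data used in this sketch of how such files could be written with NumPy:

```python
import numpy as np

# Synthetic stand-in data: 10 features per row, matching the shapes in the example output.
rng = np.random.default_rng(seed=0)
X_train = rng.normal(size=(353, 10))
X_test = rng.normal(size=(89, 10))

# Assumption: adess reads plain 2-D feature arrays stored in NumPy's .npy format.
np.save("train.npy", X_train)
np.save("test.npy", X_test)

# Sanity check: reload both files and confirm the shapes survived the round trip.
print(np.load("train.npy").shape, np.load("test.npy").shape)  # (353, 10) (89, 10)
```

With real data, replace the random arrays with your own feature matrices before saving, then pass the two paths to --train and --test.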
Python
- Import the sklearn diabetes dataset as an example.
- Split the dataset and pass X_train and X_test to the adess() function. X_test is used to predict 'y'.
- The function returns the mean prediction (of y) and the ensemble size (500 by default), which the interpreter prints.
>>> from adess.adess import adess
>>> from sklearn.datasets import load_diabetes
>>> from sklearn.model_selection import train_test_split
>>> X, y = load_diabetes(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
>>> adess(X_train, X_test)
100%|█████████████████████████████████████████| 500/500 [00:00<00:00, 1162.27it/s]
(3.2565381430216842e-31, 500)
Results from the thesis experiments
The AUROCs of the runs reported in the thesis are stored in this Google Sheet