flex-anomalies

flex-anomalies is a Python library dedicated to anomaly detection in machine learning. It offers a wide range of algorithms and techniques, including models based on distance, density, trees, and neural networks such as convolutional and recurrent architectures. The library also provides aggregators, anomaly score processing techniques, and pre-processing techniques for data.

Anomaly detection examines data to find observations that deviate from the expected pattern, with the goal of cleaning data sets and flagging anomalies for further analysis.

Details

Anomaly Detection with FLEXible Federated Learning: This repository contains implementations of anomaly detection algorithms built on the FLEXible federated learning library. FLEXible is a Python library for running federated learning in an efficient and scalable manner. The implementations draw on a study of state-of-the-art research on federated learning for network intrusion detection.

This repository also includes:

  • An organized folder structure that makes it easy to navigate and understand the project.
  • Explanatory notebooks showing practical examples and detailed explanations for the use of the library.

Folder structure

  • flexanomalies/pool: Contains the aggregators and primitives for each model, following the FLEXible structure.
  • flexanomalies/utils: Contains the source code of the anomaly detection algorithms, the anomaly score processing techniques, the evaluation metrics, a function to federate a centralized dataset using FLEXible, and data loading.
  • flexanomalies/datasets: Contains some pre-processing techniques for data.
  • notebooks: Contains explanatory notebooks showing how to use the anomaly detection algorithms on data.
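
To make the two federated steps named in the folder list concrete (federating a centralized dataset and aggregating per-client results), here is a minimal, library-independent sketch. It does not use the FLEXible or flexanomalies APIs: `split_across_clients` and `weighted_average` are hypothetical helper names, and the round-robin split and size-weighted averaging are assumptions chosen for illustration, not necessarily what the aggregators in flexanomalies/pool do.

```python
from typing import Dict, List, Sequence


def split_across_clients(rows: Sequence, n_clients: int) -> Dict[int, List]:
    """Deal rows out round-robin so every client gets a near-equal shard."""
    if n_clients < 1:
        raise ValueError("n_clients must be at least 1")
    shards: Dict[int, List] = {cid: [] for cid in range(n_clients)}
    for i, row in enumerate(rows):
        shards[i % n_clients].append(row)
    return shards


def weighted_average(client_params: Dict[int, List[float]],
                     client_sizes: Dict[int, int]) -> List[float]:
    """Average parameter vectors, weighting each client by its shard size."""
    total = sum(client_sizes.values())
    if total == 0:
        raise ValueError("no samples to weight by")
    n_params = len(next(iter(client_params.values())))
    averaged = [0.0] * n_params
    for cid, params in client_params.items():
        weight = client_sizes[cid] / total
        for j, value in enumerate(params):
            averaged[j] += weight * value
    return averaged


# Toy run: each client's "model" is just the mean of its local values.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
shards = split_across_clients(data, n_clients=2)
local_params = {cid: [sum(s) / len(s)] for cid, s in shards.items()}
sizes = {cid: len(s) for cid, s in shards.items()}
global_params = weighted_average(local_params, sizes)
```

In a real federated run the per-client parameters would come from local training rather than a mean, and the raw data would never leave the clients; only the parameter vectors reach the aggregation step.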

Explanatory Notebooks

  • AnomalyDetection_Autoencoder_FLEX.ipynb: A step-by-step example of how to use the AutoEncoder model for anomaly detection with federated learning on static data.
  • AnomalyDetection_AutoEncoder_FLEX_ts.ipynb: A step-by-step example of how to use the AutoEncoder model for anomaly detection with federated learning on time series. It covers the structure of the sliding window, data federation, federated training, and model evaluation at the server and client level.
  • AnomalyDetection_PCA_FLEX.ipynb: A notebook demonstrating the application of PCA_Anomaly for anomaly detection with federated learning on a static dataset.
  • AnomalyDetection_Cluster_FLEX.ipynb: A step-by-step example of how to use the ClusterAnomaly model for anomaly detection with federated learning on static data, including evaluation of the model on test sets.
  • AnomalyDetection_IsolationForest_FLEX.ipynb: An example of how to use the IsolationForest model with federated learning on an example set of static data, from data federation and training to model evaluation on a test set.
  • AnomalyDetection_CNNN_LSTM_FLEX_ts.ipynb: A notebook showing the use of the DeepCNN_LSTM model with federated learning for anomaly detection in time series. It covers the structure of the sliding window, data federation, federated training, and model evaluation at the server and client level.
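
The time series notebooks rely on a sliding window to turn one long series into fixed-length training samples. The sketch below shows the general idea in plain Python. `sliding_windows`, `window`, and `step` are hypothetical names introduced here; the notebooks may build their windows differently.

```python
from typing import List, Sequence


def sliding_windows(series: Sequence[float], window: int,
                    step: int = 1) -> List[List[float]]:
    """Cut a series into overlapping windows of fixed length.

    Consecutive windows start `step` positions apart; a trailing
    remainder shorter than `window` is dropped.
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be at least 1")
    return [list(series[start:start + window])
            for start in range(0, len(series) - window + 1, step)]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
windows = sliding_windows(series, window=3)
# windows == [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
```

A larger `step` reduces the overlap between samples, which trades training set size for less redundancy between neighbouring windows.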

Features

For more information on the implemented algorithms see the table that follows:

| Model | Description | Citation |
|---|---|---|
| IsolationForest | Anomaly detection algorithm that isolates anomalies using binary trees. | Liu, F.T., Ting, K.M. and Zhou, Z.H., 2008, December. Isolation forest. In International Conference on Data Mining, pp. 413-422. IEEE. |
| PCA_Anomaly | Outlier detection based on principal component analysis (PCA). Outlier scores are obtained as the sum of the weighted Euclidean distances between each sample and the hyperplane constructed from the selected eigenvectors. | Shyu, M.L., Chen, S.C., Sarinnapakorn, K. and Chang, L., 2003. A novel anomaly detection scheme based on principal component classifier. MIAMI UNIV CORAL GABLES FL DEPT OF ELECTRICAL AND COMPUTER ENGINEERING. |
| ClusterAnomaly | Clustering-based model. Outlier scores are computed solely from the distance of each sample to the closest large cluster center; k-means is used as the clustering algorithm. | Chawla, S., & Gionis, A. (2013, May). k-means--: A unified approach to clustering and outlier detection. In Proceedings of the 2013 SIAM International Conference on Data Mining (pp. 189-197). |
| DeepCNN_LSTM | Neural network model for time series and static data that combines convolutional and recurrent architectures. | Aguilera-Martos, I., García-Vico, Á. M., Luengo, J., Damas, S., Melero, F. J., Valle-Alonso, J. J., & Herrera, F. (2022). TSFEDL: A Python Library for Time Series Spatio-Temporal Feature Extraction and Prediction using Deep Learning (with Appendices on Detailed Network Architectures and Experimental Cases of Study). arXiv preprint arXiv:2206.03179. |
| AutoEncoder | Fully connected autoencoder for time series and static data. The network learns useful data representations in an unsupervised way and detects anomalies from the reconstruction error. | Aggarwal, C.C., 2015. Outlier analysis. In Data mining (pp. 237-263), Ch. 3. Springer, Cham. |
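
As an illustration of the distance-based scoring described for ClusterAnomaly, the following sketch scores each point by its distance to the nearest cluster center and then flags the highest-scoring fraction. It is not the library's implementation. The centers are assumed to be already fitted, every center is treated as a "large" cluster, and `nearest_center_scores`, `flag_top_fraction`, and `contamination` are hypothetical names.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def nearest_center_scores(points: Sequence[Point],
                          centers: Sequence[Point]) -> List[float]:
    """Score each point by its Euclidean distance to the closest center."""
    return [min(math.dist(p, c) for c in centers) for p in points]


def flag_top_fraction(scores: Sequence[float],
                      contamination: float) -> List[int]:
    """Label the highest-scoring fraction as anomalies (1), the rest as 0.

    Points tied with the threshold score are flagged too, so slightly
    more than the requested fraction can be labelled anomalous.
    """
    if not 0.0 < contamination <= 1.0:
        raise ValueError("contamination must be in (0, 1]")
    if not scores:
        return []
    n_flagged = max(1, round(contamination * len(scores)))
    threshold = sorted(scores, reverse=True)[n_flagged - 1]
    return [1 if s >= threshold else 0 for s in scores]


# Assumed, pre-fitted centers (for example from a k-means run).
centers = [(0.0, 0.0), (5.0, 5.0)]
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
scores = nearest_center_scores(points, centers)
labels = flag_top_fraction(scores, contamination=0.2)
# labels == [0, 0, 0, 0, 1]: only (10.0, 0.0) lies far from both centers.
```

The same thresholding step works for any model in the table that produces a per-sample score, such as a reconstruction error.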

Installation

FLEX-Anomalies is available on PyPI and can be installed with pip:

pip install flexanomalies

Install the necessary dependencies:

pip install -r requirements.txt
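
To confirm that the installation worked, one option is to query the installed distribution's version from Python. This is a small sketch using only the standard library (Python 3.8 or later). `installed_version` is a hypothetical helper, and the distribution name `flexanomalies` is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("flexanomalies"))
```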

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use this repository in your research work, please cite the Flexible paper:
