Graph Outlier/Anomaly Detection in Python

PyGOD is a comprehensive Python library for detecting outlying objects in graphs. This exciting yet challenging field has many key applications in fraud detection [6] and fake news detection [4].

PyGOD includes more than 10 of the latest graph-based detection algorithms, such as Dominant (SDM’19) and CoLA (TNNLS’21). For consistency and accessibility, PyGOD is developed on top of PyTorch Geometric (PyG) and PyTorch, and follows the API design of PyOD. See the example below for detecting anomalies with a GNN in 5 lines!

PyGOD is under active development and will be updated frequently! Please star, watch, and fork.

PyGOD is featured for:

  • Unified APIs, detailed documentation, and interactive examples across various graph-based algorithms.

  • Comprehensive coverage of more than 10 of the latest graph neural networks (GNNs).

  • Full support of detections at multiple levels, such as node-, edge-, and graph-level tasks (WIP).

  • Streamlined data processing with PyG: fully compatible with PyG data objects.

Outlier Detection Using GNN with 5 Lines of Code:

# train a DOMINANT detector
from pygod.models import DOMINANT

model = DOMINANT()  # hyperparameters can be set here
model.fit(data)  # data is a PyTorch Geometric data object

# get outlier scores on the input data
outlier_scores = model.decision_scores_  # raw outlier scores on the input data

# predict on the new data
outlier_scores = model.decision_function(test_data)  # raw outlier scores on the test data

Citing PyGOD (to be updated soon):

The PyGOD paper is available on arXiv and is under review. If you use PyGOD in a scientific publication, we would appreciate citations to the following paper:

@article{tbd,
  author  = {tbd},
  title   = {PyGOD: A Comprehensive Python Library for Graph Outlier Detection},
  journal = {tbd},
  year    = {2022},
  url     = {tbd}
}

or:

tbd, tbd and tbd, 2022. PyGOD: A Comprehensive Python Library for Graph Outlier Detection. tbd.

Installation

It is recommended to use pip or conda (wip) for installation. Please make sure the latest version is installed, as PyGOD is updated frequently:

pip install pygod            # normal install
pip install --upgrade pygod  # or update if needed

Alternatively, you can clone the repository and install it from source:

git clone https://github.com/pygod-team/pygod.git
cd pygod
pip install .

Required Dependencies:

  • Python 3.6+

  • argparse>=1.4.0

  • numpy>=1.19.4

  • scikit-learn>=0.22.1

  • networkx>=2.6.3

  • scipy>=1.5.2

  • pandas>=1.1.3

  • setuptools>=50.3.1.post20201107

Note on PyG and PyTorch Installation: PyGOD depends on PyTorch Geometric (PyG), PyTorch, and networkx. To streamline the installation, PyGOD does NOT install these libraries for you. Please install them separately before running PyGOD:

  • torch>=1.10

  • pytorch_geometric>=2.0.3

  • networkx>=2.6.3
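
Since these libraries are not installed automatically, it may help to check what is already present before importing PyGOD. The snippet below is a minimal sketch using only the standard library (importlib.metadata, which needs Python 3.8 or newer). The helper name installed_version and the distribution names passed to it are assumptions for illustration and are not part of PyGOD.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them to match your environment.
for name in ("torch", "torch_geometric", "networkx"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

Compare the printed versions against the minimums listed above before running the examples.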


API Cheatsheet & Reference

Full API Reference: (https://pygod.readthedocs.io/en/latest/pygod.html). API cheatsheet for all detectors:

  • fit(G): Fit the detector on PyG data G.

  • decision_function(G): Predict raw anomaly scores of PyG data G using the fitted detector.

  • predict(G): Predict whether each node in PyG data G is an outlier using the fitted detector.

  • predict_proba(G): Predict the probability of each node in PyG data G being an outlier using the fitted detector.

  • predict_confidence(G): Predict the model’s node-wise confidence (available in predict and predict_proba) [8].
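
To make the relationship between the methods above more concrete, here is a minimal sketch of how raw scores could be turned into binary predictions and probabilities. It assumes a PyOD-style convention (a score threshold for predict, and linear min-max scaling over the training score range for predict_proba). The function names scores_to_labels and scores_to_proba are hypothetical, and PyGOD's actual implementation may differ.

```python
def scores_to_labels(scores, threshold):
    """Binary prediction: 1 for outlier, 0 for inlier."""
    return [int(s > threshold) for s in scores]


def scores_to_proba(scores, train_min, train_max):
    """Rescale raw scores to [0, 1] using the score range seen during training."""
    span = train_max - train_min
    if span == 0:
        return [0.0 for _ in scores]
    # Clip so that unseen scores outside the training range stay in [0, 1].
    return [min(1.0, max(0.0, (s - train_min) / span)) for s in scores]


raw_scores = [0.2, 0.5, 3.0]  # made-up raw outlier scores
labels = scores_to_labels(raw_scores, threshold=1.0)
proba = scores_to_proba(raw_scores, train_min=0.2, train_max=3.0)
print(labels, proba)
```

Only the last node exceeds the threshold here, and its scaled probability is the largest of the three.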

Key Attributes of a fitted model:

  • decision_scores_: The outlier scores of the training data. The higher, the more abnormal. Outliers tend to have higher scores.

  • labels_: The binary labels of the training data. 0 stands for inliers and 1 for outliers/anomalies.
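
The fitted attributes can be illustrated with a toy stand-in. The sketch below is not a PyGOD detector: ToyDegreeDetector is a hypothetical class that takes a plain edge list rather than a PyG data object and scores nodes by how far their degree lies from the mean. It only shows how decision_scores_ and labels_ might be populated during fit, assuming that a contamination rate decides how many training nodes are flagged.

```python
from collections import Counter
from statistics import mean, pstdev


class ToyDegreeDetector:
    """Hypothetical detector that mimics the attribute names listed above."""

    def __init__(self, contamination=0.1):
        self.contamination = contamination

    @staticmethod
    def _degrees(num_nodes, edges):
        counts = Counter()
        for u, v in edges:
            counts[u] += 1
            counts[v] += 1
        return [counts[i] for i in range(num_nodes)]

    def fit(self, num_nodes, edges):
        degrees = self._degrees(num_nodes, edges)
        self.mean_ = mean(degrees)
        self.std_ = pstdev(degrees) or 1.0  # avoid division by zero
        # Higher score means more abnormal.
        self.decision_scores_ = [abs(d - self.mean_) / self.std_ for d in degrees]
        # Flag the top `contamination` fraction of training nodes as outliers.
        n_outliers = max(1, round(self.contamination * num_nodes))
        self.threshold_ = sorted(self.decision_scores_, reverse=True)[n_outliers - 1]
        self.labels_ = [int(s >= self.threshold_) for s in self.decision_scores_]
        return self

    def decision_function(self, num_nodes, edges):
        degrees = self._degrees(num_nodes, edges)
        return [abs(d - self.mean_) / self.std_ for d in degrees]


# A star graph: node 0 touches every other node, so its degree stands out.
edges = [(0, i) for i in range(1, 10)]
detector = ToyDegreeDetector(contamination=0.1).fit(10, edges)
print(detector.labels_)
```

In this star graph the hub node receives the highest score and is the only node labeled 1.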

Implemented Algorithms

The PyGOD toolkit consists of the following functional groups:

(i) Node-level detection:

| Type | Abbr | Algorithm | Year | Ref |
|------|------|-----------|------|-----|
| GNN | Dominant | Deep anomaly detection on attributed networks | 2019 | [3] |
| GNN | AnomalyDAE | AnomalyDAE: Dual autoencoder for anomaly detection on attributed networks | 2020 | [5] |
| GNN | DONE | Outlier Resistant Unsupervised Deep Architectures for Attributed Network Embedding | 2020 | [2] |
| GNN | AdONE | Outlier Resistant Unsupervised Deep Architectures for Attributed Network Embedding | 2020 | [2] |
| GNN | CoLA | Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning | 2021 | [7] |
| GNN | GCNAE | Variational Graph Auto-Encoders | 2016 | [10] |
| GNN | MLPAE (change ref) | Higher-order Structure Based Anomaly Detection on Attributed Networks | 2021 | [10] |
| GNN | GUIDE | Higher-order Structure Based Anomaly Detection on Attributed Networks | 2021 | [10] |
| GNN | OCGNN | One-Class Graph Neural Networks for Anomaly Detection in Attributed Networks | 2021 | [9] |
| GNN | ONE | Outlier aware network embedding for attributed networks | 2019 | [1] |

(ii) Utility functions:

| Type | Name | Function | Documentation |
|------|------|----------|---------------|
| Metric | eval_roc_auc | ROC-AUC score for binary classification. | eval_roc_auc |
| Data | gen_structure_outliers | Generating structural outliers | gen_structure_outliers |
| Data | gen_attribute_outliers | Generating attribute outliers | gen_attribute_outliers |
| Data | gen_combined_outliers | Generating combined outliers | gen_combined_outliers |
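
For readers unfamiliar with synthetic structural outliers, the sketch below shows one common approach from the literature: pick random groups of nodes and wire each group into a clique. This is an illustration under that assumption only. The helper inject_cliques and its parameters are hypothetical, it works on a plain edge list, and it does not reflect the signature or exact behavior of gen_structure_outliers.

```python
import itertools
import random


def inject_cliques(num_nodes, edges, clique_size, num_cliques, seed=None):
    """Return (new_edges, labels) after wiring random node groups into cliques.

    Hypothetical helper for illustration; not PyGOD's gen_structure_outliers.
    """
    rng = random.Random(seed)
    # Sample without replacement so the groups do not share nodes.
    chosen = rng.sample(range(num_nodes), clique_size * num_cliques)
    edge_set = {tuple(sorted(e)) for e in edges}
    labels = [0] * num_nodes
    for c in range(num_cliques):
        members = chosen[c * clique_size:(c + 1) * clique_size]
        # Connect every pair inside the group so it becomes a clique.
        edge_set.update(tuple(sorted(p)) for p in itertools.combinations(members, 2))
        for node in members:
            labels[node] = 1
    return sorted(edge_set), labels


# Start from a 20-node ring and inject two cliques of four nodes each.
ring = [(i, (i + 1) % 20) for i in range(20)]
new_edges, labels = inject_cliques(20, ring, clique_size=4, num_cliques=2, seed=0)
print(sum(labels), "nodes marked as structural outliers")
```

The returned labels can then serve as ground truth when evaluating a detector on the modified graph.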


Quick Start for Outlier Detection with PyGOD

“examples/dominant_example.py” demonstrates the basic API of the DOMINANT detector. Note that the API is consistent across all other algorithms.

More detailed instructions for running examples can be found in the examples directory.

  1. Initialize a dominant detector, fit the model, and make the prediction.

  2. Evaluate the prediction by ROC and Precision @ Rank n (p@n).
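
The two metrics in step 2 can be sketched without any dependencies, as below. This is an illustrative implementation only: the function names roc_auc and precision_at_n are hypothetical, the labels and scores are made up, and in practice you would more likely use eval_roc_auc from the utility functions listed earlier or an equivalent library routine.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random outlier outscores a random inlier."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count pairwise wins, with ties counted as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at_n(labels, scores, n=None):
    """Fraction of true outliers among the n highest-scoring nodes.

    By default n is the number of true outliers.
    """
    if n is None:
        n = sum(labels)
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n]
    return sum(labels[i] for i in top) / n


labels = [0, 0, 1, 0, 1, 0]                  # made-up ground truth (1 = outlier)
scores = [0.1, 0.4, 0.9, 0.2, 0.3, 0.05]     # made-up raw outlier scores
print(roc_auc(labels, scores), precision_at_n(labels, scores))
```

Both metrics only need the ranking induced by the raw scores, so they can be computed directly from decision_scores_ or the output of decision_function.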


How to Contribute

You are welcome to contribute to this exciting project. See the contribution guide for more information.


PyGOD Team

PyGOD is a great team effort by researchers from UIC, IIT, BUAA, ASU, and CMU. Our core team members include:

Kay Liu (UIC), Yingtong Dou (UIC), Yue Zhao (CMU), Xueying Ding (CMU), Xiyang Hu (CMU), Ruitong Zhang (BUAA), Kaize Ding (ASU), Canyu Chen (IIT).

Reach out to us by submitting an issue report or emailing us at <tba>add an email<tba>


Reference
