Outlier detection algorithm for graph datasets

outgraph

outgraph is a simple outlier detection algorithm for graph datasets. Given a list of graphs, it uses the Mahalanobis distance to detect which graphs are outliers based on their topology, their node attributes, or both.

Note: outgraph only works for datasets where each graph has an equal number of nodes.

Installation

You can install outgraph with pip:

$ pip install outgraph

How it Works

Unlike most approaches to graph outlier detection, outgraph does not use machine learning. Instead, each graph is converted into a vector representation using one of three available methods:

  1. Averaging the node feature/attribute vectors
  2. Flattening the adjacency matrix
  3. A concatenation of 1 and 2
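The three vectorization methods above can be sketched with plain numpy. This is a minimal illustration, not outgraph's actual implementation: the helper name vectorize_graph is hypothetical, and it assumes node_attrs has one row per node and that the adjacency matrix is flattened row by row.

```python
import numpy as np


def vectorize_graph(node_attrs, adj_matrix, method=1):
    """Hypothetical helper (not part of outgraph) mirroring methods 1-3."""
    node_attrs = np.asarray(node_attrs, dtype=float)
    adj_matrix = np.asarray(adj_matrix, dtype=float)

    # Method 1: average the node attribute vectors (assumes one row per node).
    mean_attrs = node_attrs.mean(axis=0)
    # Method 2: flatten the adjacency matrix row by row.
    flat_adj = adj_matrix.ravel()

    if method == 1:
        return mean_attrs
    if method == 2:
        return flat_adj
    if method == 3:
        # Method 3: concatenate the results of methods 1 and 2.
        return np.concatenate([mean_attrs, flat_adj])
    raise ValueError("method must be 1, 2, or 3")


node_attrs = np.array([[-1], [0], [1]])
adj_matrix = np.array([[1, 1, 0],
                       [1, 1, 1],
                       [0, 1, 1]])

v1 = vectorize_graph(node_attrs, adj_matrix, method=1)  # length 1
v2 = vectorize_graph(node_attrs, adj_matrix, method=2)  # length 9
v3 = vectorize_graph(node_attrs, adj_matrix, method=3)  # length 10
```

Because methods 2 and 3 flatten the adjacency matrix, every graph must have the same number of nodes for the resulting vectors to have equal length.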

Then, the Mahalanobis distance between each vector and the distribution of vectors is calculated. Lastly, a Chi-Squared distribution is used to model the distribution of distances and identify the distances outside a cutoff threshold (e.g. p < 0.05).
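The distance and cutoff steps can be sketched as follows. This is an illustration under stated assumptions rather than outgraph's internals: the function name mahalanobis_outliers is hypothetical, scipy is assumed to be installed, the squared distance is compared against a chi-squared distribution whose degrees of freedom equal the vector length, and a pseudo-inverse is used so that a singular covariance matrix does not raise an error.

```python
import numpy as np
from scipy.stats import chi2


def mahalanobis_outliers(vectors, p_value=0.05):
    """Hypothetical helper (not part of outgraph).

    Returns the indices of rows flagged as outliers and the squared
    Mahalanobis distance of every row.
    """
    X = np.asarray(vectors, dtype=float)
    mean = X.mean(axis=0)
    # np.cov returns a 0-d array for 1-D vectors, so force a 2-D matrix.
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # Assumption: pseudo-inverse, to tolerate a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)

    diff = X - mean
    # Squared distance per row: diff_i^T * cov_inv * diff_i
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

    # Upper-tail probability of each squared distance under chi-squared.
    p = chi2.sf(d2, df=X.shape[1])
    return np.flatnonzero(p < p_value), d2


# Synthetic data: 200 three-dimensional vectors, with row 0 moved far away.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [10.0, 10.0, 10.0]

outlier_idx, d2 = mahalanobis_outliers(X, p_value=0.05)
```

With p_value=0.05, roughly 5% of well-behaved vectors are expected to be flagged as well, so a smaller threshold may be preferable when false positives are costly.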

This approach is based on this article.

Usage

Each graph in your dataset needs to be an instance of outgraph.Graph. This object takes two parameters, node_attrs and adjacency_matrix, both numpy arrays whose indices correspond to nodes. Example:

import numpy as np
from outgraph import Graph

node_attrs = np.array([[-1], [0], [1]])
adj_matrix = np.array([[1, 1, 0],
                       [1, 1, 1],
                       [0, 1, 1]])
graph = Graph(node_attrs, adj_matrix)


Once you have a list of Graph objects, pass them to outgraph.detect_outliers:

from outgraph import Graph, detect_outliers

graphs = [Graph(), ...]
outliers, indices = detect_outliers(graphs, method=1, p_value=0.05)

Notice the method and p_value parameters. The method parameter is an integer between 1 and 3 that corresponds to one of the three graph vectorization methods described in the How it Works section. p_value is the outlier cutoff threshold.


