Algorithm for finding anomalous groups in networks

Python package for the CItation-Donor-REcipient (CIDRE) algorithm.

CIDRE is an algorithm for finding anomalous groups in directed and weighted networks. An anomalous group consists of donor and recipient nodes connected by edges with excessive weights (i.e., excessive edges). A donor is a node that provides excessive edges to other member nodes. A recipient is a node that receives excessive edges from other member nodes.

If you use this package, please cite:

@misc{kojaku2021cartel,
      title={Detecting citation cartels in journal networks},
      author={Sadamori Kojaku and Giacomo Livan and Naoki Masuda},
      year={2021},
      eprint={2009.09097},
      archivePrefix={arXiv},
      primaryClass={physics.soc-ph}
}

Requirements

  • Python 3.7 or later

Install

pip install cidre

Examples

A minimal example

import cidre

alg = cidre.Cidre(group_membership)
groups = alg.detect(A, threshold = 0.15)
  • group_membership (optional): If the network has communities that are not themselves anomalous, pass them to CIDRE through this argument. group_membership should be a numpy.array or list whose element group_membership[i] is the group to which node i belongs. Otherwise, set group_membership=None.
  • A: Adjacency matrix of the input network (may be weighted and directed). Should be either an nx.Graph or a scipy.sparse.csr_matrix. In the scipy.sparse.csr_matrix format, A[i,j] is the weight of the edge from node i to node j.
  • threshold: Threshold for the donor and recipient nodes. A larger threshold will yield tighter and smaller groups.
  • groups: List of Group instances. See Group class section.
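The inputs described above can be prepared along the following lines. This is a minimal sketch, not part of the package: the edge list, node count, and community labels are made-up toy values, and the call to cidre is left as a comment that mirrors the minimal example.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy network: directed, weighted edges as (source, target, weight).
edges = [(0, 1, 5.0), (1, 0, 4.0), (2, 3, 1.0), (3, 0, 2.0)]
src, trg, weight = zip(*edges)
n_nodes = 4

# A[i, j] holds the weight of the edge from node i to node j.
A = sparse.csr_matrix((weight, (src, trg)), shape=(n_nodes, n_nodes))

# Optional community labels: group_membership[i] is the community of node i.
# Use group_membership = None if the network has no known communities.
group_membership = np.array([0, 0, 1, 1])

# These inputs would then be passed to CIDRE as in the minimal example:
# alg = cidre.Cidre(group_membership)
# groups = alg.detect(A, threshold=0.15)
```

Building the matrix from (weight, (row, column)) triplets keeps the edge direction explicit: the row index is the source node and the column index is the target node.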

Group class

Group is a dedicated class for groups of donor and recipient nodes.

The donor and recipient nodes of a group, denoted by group, can be obtained by

group.donors # {node_id: donor_score}
group.recipients # {node_id: recipient_score}
  • group.donors is a dict object taking keys and values corresponding to the node IDs and the donor scores, respectively.
  • group.recipients is the analogous dict object for the recipient nodes.
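Because both attributes are plain dicts, they can be processed with ordinary Python. The sketch below ranks nodes by score; the dicts are stand-ins with made-up scores, and top_nodes is a hypothetical helper, not a function of the cidre package.

```python
# Stand-ins for group.donors and group.recipients; IDs and scores are made up.
donors = {3: 0.42, 7: 0.91, 12: 0.15}
recipients = {5: 0.88, 9: 0.30}


def top_nodes(scores, k=2):
    """Return the k (node_id, score) pairs with the highest scores."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


top_donors = top_nodes(donors)
top_recipients = top_nodes(recipients, k=1)
```

With a real Group instance, group.donors and group.recipients would take the place of the stand-in dicts.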

The number of nodes and edges within the group can be obtained by

group.size() # Number of nodes
group.get_within_edges() # Number of edges within this group

Visualization

The cidre package provides an API to visualize small groups. Before using this API, set up your canvas:

import matplotlib.pyplot as plt

width, height = 7, 10
fig, ax = plt.subplots(figsize=(width, height))

Then pass ax, together with the group you want to visualize, to the DrawGroup class:

import cidre
dc = cidre.DrawGroup()
dc.draw(group, ax = ax)

This draws the group on ax. In the resulting plot:

  • The left and right nodes correspond to the donor and recipient nodes, respectively.
  • The color of each edge corresponds to the color of the source node (i.e., the node from which the edge emanates).
  • The width of each edge is proportional to the weight of the edge.
  • The text next to each node corresponds to the ID of the node, or equivalently the row id of the adjacency matrix A.

Instead of node IDs, you may want to display the nodes' labels. To do this, prepare a dict object taking IDs and labels as keys and values, respectively, e.g.,

node_labels = {0:"name 1", 1:"name 2"}

Then pass it to the draw function as the node_labels argument:

dc.draw(group, node_labels = node_labels, ax = ax)
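When the labels are already stored in a list ordered by node ID (that is, by row of the adjacency matrix A), the dict can be built in one step. A minimal sketch, assuming a hypothetical list of names:

```python
# Hypothetical labels, ordered so that names[i] belongs to node i (row i of A).
names = ["Journal A", "Journal B", "Journal C"]

# Map each node ID to its label: {0: "Journal A", 1: "Journal B", ...}
node_labels = dict(enumerate(names))
```

The resulting dict has the same shape as the hand-written example and can be passed as the node_labels argument in the same way.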
