Python bindings for the Surprise network community structure metric

pySurprise

Python package for computing the Surprise metric and running community detection algorithms on complex networks.

Surprise quantifies the quality of a partition of a complex network into communities using a cumulative hypergeometric distribution. Higher values indicate a more surprising (i.e. better) community structure.
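For reference, the definition from Aldecoa & Marín (2011) can be written with the four parameters used throughout this README: F possible node pairs, M possible intra-community pairs, n edges, and p intra-community edges. The base of the logarithm is an assumption here; the API reference lists log_hyper_probability as log10, which suggests base 10, but check the package if the exact scale matters.

```latex
S = -\log_{10} \sum_{j=p}^{\min(M,\,n)}
      \frac{\binom{M}{j}\,\binom{F-M}{\,n-j\,}}{\binom{F}{n}}
```

The sum is the probability of observing at least p intra-community edges when n edges are placed at random among the F possible pairs, so a small probability gives a large S.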

pySurprise wraps the original C++ Surprise computation via pybind11 for performance, and provides Python access to the 7 clustering algorithms from the SurpriseMe project.

Installation

pip install pysurprise

Some clustering algorithms depend on optional packages, which are installed as extras (quoted so that shells such as zsh do not expand the brackets):

# Install all algorithm dependencies
pip install "pysurprise[algorithms]"

# Or individually
pip install "pysurprise[infomap]"   # for Infomap
pip install "pysurprise[igraph]"    # for RN

From source

git clone https://github.com/raldecoa/SurpriseMe.git
cd SurpriseMe
pip install .

Quick start

import pysurprise

# Define edges (undirected, listed once)
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]

# Partition: node i belongs to community partition[i]
partition = [0, 0, 0, 0, 1, 1, 1, 1]

value = pysurprise.surprise(edges, partition)
print(f"Surprise = {value}")

Using string labels

edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]

partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

value = pysurprise.surprise(edges, partition)
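String labels and integer indices describe the same graph. The sketch below shows one way to convert between the two forms in plain Python; the helper name to_indexed is hypothetical and this is not necessarily how pysurprise handles labels internally.

```python
def to_indexed(edges, partition):
    """Map string-labelled edges and a dict partition to integer indices.

    Hypothetical helper for illustration; not part of pysurprise.
    """
    labels = sorted(partition)                       # deterministic node order
    index = {label: i for i, label in enumerate(labels)}
    int_edges = [(index[u], index[v]) for u, v in edges]
    int_partition = [partition[label] for label in labels]
    return int_edges, int_partition, labels


edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

int_edges, int_partition, labels = to_indexed(edges, partition)
print(int_edges)       # [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
print(int_partition)   # [0, 0, 0, 1, 1, 1]
```

The returned labels list maps each index back to its original node name.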

Low-level access

If you already have the four Surprise parameters (F, M, n, p):

from pysurprise import compute_surprise

F = 28.0   # total possible pairs: N*(N-1)/2
M = 12.0   # intra-community possible pairs
n = 13.0   # total edges
p = 12.0   # intra-community edges

value = compute_surprise(F, M, n, p)
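The four values above correspond to the Quick start graph. As a minimal sketch of where they come from, the following plain-Python helper (surprise_parameters is a hypothetical name, not part of pysurprise) derives them from an integer-indexed edge list and a partition list, assuming every undirected edge is listed once and there are no self-loops.

```python
from collections import Counter


def surprise_parameters(edges, partition):
    """Return (F, M, n, p) for an undirected edge list and a partition list.

    Hypothetical helper for illustration; assumes each edge appears once
    and that partition[i] is the community of node i.
    """
    num_nodes = len(partition)
    sizes = Counter(partition)                        # community -> node count
    F = num_nodes * (num_nodes - 1) // 2              # all possible node pairs
    M = sum(s * (s - 1) // 2 for s in sizes.values()) # possible intra-community pairs
    n = len(edges)                                    # edges in the network
    p = sum(1 for u, v in edges if partition[u] == partition[v])  # intra-community edges
    return F, M, n, p


edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
partition = [0, 0, 0, 0, 1, 1, 1, 1]

print(surprise_parameters(edges, partition))  # (28, 12, 13, 12)
```

Only the edge (3, 4) crosses the two communities, which is why p is one less than n.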

Clustering algorithms

pysurprise.algorithms exposes the 7 community detection algorithms from SurpriseMe, ready to use from Python:

from pysurprise import algorithms
import pysurprise

# Run a single algorithm (edges: the string-labelled list defined above)
partition = algorithms.cpm(edges)
score = pysurprise.surprise(edges, partition)

# Run all 7 algorithms and compare
results = algorithms.run_all(edges)
for name, part in results.items():
    s = pysurprise.surprise(edges, part)
    print(f"{name}: {s:.4f}")

Algorithm   Function                 Reference
CPM         algorithms.cpm()         Traag et al., Phys. Rev. E 84 (2011)
Infomap     algorithms.infomap()     Rosvall & Bergstrom, PNAS 105 (2008)
RB          algorithms.rb()          Reichardt & Bornholdt, Phys. Rev. E 74 (2006)
RN          algorithms.rn()          Ronhovde & Nussinov, Phys. Rev. E 81 (2010)
RNSC        algorithms.rnsc()        King et al., Bioinformatics 20 (2004)
SCluster    algorithms.scluster()    Aldecoa & Marín, PLoS ONE 5 (2010)
UVCluster   algorithms.uvcluster()   Arnau et al., Bioinformatics 21 (2005)

Each algorithm function accepts edges as a list[tuple[str, str]] and returns a dict[str, int] that maps every node to its community.
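A node → community dict is convenient for scoring but harder to read. The sketch below regroups such a dict into community → members; the helper name communities is hypothetical and the input is a hand-written partition rather than real algorithm output.

```python
from collections import defaultdict


def communities(partition):
    """Group a node -> community mapping into community -> sorted member list.

    Hypothetical helper for illustration; not part of pysurprise.
    """
    groups = defaultdict(list)
    for node, community in partition.items():
        groups[community].append(node)
    return {c: sorted(members) for c, members in sorted(groups.items())}


example = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
print(communities(example))  # {0: ['a', 'b', 'c'], 1: ['d', 'e', 'f']}
```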

API Reference

Function                                         Description
surprise(edges, partition)                       High-level: compute Surprise from edges and partition (list or dict)
compute_surprise(F, M, n, p)                     Low-level: compute Surprise from the four parameters directly
surprise_from_partition(edges, partition)        C++ binding: int-indexed edges and partition list
surprise_from_partition_dict(edges, partition)   C++ binding: string-labelled edges and partition dict
log_hyper_probability(F, M, n, j)                Single hypergeometric probability term (log10)
algorithms.run_all(edges)                        Run all 7 clustering algorithms and return results
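When comparing several partitions, it is often enough to keep the highest-scoring one. The following sketch selects it from a name → partition dict shaped like the results used in the comparison loop above. The helper best_partition and the stand-in scorer are hypothetical; in real use the scorer would be something like lambda part: pysurprise.surprise(edges, part).

```python
def best_partition(results, score_fn):
    """Return (name, partition, scores) for the highest-scoring partition.

    Hypothetical helper for illustration; `results` maps a name to a
    node -> community dict and `score_fn` maps a partition to a number.
    """
    scores = {name: score_fn(part) for name, part in results.items()}
    best = max(scores, key=scores.get)
    return best, results[best], scores


# Hand-written partitions standing in for real algorithm output.
results = {
    "single_block": {"a": 0, "b": 0, "c": 0, "d": 0},
    "two_blocks": {"a": 0, "b": 0, "c": 1, "d": 1},
}


def count_communities(part):
    """Stand-in scorer for the demo only (number of distinct communities)."""
    return len(set(part.values()))


name, part, scores = best_partition(results, count_communities)
print(name, scores)
```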

Parameters

  • F: Total number of possible node pairs = N×(N−1)/2, where N is the number of nodes
  • M: Number of possible intra-community pairs = Σ s_i×(s_i−1)/2, where s_i is the size of community i
  • n: Total number of edges in the network
  • p: Number of intra-community edges
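Given these four parameters, the cumulative hypergeometric tail can be evaluated directly with exact integer arithmetic. The function below is a plain-Python cross-check, not the package's C++ implementation; the name surprise_reference is hypothetical, and the base-10 logarithm is an assumption based on the log10 note for log_hyper_probability.

```python
from math import comb, log10


def surprise_reference(F, M, n, p):
    """Return -log10 of P(at least p intra-community edges) under the
    hypergeometric model.

    Hypothetical reference sketch; not the pysurprise implementation.
    """
    F, M, n, p = (int(x) for x in (F, M, n, p))
    upper = min(M, n)
    if not (0 <= M <= F and 0 <= n <= F and 0 <= p <= upper):
        raise ValueError("expected 0 <= M, n <= F and 0 <= p <= min(M, n)")
    # Number of ways to place n edges with at least p of them inside communities.
    tail = sum(comb(M, j) * comb(F - M, n - j) for j in range(p, upper + 1))
    # Dividing by all C(F, n) placements gives the probability; take -log10.
    return log10(comb(F, n)) - log10(tail)


value = surprise_reference(28.0, 12.0, 13.0, 12.0)
print(f"Surprise = {value:.4f}")
```

Exact binomials are fine for small graphs; for large networks, a log-space formulation avoids very large intermediate integers.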

Credits

Developed by José Marín.

Based on the original Surprise metric and SurpriseMe software by:

Aldecoa R, Marín I (2011). Deciphering network community structure by Surprise. PLoS ONE 6(9): e24195.

Aldecoa R, Marín I (2013). Surprise maximization reveals the community structure of complex networks. Scientific Reports 3, 1060.

The C++ Surprise computation and bundled clustering algorithms originate from the SurpriseMe project by Rodrigo Aldecoa and Ignacio Marín.

License

GPL-3.0-or-later. See LICENSE.

Download files

Source Distribution

pysurprise-1.0.0.tar.gz (47.1 kB)

Uploaded Source

File details

Details for the file pysurprise-1.0.0.tar.gz.

File metadata

  • Download URL: pysurprise-1.0.0.tar.gz
  • Upload date:
  • Size: 47.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for pysurprise-1.0.0.tar.gz
Algorithm     Hash digest
SHA256        65d438f4b1a65811ed5ed84533f55a2c7301f51afac0c03e2647e11da2a42bb4
MD5           281acdc9df8e5ac34c7939ea463f6594
BLAKE2b-256   1bebc90fd8778769692ac506fa17272a0cc8d6cd8457eb24589bcba70d269090