Python bindings for the Surprise network community structure metric

pySurprise

Python package for computing the Surprise metric and running community detection algorithms on complex networks.

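Surprise quantifies the quality of a partition of a complex network into communities. It is based on a cumulative hypergeometric distribution: it measures how unlikely it is that a random network with the same numbers of nodes and edges would contain at least as many intra-community edges as the partition does. Higher values indicate a more surprising (i.e. better) community structure.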

pySurprise wraps the original C++ Surprise computation with pybind11 for performance, and provides Python access to the 7 clustering algorithms from the SurpriseMe project.

Installation

pip install pysurprise

To use the clustering algorithms that depend on optional packages:

# Install all algorithm dependencies
pip install pysurprise[algorithms]

# Or individually
pip install pysurprise[infomap]   # for Infomap
pip install pysurprise[igraph]    # for RN

From source

git clone https://github.com/raldecoa/SurpriseMe.git
cd SurpriseMe
pip install .

Quick start

import pysurprise

# Define edges (undirected, listed once)
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]

# Partition: node i belongs to community partition[i]
partition = [0, 0, 0, 0, 1, 1, 1, 1]

value = pysurprise.surprise(edges, partition)
print(f"Surprise = {value}")

Using string labels

edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]

partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

value = pysurprise.surprise(edges, partition)
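
The list form from the Quick start and the dict form above describe the same kind of partition. As a rough illustration of how the two relate, the sketch below converts string-labelled input into integer-indexed input. The helper to_int_indexed is hypothetical and not part of pysurprise; it only assumes the convention stated in the Quick start, namely that node i belongs to community partition[i].

```python
def to_int_indexed(edges, partition):
    """Map string-labelled edges and a dict partition to integer form.

    Hypothetical helper for illustration; not part of pysurprise.
    """
    # Assign indices in sorted label order so the mapping is deterministic.
    labels = sorted(partition)
    index = {label: i for i, label in enumerate(labels)}

    int_edges = [(index[u], index[v]) for u, v in edges]
    int_partition = [partition[label] for label in labels]
    return int_edges, int_partition, index


edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

int_edges, int_partition, index = to_int_indexed(edges, partition)
print(int_edges)      # edges as pairs of integer node indices
print(int_partition)  # community of node i at position i
```

The returned index dict lets you map results back to the original labels.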

Low-level access

If you already have the four Surprise parameters (F, M, n, p):

from pysurprise import compute_surprise

F = 28.0   # total possible pairs: N*(N-1)/2
M = 12.0   # intra-community possible pairs
n = 13.0   # total edges
p = 12.0   # intra-community edges

value = compute_surprise(F, M, n, p)
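
The values above correspond to the Quick start network. A minimal sketch of how the four parameters can be derived from an edge list and a partition is shown below. The helper surprise_parameters is hypothetical and not part of pysurprise; it assumes an undirected network with each edge listed once and no self-loops.

```python
from collections import Counter


def surprise_parameters(edges, partition):
    """Return (F, M, n, p) for an edge list and a node -> community mapping.

    Hypothetical helper for illustration; not part of pysurprise.
    `partition` may be a list (index = node) or a dict.
    """
    if not isinstance(partition, dict):
        partition = dict(enumerate(partition))

    num_nodes = len(partition)
    sizes = Counter(partition.values())  # community label -> number of nodes

    F = num_nodes * (num_nodes - 1) // 2               # all possible node pairs
    M = sum(s * (s - 1) // 2 for s in sizes.values())  # possible pairs inside communities
    n = len(edges)                                     # edges in the network
    p = sum(partition[u] == partition[v] for u, v in edges)  # intra-community edges
    return F, M, n, p


edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
partition = [0, 0, 0, 0, 1, 1, 1, 1]

print(surprise_parameters(edges, partition))  # (28, 12, 13, 12)
```

The only inter-community edge is (3, 4), which is why p is one less than n.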

Clustering algorithms

pysurprise.algorithms exposes the 7 community detection algorithms from SurpriseMe, ready to use from Python:

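from pysurprise import algorithms
import pysurprise

# Example network with string labels (undirected, each edge listed once)
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]

# Run a single algorithm
partition = algorithms.cpm(edges)
score = pysurprise.surprise(edges, partition)

# Run all 7 algorithms and compare
results = algorithms.run_all(edges)
for name, part in results.items():
    s = pysurprise.surprise(edges, part)
    print(f"{name}: {s:.4f}")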

Algorithm   Function                 Reference
CPM         algorithms.cpm()         Traag et al., Phys. Rev. E 84 (2011)
Infomap     algorithms.infomap()     Rosvall & Bergstrom, PNAS 105 (2008)
RB          algorithms.rb()          Reichardt & Bornholdt, Phys. Rev. E 74 (2006)
RN          algorithms.rn()          Ronhovde & Nussinov, Phys. Rev. E 81 (2010)
RNSC        algorithms.rnsc()        King et al., Bioinformatics 20 (2004)
SCluster    algorithms.scluster()    Aldecoa & Marín, PLoS ONE 5 (2010)
UVCluster   algorithms.uvcluster()   Arnau et al., Bioinformatics 21 (2005)

All functions accept edges as list[tuple[str, str]] and return dict[str, int] (node → community).
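
A node → community dict is convenient for scoring but less so for inspecting the communities themselves. The sketch below groups nodes by community label. The helper communities is hypothetical and not part of pysurprise, and the partition shown is hand-written rather than the output of any algorithm.

```python
from collections import defaultdict


def communities(partition):
    """Group nodes by community label.

    Hypothetical helper for illustration; not part of pysurprise.
    Returns {community: sorted list of nodes}, ordered by community label.
    """
    groups = defaultdict(list)
    for node, community in partition.items():
        groups[community].append(node)
    return {c: sorted(nodes) for c, nodes in sorted(groups.items())}


# Hand-written example in the dict[str, int] shape described above.
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
print(communities(partition))  # {0: ['a', 'b', 'c'], 1: ['d', 'e', 'f']}
```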

API Reference

Function                                         Description
surprise(edges, partition)                       High-level: compute Surprise from edges and partition (list or dict)
compute_surprise(F, M, n, p)                     Low-level: compute Surprise from the four parameters directly
surprise_from_partition(edges, partition)        C++ binding: int-indexed edges and partition list
surprise_from_partition_dict(edges, partition)   C++ binding: string-labelled edges and partition dict
log_hyper_probability(F, M, n, j)                Single hypergeometric probability term (log10)
algorithms.run_all(edges)                        Run all 7 clustering algorithms and return results

Parameters

  • F: Total number of possible node pairs = N×(N−1)/2
  • M: Sum of intra-community possible pairs = Σ s_i×(s_i−1)/2
  • n: Total number of edges in the network
  • p: Number of intra-community edges
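
With these four parameters, Surprise as defined by Aldecoa & Marín is the negative logarithm of the hypergeometric upper tail: the probability of drawing at least p intra-community edges when n edges are placed at random among F possible pairs, M of which lie inside communities. The sketch below evaluates that sum with exact integer arithmetic. It is not the pysurprise implementation: the function name surprise_sketch is hypothetical, the base-10 logarithm is an assumption taken from the log10 note on log_hyper_probability, and integer arguments are used where the README example passes floats.

```python
from math import comb, log10


def surprise_sketch(F, M, n, p):
    """Illustrative Surprise: -log10 of the hypergeometric upper tail.

    S = -log10( sum_{j=p}^{min(M, n)} C(M, j) * C(F - M, n - j) / C(F, n) )

    Hypothetical helper; assumes integer inputs with p <= min(M, n).
    """
    total = comb(F, n)  # ways to place n edges among F node pairs
    tail = sum(comb(M, j) * comb(F - M, n - j)
               for j in range(p, min(M, n) + 1))
    # Subtract logs of the exact integers so a tiny ratio cannot underflow.
    return log10(total) - log10(tail)


# Parameters of the Quick start network.
value = surprise_sketch(28, 12, 13, 12)
print(f"Surprise (sketch) = {value:.4f}")
```

Exact binomials are fine for small networks; for large ones, summing log-probability terms is the usual approach, which is presumably what log_hyper_probability is for.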

Credits

Developed by José Marín.

Based on the original Surprise metric and SurpriseMe software by:

Aldecoa R, Marín I (2011). Deciphering network community structure by Surprise. PLoS ONE 6(9): e24195.

Aldecoa R, Marín I (2013). Surprise maximization reveals the community structure of complex networks. Scientific Reports 3, 1060.

The C++ Surprise computation and bundled clustering algorithms originate from the SurpriseMe project by Rodrigo Aldecoa and Ignacio Marín.

License

GPL-3.0-or-later. See LICENSE.
