Skip to main content

Python library to calculate Robustness modularity of networks

Project description

Robustness Modularity

Small utility to calculate the robustness modularity, information modularity and modularity difference.

License

The software is distributed under the MIT license.

Authors

Dependencies

Robustness Modularity requires the following dependencies:

Except for graph-tool all the other packages can be installed using pip:

pip install -r requirements.txt

To install graph-tool follow the instructions in the graph-tool documentation.

We recommend to install it using conda:

conda install -c conda-forge graph-tool

Installation

After installing graph-tool dependency, the tool can be installed using pip:

pip install RModularity

or from source:

pip git+https://github.com/filipinascimento/RModularity.git

Usage

We provide three networks for testing in the SampleNetworks folder. A full usage example can be found in example.py.

First import the RModularity module:

import RModularity

For this example we will be using igraph to load the sample networks and pathlib to deal with the paths:

import igraph as ig
from pathlib import Path

When multiprocessing is enabled, the all the calculations need to be done in the main process, thus use if __name__ == '__main__':. Let's load a network from the SampleNetworks folder and define some paths:

if __name__ == '__main__':
    networkName = "road-euroroad"
    # networkName = "LFR_mu0.1"
    # networkName = "LFR_mu1.0"
    
    outputSuffix = ""
    figurePath = Path("Figures")
    figurePath.mkdir(parents=True, exist_ok=True)

    g = ig.Graph.Read_GML(str(Path("SampleNetworks")/("%s.gml" % networkName)))

You can calculate the approximated robustness modularity using the RModularityFast function, which implements the fast Monte-Carlo algorithm.

    Q_rA = RModularity.RModularityFast(
        g.vcount(), # Number of nodes
        g.get_edgelist(), # Edges list
        g.is_directed(), # Directed or not
        )
    print("Q_rA = ", Q_rA)

You can use the RModularity function to calculate the robustness modularity without approximations:

    Q_r, probabilities, TPRCurve, \
     DLCurvesTrivial, DLCurvesDetected = RModularity.RModularity(
         g.vcount(), # Number of nodes
         g.get_edgelist(), # Edges list
         g.is_directed(), # Directed or not
         outputCurves=True,
         )

    print("Q_r = ", Q_r)

By setting outputCurves to True, the Trivial Partition Ratio (TPR) and the description lengths of the detected and trivial partitions will be returned.

Modularity difference (Q_diff) can be calculated using the modularityDifference function:

    Q_diff = RModularity.modularityDifference(
        g.vcount(),
        g.get_edgelist(),
        g.is_directed()
    )

Information modularity can be calculated using the informationModularity function:

    Q_DL = RModularity.informationModularity(
        g.vcount(),
        g.get_edgelist(),
        g.is_directed()
    )
    print("Q_DL = ", Q_DL)

Here we also illustrate how to generate the TPR and Description lengths plots. First let's import a few extra packages

    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.patches as mpl_patches

Let's calculate average and std for the curves:

    avgDLCurvesTrivial = np.mean(DLCurvesTrivial, axis=1)
    avgDLCurvesDetected = np.mean(DLCurvesDetected, axis=1)
    stdDLCurvesTrivial = np.std(DLCurvesTrivial, axis=1)
    stdDLCurvesDetected = np.std(DLCurvesDetected, axis=1)

    diffDLCurves = (DLCurvesTrivial-DLCurvesDetected) / DLCurvesTrivial
    avgDiffDLCurves = np.mean(diffDLCurves, axis=1)
    stdDiffDLCurves = np.std(diffDLCurves, axis=1)

Now let's plot the TPR curve:

    fig = plt.figure(figsize=(3*1.61803398875, 3))
    ax = plt.axes((0.2, 0.2, 0.70, 0.70), facecolor='w')
    nodeCount = g.vcount()
    averageDegree = np.mean(g.degree())
    TPRArea = Q_r
    trivialRatios = TPRCurve
    ax.plot(probabilities, trivialRatios, color="#262626", lw=2.0)
    ax.fill_between(probabilities, trivialRatios, 1, color="#E8EAEA")
    ax.set_xlabel("$p$")
    ax.set_ylabel("TPR")
    ax.set_title(networkName)
    ax.set_xlim(-0.00, 1.02)
    ax.set_ylim(-0.020, 1.020)
    handles = [mpl_patches.Rectangle((0, 0), 1, 1, fc="white", ec="white",
                                     lw=0, alpha=0)] * 3
    labels = []
    labels.append("$N$ = %d" % nodeCount)
    labels.append("$\\langle k\\rangle$ = %.2f" % averageDegree)
    labels.append("$Q_{r}$ = %.2f" % TPRArea)

    ax.legend(handles, labels, loc='best',
              fancybox=False, framealpha=0,
              handlelength=0, handletextpad=0)
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
    for axis in ['bottom', 'left']:
        ax.spines[axis].set_linewidth(1.5)
    ax.tick_params(width=1.5)
    fig.savefig(figurePath/("TPR_%s%s.pdf" % (networkName, outputSuffix)))
    plt.close(fig)

Now let's plot the description length curve:

    fig = plt.figure(figsize=(3*1.61803398875, 3))
    ax = plt.axes((0.2, 0.2, 0.70, 0.70), facecolor='w')
    ax.plot(probabilities, avgDLCurvesDetected, lw=2.0, label="Detected")
    ax.fill_between(probabilities, avgDLCurvesDetected-stdDLCurvesDetected,
                    avgDLCurvesDetected+stdDLCurvesDetected, alpha=0.2)

    ax.plot(probabilities, avgDLCurvesTrivial, lw=2.0, label="Trivial")
    ax.fill_between(probabilities, avgDLCurvesTrivial-stdDLCurvesTrivial,
                    avgDLCurvesTrivial+stdDLCurvesTrivial, alpha=0.2)

    ax.set_xlabel("$p$")
    ax.set_ylabel("DL")
    ax.set_title(networkName)
    ax.set_xlim(-0.00, 1.02)

    ax.legend()
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
    for axis in ['bottom', 'left']:
        ax.spines[axis].set_linewidth(1.5)
    ax.tick_params(width=1.5)
    fig.savefig(figurePath/("DL_%s%s.pdf" % (networkName, outputSuffix)))
    plt.close(fig)

And finally, let's plot the information modularity along p:

    fig = plt.figure(figsize=(3*1.61803398875, 3))
    ax = plt.axes((0.2, 0.2, 0.70, 0.70), facecolor='w')
    ax.plot(probabilities, avgDiffDLCurves, lw=2.0, label="Detected")
    ax.fill_between(probabilities, avgDiffDLCurves-stdDiffDLCurves,
                    avgDiffDLCurves+stdDiffDLCurves, alpha=0.2)

    ax.set_xlabel("$p$")
    ax.set_ylabel("$Q_\mathrm{DL}$")
    ax.set_title(networkName)
    ax.set_xlim(-0.00, 1.02)

    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
    for axis in ['bottom', 'left']:
        ax.spines[axis].set_linewidth(1.5)
    ax.tick_params(width=1.5)
    fig.savefig(figurePath/("DLDiff_%s%s.pdf" % (networkName, outputSuffix)))
    plt.close(fig)

Please refer to the next section for more details on how to use this library.

Full API documentation

function RModularityFast

RModularity(
    nodeCount,
    edges,
    directed=False,
    perturbationCount=48,
    detectionTrials=1,
    showProgress=True,
    useMultiprocessing=True,
    useCoarseStep = True,
    fineError=0.01,
    coarseError = 0.02,
    minSimilarTrials=2,
)

Computes the approximated Robustness Modularity of a network using a Monte-Carlo approach. Note that this approach can not procude the curves of TPR.

Parameters

  • nodeCount : int
    The number of nodes in the network.
  • edges : list of tuples
    A list of the edges in the network.
  • directed : int, optional
    Whether the network is directed or not.
  • perturbationCount : int, optional
    The number of perturbations to perform. (defaults to 25)
  • detectionTrials : int, optional
    The number of times to perform community detection for each perturbed network. (defaults to 1)
  • rewireResolution : int, optional
    The number values points for the rewire probabilities (from 0 to 1) to calculate the Trivial Partition Ratio (TPR) curves and Robustness Modularity. (defaults to 51)
  • showProgress : bool, optional
    Shows a progress bar if enabled. (defaults to True)
  • useMultiprocessing: bool, optional
    Uses parallel processing to calculate Rmodularity. (defaults to True)
  • useCoarseStep: bool, optional Finds the plateal region using a binary search before applying the Monte-Carlo approach. (defaults to True)
  • fineError: float, optional Error tolerance for the fine step. (defaults to 0.01)
  • coarseError: float, optional Error tolerance for the coarse step. (defaults to 0.02)
  • minSimilarTrials: int, optional The minimum number of similar trials to perform before stopping the Monte-Carlo approach.(defaults to 2)

Returns

  • float if outputCurves is False
    The Robustness Modularity of the network.

function RModularity

RModularity(
    nodeCount,
    edges,
    directed=False,
    perturbationCount=25,
    detectionTrials=1,
    rewireResolution=51,
    outputCurves=False,
    showProgress=True,
    useMultiprocessing=True
)

Computes the Robustness Modularity of a network.

Parameters

  • nodeCount : int
    The number of nodes in the network.
  • edges : list of tuples
    A list of the edges in the network.
  • directed : int, optional
    Whether the network is directed or not.
  • perturbationCount : int, optional
    The number of perturbations to perform. (defaults to 25)
  • detectionTrials : int, optional
    The number of times to perform community detection for each perturbed network. (defaults to 1)
  • rewireResolution : int, optional
    The number values points for the rewire probabilities (from 0 to 1) to calculate the Trivial Partition Ratio (TPR) curves and Robustness Modularity.(defaults to 51)
  • outputCurves : bool, optional
    Whether to save the TPR and DL curves. (defaults to False)
  • showProgress : bool, optional
    Shows a progress bar if enabled. (defaults to True)
  • useMultiprocessing: bool, optional
    Uses parallel processing to calculate Rmodularity. (defaults to True)

Returns

  • float if outputCurves is False
    The Robustness Modularity of the network.
  • (float, np.array dim=1, np.array dim=1, np.array dim=2, np.array dim=2) if outputCurves is True
    Returns a tuple of 4 values containing the Robustness Modularity, the rewire probabilities, the TPR curves, and the Description lenghts for the detected and trivial partitions.

function modularityDifference

modularityDifference(
    nodeCount,
    edges,
    directed=False,
    detectionTrials=100,
    nullmodelCount=100,
    detectionTrialsNullModel=10,
    showProgress=True,
    useMultiprocessing=True
)

Computes the Modularity Difference of a network.

Parameters

  • nodeCount : int
    The number of nodes in the network.
  • edges : list of tuples
    A list of the edges in the network.
  • directed : int, optional
    Whether the network is directed or not.
  • detectionTrials : int, optional
    The number of times to perform community detection for the input network (defaults to 100)
  • nullmodelCount : int, optional
    The number of realizations of the null-model (configuration model) used to calculate the null-model modularity. (defaults to 100)
  • detectionTrialsNullModel : int, optional
    The number of times to perform community detection for each perturbed network (defaults to 10)
  • showProgress : bool, optional
    Shows a progress bar if enabled. (defaults to True)
  • useMultiprocessing: bool, optional
    Uses parallel processing to calculate Rmodularity. (defaults to True)

Returns

  • float
    The Modularity Difference of the network.

function informationModularity

informationModularity(nodeCount, edges, directed=False)

Computes the Information Modularity of a network.

Parameters

  • nodeCount : int
    The number of nodes in the network.
  • edges : list of tuples
    A list of the edges in the network.
  • directed : int, optional
    Whether the network is directed or not.

Returns

  • float
    The Information Modularity of the network.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

RModularity-0.3.0.tar.gz (11.9 kB view details)

Uploaded Source

File details

Details for the file RModularity-0.3.0.tar.gz.

File metadata

  • Download URL: RModularity-0.3.0.tar.gz
  • Upload date:
  • Size: 11.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.41.0 CPython/3.7.5

File hashes

Hashes for RModularity-0.3.0.tar.gz
Algorithm Hash digest
SHA256 0e9630c754dd0cf873004d72567253c3c00b68939e47efa7f71e7e018d2e9089
MD5 954fad15e34179f33c69d2b2fa927728
BLAKE2b-256 8252aa36b1c11e005441ad48184bfce920724165e0434470da42e7d101436e1a

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page