Skip to main content

Renewal Non-Backtracking Random Walk (RNBRW) for community detection

Project description

RNBRW

PyPI version

RNBRW (Renewal Non-Backtracking Random Walks) is a Python package for estimating edge-level importance in networks using random walks that restart upon cycle closure. These weights can be used to improve community detection algorithms like Louvain.

Based on:

Moradi, B., Shakeri, H., Poggi-Corradini, P., & Higgins, M.
A new method for incorporating network cyclic structures to improve community detection
arXiv:1805.07484


Installation

pip install rnbrw

Features

Parallelization & HPC Useage

rnbrw supports parallel execution of the RNBRW simulations using joblib. This allows for efficient simulation on multi-core machines or High-Performance Computing (HPC) clusters.

You can control parallel execution using the n_jobs parameter:

  • Local machine: Set n_jobs=-1 to use all available CPU cores, or specify the exact number of cores to use (e.g., n_jobs=4).

  • High-Performance Computing (HPC): When running on an HPC cluster, n_jobs can be tuned according to the allocated CPUs in your job script. For best performance, align n_jobs with the number of cores requested via sbatch, qsub, or your cluster’s job scheduler.

HPC Usage

RNBRW supports both single-walk and batched multi-walk execution on HPC clusters.

  • Use compute_weights(..., only_walk=True) for one walk per job (simple job arrays).
  • Use walk_hole_E for batched jobs (e.g. 300 walks per job), which guarantees independence without mutating the graph.

Step 1: Run RNBRW Walks in Parallel Jobs

Option A – One Walk Per Job (simple arrays)

SLURM Example Here’s a basic SLURM job array script:rnbrw_job.sh

#!/bin/bash
#SBATCH --job-name=rnbrw_walk
#SBATCH --output=logs/rnbrw_%A_%a.out
#SBATCH --error=logs/rnbrw_%A_%a.err
#SBATCH --array=0-19              # 20 total jobs
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G
#SBATCH --time=00:10:00

module load python/3.10
source activate rnbrw-env

# Run the Python script with job array index
python run_rnbrw_batch.py $SLURM_ARRAY_TASK_ID
import sys
import numpy as np
import networkx as nx
from rnbrw.weights import compute_weights

job_id = int(sys.argv[1])
seed = 1000 + job_id

# Load the graph (shared across all jobs)
G = nx.read_gpickle("mygraph.gpickle")

# Single walk using only_walk mode
G = compute_weights(G, nsim=1, seed=seed, only_walk=True)

# Extract edge counts
m = G.number_of_edges()
T = np.zeros(m)
for u, v in G.edges():
    T[G[u][v]['enum']] = G[u][v]['ret']

np.save(f"T_partial_{job_id}.npy", T)
 

Option A – Batched Walks Per Job (e.g. 300 walks/job)

import sys, pickle import numpy as np from rnbrw.weights.rnbrw import walk_hole_E # low-level primitive

job_id = int(sys.argv[1]) walks_per_job = 300 seeds = [1000 + job_id * walks_per_job + i for i in range(walks_per_job)]

Load the graph once

with open("mygraph.gpickle", "rb") as f: G = pickle.load(f)

m = G.number_of_edges() T = np.zeros(m)

for s in seeds: T += walk_hole_E(G, seed=s) # independent walks, no mutation

np.save(f"T_partial_{job_id}.npy", T)

Step 2 – Aggregate outputs (on head node):

import numpy as np

T_total = sum(np.load(f"T_partial_{i}.npy") for i in range(num_jobs))

Step 3 – Assign Weights to Graph

import networkx as nx
import numpy as np
from rnbrw.utils import assign_rnbrw_weights

G = nx.read_gpickle("mygraph.gpickle")
T_total = np.load("T_total.npy")

# Assign raw + normalized weights to the graph
G = assign_rnbrw_weights(G, T_total)

Step 4: Run Louvain

from rnbrw.community import detect_communities_louvain

partition = detect_communities_louvain(G, weight_attr='ret_n')

This makes rnbrw especially suitable for research environments where cycles and edge roles must be computed across very large networks.

Local Usage

Use compute_weights directly with multi-threading:

import networkx as nx
from rnbrw.weights import compute_weights
from rnbrw.community import detect_communities_louvain

# Create or load a graph
G = nx.karate_club_graph()

# Compute RNBRW weights
# Recommendation: nsim should be at least the number of edges in G
G = compute_weights(G, nsim=G.number_of_edges(), n_jobs=4)

# Edge weights (normalized)
weights = [G[u][v]['ret_n'] for u, v in G.edges()]

# Detect communities
from rnbrw.community import detect_communities_louvain

partition = detect_communities_louvain(G, weight_attr='ret_n')

API Reference

G_weighted = compute_weights(
    G,               # networkx.Graph
    nsim=None,       # Optional[int], defaults to factor * m (num edges)
    factor=1.0,      # float, multiplies number of edges to compute default nsim
    seed=None,       # Optional[int], random seed for reproducibility
    n_jobs=1,        # int, number of parallel jobs (-1 = all CPUs)
    init_weight=0.001,# float, initial placeholder edge weights
    only_walk=False  # bool, run single walk (no aggregation) for HPC
)

Simulates RNBRW on graph G to assign edge importance scores as weights.

Parameters for compute_weights

Parameter Type Default Description
G networkx.Graph required Input undirected graph.
nsim int or None None Number of RNBRW simulations. If None, it defaults to factor × m, where m is the number of edges.
factor float 1.0 Scaling factor to set nsim dynamically based on graph size (m).
n_jobs int 1 Number of parallel jobs (-1 uses all available CPUs).
seed int or None None Random seed for reproducibility. Each simulation is seeded with seed + i.
init_weight float 0.01 Initial placeholder weight for each edge before running RNBRW.
only_walk bool False If True, performs a single walk without aggregating weights (for HPC use).

Notes

  • Recommended: For stable and convergent RNBRW edge weights, set nsim approximately equal to m, the number of edges in the graph (e.g., nsim ≈ m).
  • If only_walk=True, the function will return the walk output without updating graph weights — useful for splitting across HPC batch jobs manually.
detect_communities_louvain(G, weight_attr='ret_n')

Runs Louvain on G using edge weights.

Parameter Type Description
G networkx.Graph Weighted input graph
weight_attr str Edge weight attribute used for Louvain (default = 'ret_n')

|

normalize_edge_weights(G, weight='ret')

Normalizes the weights to sum to 1 across all edges.

Parameter Type Description
G networkx.Graph Graph whose edge weights are to be normalized
weight str Edge attribute to normalize (default = 'ret')

If you are running RNBRW simulations in parallel (e.g., on an HPC cluster), you can aggregate the walk counts and assign RNBRW weights manually using the function below.

# After accumulating total walk counts across jobs:
G_weighted = assign_rnbrw_weights(G, T_total)
# G must have 'enum' attributes on edges
Parameter Type Description
G networkx.Graph Input graph with edges having 'enum' attribute for indexing
T_total np.ndarray Array of edge hit counts from RNBRW simulations (same order as 'enum')

Use this when you aggregate RNBRW cycle counts manually and want to assign weights post-hoc.

Citation

If you use this package in your research, please cite:

@article{moradi2018new, title={A new method for incorporating network cyclic structures to improve community detection}, author={Moradi, Behnaz and Shakeri, Heman and Poggi-Corradini, Pietro and Higgins, Michael}, journal={arXiv preprint arXiv:1805.07484}, year={2018} } Or use the “Cite this repository” button above.

License

This project is licensed under the MIT License © 2025 Behnaz Moradi-Jamei.

Documentation

Full documentation is available at Read the Docs.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rnbrw-0.1.6.3.tar.gz (10.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

rnbrw-0.1.6.3-py3-none-any.whl (9.0 kB view details)

Uploaded Python 3

File details

Details for the file rnbrw-0.1.6.3.tar.gz.

File metadata

  • Download URL: rnbrw-0.1.6.3.tar.gz
  • Upload date:
  • Size: 10.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for rnbrw-0.1.6.3.tar.gz
Algorithm Hash digest
SHA256 289096eafd7c8e46cdad6262be922eeb3b19009b2a5dc35a50b927e6d04bd199
MD5 37485e1d4387a8d309773764561e3ee8
BLAKE2b-256 5444f08f990b3723cf6a8e42849256239e25a69d509587413b2ee5d340a3482f

See more details on using hashes here.

File details

Details for the file rnbrw-0.1.6.3-py3-none-any.whl.

File metadata

  • Download URL: rnbrw-0.1.6.3-py3-none-any.whl
  • Upload date:
  • Size: 9.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for rnbrw-0.1.6.3-py3-none-any.whl
Algorithm Hash digest
SHA256 3dbdb857147df8ca964832a49a237cf449fef78df3085b66a7721f22ba36bb90
MD5 056ca7e84f5f9967a0acc407c6879338
BLAKE2b-256 bcc00f519f7ba2708859f1fb0d4139979738296fb26dd7925d078745e73f20f1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page