Renewal Non-Backtracking Random Walk (RNBRW) for community detection

RNBRW

RNBRW (Renewal Non-Backtracking Random Walks) is a Python package for estimating edge-level importance in networks using random walks that restart upon cycle closure. These weights can be used to improve community detection algorithms like Louvain.
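To make "restart upon cycle closure" concrete, here is a minimal pure-Python sketch of a single walk. It is an illustration only, not the package's implementation: the function name rnbrw_single_walk, the adjacency-dict input, and the choice to start from a uniformly random directed edge are all assumptions made for this example.

```python
import random
from collections import Counter


def rnbrw_single_walk(adj, rng):
    """Run one walk; return the edge that closes a cycle, or None at a dead end."""
    # Assumption: the walk starts on a uniformly random directed edge (prev -> cur).
    directed_edges = [(u, v) for u in adj for v in adj[u]]
    prev, cur = rng.choice(directed_edges)
    visited = {prev, cur}
    while True:
        # Non-backtracking step: never return to the node we just left.
        choices = [w for w in adj[cur] if w != prev]
        if not choices:
            return None  # dead end (e.g. a degree-1 node): no cycle was closed
        nxt = rng.choice(choices)
        if nxt in visited:
            # Revisiting a node closes a cycle; this edge gets the credit.
            return tuple(sorted((cur, nxt)))
        visited.add(nxt)
        prev, cur = cur, nxt


# A triangle (0, 1, 2) with a pendant node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
counts = Counter()
for _ in range(200):
    edge = rnbrw_single_walk(adj, rng)
    if edge is not None:
        counts[edge] += 1
print(dict(counts))
```

In this toy graph only the triangle edges can collect counts, because the pendant edge (2, 3) lies on no cycle. That is the intuition behind using these counts as edge weights for community detection.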

Based on:

Moradi, B., Shakeri, H., Poggi-Corradini, P., & Higgins, M.
A new method for incorporating network cyclic structures to improve community detection
arXiv:1805.07484


Installation

pip install rnbrw

Features

Parallelization & HPC Usage

rnbrw supports parallel execution of the RNBRW simulations using joblib. This allows for efficient simulation on multi-core machines or High-Performance Computing (HPC) clusters.

You can control parallel execution using the n_jobs parameter:

  • Local machine: Set n_jobs=-1 to use all available CPU cores, or specify the exact number of cores to use (e.g., n_jobs=4).

  • High-Performance Computing (HPC): When running on an HPC cluster, n_jobs can be tuned according to the allocated CPUs in your job script. For best performance, align n_jobs with the number of cores requested via sbatch, qsub, or your cluster’s job scheduler.
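One way to keep n_jobs aligned with the scheduler allocation is to read it from the environment instead of hard-coding it. The sketch below assumes a SLURM cluster, where SLURM_CPUS_PER_TASK is set when --cpus-per-task is requested. The helper name resolve_n_jobs is hypothetical and not part of rnbrw.

```python
import os


def resolve_n_jobs(default=1, env=None):
    """Return the CPU count allocated by SLURM, or `default` when unavailable."""
    env = os.environ if env is None else env
    raw = env.get("SLURM_CPUS_PER_TASK")
    try:
        n = int(raw)
    except (TypeError, ValueError):
        # Variable is unset or not a number: fall back to the default.
        return default
    return n if n > 0 else default


n_jobs = resolve_n_jobs(default=1)
print(f"using n_jobs={n_jobs}")
```

The resulting value can then be passed on, for example as compute_weights(G, n_jobs=n_jobs). Other schedulers expose the allocation under different variable names.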

HPC Usage

RNBRW supports both single-walk and batched multi-walk execution on HPC clusters.

  • Use compute_weights(..., only_walk=True) for one walk per job (simple job arrays).
  • Use walk_hole_E for batched jobs (e.g. 300 walks per job), which guarantees independence without mutating the graph.
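When choosing between the two modes, it can help to work out how many array jobs a target number of walks requires. The following sketch does that arithmetic. The helper plan_job_array is hypothetical, and the target of about factor × m walks follows the nsim recommendation given later in this document.

```python
import math


def plan_job_array(num_edges, walks_per_job=300, factor=1.0):
    """Return (num_jobs, total_walks) covering about factor * num_edges walks."""
    target = max(1, math.ceil(factor * num_edges))
    num_jobs = math.ceil(target / walks_per_job)
    return num_jobs, num_jobs * walks_per_job


# Example: a graph with 5000 edges and 300 walks per job.
num_jobs, total = plan_job_array(num_edges=5000)
print(f"#SBATCH --array=0-{num_jobs - 1}   # {total} walks in total")
```

With one walk per job (walks_per_job=1), the array size equals the number of walks. This is why batching is usually the more practical option for large graphs.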

Step 1: Run RNBRW Walks in Parallel Jobs

Option A – One Walk Per Job (simple arrays)

SLURM example: a basic job array script, rnbrw_job.sh:

#!/bin/bash
#SBATCH --job-name=rnbrw_walk
#SBATCH --output=logs/rnbrw_%A_%a.out
#SBATCH --error=logs/rnbrw_%A_%a.err
#SBATCH --array=0-19              # 20 total jobs
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G
#SBATCH --time=00:10:00

module load python/3.10
source activate rnbrw-env

# Run the Python script with job array index
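python run_rnbrw_batch.py $SLURM_ARRAY_TASK_ID

Each array task then runs the Python script run_rnbrw_batch.py:

import pickle
import sys

import numpy as np
from rnbrw.weights import compute_weights

job_id = int(sys.argv[1])
seed = 1000 + job_id

# Load the graph (shared across all jobs).
# nx.read_gpickle was removed in NetworkX 3.0, so unpickle the file directly.
with open("mygraph.gpickle", "rb") as f:
    G = pickle.load(f)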

# Single walk using only_walk mode
G = compute_weights(G, nsim=1, seed=seed, only_walk=True)

# Extract edge counts
m = G.number_of_edges()
T = np.zeros(m)
for u, v in G.edges():
    T[G[u][v]['enum']] = G[u][v]['ret']

np.save(f"T_partial_{job_id}.npy", T)
 

Option B – Batched Walks Per Job (e.g. 300 walks/job)

import sys, pickle
import numpy as np
from rnbrw.weights.rnbrw import walk_hole_E   # low-level primitive

job_id = int(sys.argv[1])
walks_per_job = 300
seeds = [1000 + job_id * walks_per_job + i for i in range(walks_per_job)]

# Load the graph once
with open("mygraph.gpickle", "rb") as f:
    G = pickle.load(f)

m = G.number_of_edges()
T = np.zeros(m)

for s in seeds:
    T += walk_hole_E(G, seed=s)   # independent walks, no mutation

np.save(f"T_partial_{job_id}.npy", T)
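The seed layout above gives every job its own contiguous block of seeds, so no two walks share a seed. The sketch below checks that property for a 20-job array. The helper job_seeds is hypothetical and only mirrors the seed formula from the batch script.

```python
def job_seeds(job_id, walks_per_job=300, base_seed=1000):
    """Seeds used by one batch job: a contiguous, job-specific block."""
    start = base_seed + job_id * walks_per_job
    return range(start, start + walks_per_job)


num_jobs = 20
all_seeds = [s for job in range(num_jobs) for s in job_seeds(job)]

# No seed is used twice across the whole array.
assert len(all_seeds) == len(set(all_seeds)) == num_jobs * 300
print(min(all_seeds), max(all_seeds))
```

Note that the Option A seeds (1000 + job_id) fall inside the first block here, so partial results from the two options should not be combined without changing one of the base seeds.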

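Step 2: Aggregate Outputs (on the head node)

import numpy as np

num_jobs = 20  # must match the size of the SLURM job array

# Sum the per-job edge counts and save the total for Step 3
T_total = sum(np.load(f"T_partial_{i}.npy") for i in range(num_jobs))
np.save("T_total.npy", T_total)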
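On a real cluster some array tasks may fail or time out, in which case a plain sum either crashes on the first missing file or hides a shape mismatch. The sketch below is a more defensive version of the aggregation step. The function name aggregate_partials is hypothetical, and the file naming follows the T_partial_{i}.npy convention used above.

```python
import os
import tempfile

import numpy as np


def aggregate_partials(directory, num_jobs):
    """Sum T_partial_{i}.npy for i in range(num_jobs), reporting missing jobs."""
    total = None
    missing = []
    for i in range(num_jobs):
        path = os.path.join(directory, f"T_partial_{i}.npy")
        if not os.path.exists(path):
            missing.append(i)
            continue
        part = np.load(path)
        if total is None:
            total = np.zeros_like(part, dtype=float)
        if part.shape != total.shape:
            raise ValueError(f"job {i}: shape {part.shape} != {total.shape}")
        total += part
    if missing:
        raise FileNotFoundError(f"missing partial results for jobs {missing}")
    return total


# Self-contained demo with three fake partial results.
with tempfile.TemporaryDirectory() as d:
    for i in range(3):
        np.save(os.path.join(d, f"T_partial_{i}.npy"), np.array([i, 1.0, 0.0]))
    T_total = aggregate_partials(d, num_jobs=3)
print(T_total)
```

Failing loudly on missing jobs avoids silently computing weights from fewer walks than intended. The listed job indices can be resubmitted before aggregating again.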

Step 3: Assign Weights to Graph

import pickle

import numpy as np
from rnbrw.utils import assign_rnbrw_weights

# nx.read_gpickle was removed in NetworkX 3.0, so unpickle the file directly.
with open("mygraph.gpickle", "rb") as f:
    G = pickle.load(f)
T_total = np.load("T_total.npy")

# Assign raw + normalized weights to the graph
G = assign_rnbrw_weights(G, T_total)

Step 4: Run Louvain

from rnbrw.community import detect_communities_louvain

partition = detect_communities_louvain(G, weight_attr='ret_n')

This makes rnbrw especially suitable for research environments where cycles and edge roles must be computed across very large networks.

Local Usage

On a single machine, call compute_weights directly and set n_jobs to run the simulations in parallel:

import networkx as nx
from rnbrw.weights import compute_weights
from rnbrw.community import detect_communities_louvain

# Create or load a graph
G = nx.karate_club_graph()

# Compute RNBRW weights
# Recommendation: nsim should be at least the number of edges in G
G = compute_weights(G, nsim=G.number_of_edges(), n_jobs=4)

# Edge weights (normalized)
weights = [G[u][v]['ret_n'] for u, v in G.edges()]

# Detect communities
partition = detect_communities_louvain(G, weight_attr='ret_n')

API Reference

G_weighted = compute_weights(
    G,               # networkx.Graph
    nsim=None,       # Optional[int], defaults to factor * m (num edges)
    factor=1.0,      # float, multiplies number of edges to compute default nsim
    seed=None,       # Optional[int], random seed for reproducibility
    n_jobs=1,        # int, number of parallel jobs (-1 = all CPUs)
    init_weight=0.001,# float, initial placeholder edge weights
    only_walk=False  # bool, run single walk (no aggregation) for HPC
)

Simulates RNBRW on graph G to assign edge importance scores as weights.

Parameters for compute_weights

Parameter | Type | Default | Description
G | networkx.Graph | required | Input undirected graph.
nsim | int or None | None | Number of RNBRW simulations. If None, it defaults to factor × m, where m is the number of edges.
factor | float | 1.0 | Scaling factor to set nsim dynamically based on graph size (m).
n_jobs | int | 1 | Number of parallel jobs (-1 uses all available CPUs).
seed | int or None | None | Random seed for reproducibility. Each simulation is seeded with seed + i.
init_weight | float | 0.01 | Initial placeholder weight for each edge before running RNBRW.
only_walk | bool | False | If True, performs a single walk without aggregating weights (for HPC use).
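The interaction between nsim, factor, and seed can be sketched as follows. This is a hypothetical helper, not the package's code: the name resolve_simulation_plan is invented, and truncating factor × m to an integer is an assumption, since the rounding rule is not stated here.

```python
def resolve_simulation_plan(num_edges, nsim=None, factor=1.0, seed=None):
    """Return (nsim, seeds) following the documented defaults."""
    if nsim is None:
        # Documented default: factor × m. The int() truncation is an assumption.
        nsim = int(factor * num_edges)
    # Documented rule: simulation i is seeded with seed + i.
    if seed is None:
        seeds = [None] * nsim
    else:
        seeds = [seed + i for i in range(nsim)]
    return nsim, seeds


# The karate club graph has 78 edges.
nsim, seeds = resolve_simulation_plan(num_edges=78, factor=2.0, seed=10)
print(nsim, seeds[:3])
```

An explicit nsim takes precedence over factor, so factor only matters when nsim is left as None.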

Notes

  • Recommended: for stable, convergent RNBRW edge weights, set nsim to about m, the number of edges in the graph, or larger.
  • If only_walk=True, the function returns the walk output without updating graph weights, which is useful for splitting the work across HPC batch jobs manually.

detect_communities_louvain(G, weight_attr='ret_n')

Runs Louvain on G using edge weights.

Parameter | Type | Description
G | networkx.Graph | Weighted input graph
weight_attr | str | Edge weight attribute used for Louvain (default = 'ret_n')

normalize_edge_weights(G, weight='ret')

Normalizes the weights to sum to 1 across all edges.

Parameter | Type | Description
G | networkx.Graph | Graph whose edge weights are to be normalized
weight | str | Edge attribute to normalize (default = 'ret')

assign_rnbrw_weights(G, T_total)

If you run RNBRW simulations in parallel (e.g., on an HPC cluster), aggregate the walk counts across jobs and then assign the RNBRW weights manually with this function.

# After accumulating total walk counts across jobs:
G_weighted = assign_rnbrw_weights(G, T_total)
# G must have 'enum' attributes on edges

Parameter | Type | Description
G | networkx.Graph | Input graph with edges having 'enum' attribute for indexing
T_total | np.ndarray | Array of edge hit counts from RNBRW simulations (same order as 'enum')

Use this when you aggregate RNBRW cycle counts manually and want to assign weights post-hoc.
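To show how an 'enum'-indexed count array maps back onto edges, here is a rough stand-in for this kind of post-hoc assignment. It is a sketch under assumptions, not the package's implementation: the function name assign_counts_sketch is invented, and writing raw counts to 'ret' and sum-to-one values to 'ret_n' is inferred from the attribute names used elsewhere in this document. The real function may differ, for example in how it handles init_weight.

```python
import networkx as nx
import numpy as np


def assign_counts_sketch(G, T_total, count_attr="ret", norm_attr="ret_n"):
    """Write raw and normalized counts onto edges, indexed by each edge's 'enum'."""
    T_total = np.asarray(T_total, dtype=float)
    if T_total.shape != (G.number_of_edges(),):
        raise ValueError("T_total must have exactly one entry per edge")
    total = T_total.sum()
    for u, v, data in G.edges(data=True):
        count = T_total[data["enum"]]
        data[count_attr] = float(count)
        # Normalized weights sum to 1 across edges; guard against an all-zero array.
        data[norm_attr] = float(count / total) if total > 0 else 0.0
    return G


# Toy graph: a 4-cycle whose edges are enumerated 0..3.
G = nx.cycle_graph(4)
for idx, (u, v) in enumerate(G.edges()):
    G[u][v]["enum"] = idx

G = assign_counts_sketch(G, [3, 1, 0, 4])
print([(u, v, d["ret"], d["ret_n"]) for u, v, d in G.edges(data=True)])
```

The point of the 'enum' attribute is that the array order stays stable across jobs. Every job must therefore load a graph with the same edge enumeration.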

Citation

If you use this package in your research, please cite:

@article{moradi2018new,
  title={A new method for incorporating network cyclic structures to improve community detection},
  author={Moradi, Behnaz and Shakeri, Heman and Poggi-Corradini, Pietro and Higgins, Michael},
  journal={arXiv preprint arXiv:1805.07484},
  year={2018}
}

Alternatively, use the “Cite this repository” button on the project's repository page.

License

This project is licensed under the MIT License © 2025 Behnaz Moradi-Jamei.

Documentation

Full documentation is available at Read the Docs.
