Inductive Node-Splitting Cross-Validation for community detection in networks

incv-community-detection

Inductive Node-Splitting Cross-Validation (INCV) for community detection in network data.

Overview

This package implements Inductive Node-Splitting Cross-Validation (INCV) for selecting the number of communities in Stochastic Block Models (SBMs). It also provides two competing methods, Edge Cross-Validation (ECV) and Node Cross-Validation (NCV), so that the selected number of communities can be compared across approaches.

Key features

  • INCV (f-fold and random split): Node-level cross-validation that splits nodes into folds, fits spectral clustering on training nodes, infers held-out node communities, and evaluates via negative log-likelihood and MSE.
  • ECV: Edge holdout cross-validation for blockmodel selection.
  • NCV: Node holdout cross-validation for blockmodel selection.
  • Network simulation: Generators for SBM and planted-partition models.
  • Real-data applications: Data loaders for International Trade and 108th U.S. Senate datasets.
  • Visualization: CV loss curve plots, network graphs with community coloring.
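
The node-splitting and evaluation steps in the INCV bullet above can be sketched in plain NumPy. This is a minimal illustration, not the package's implementation: split_nodes and heldout_losses are hypothetical names, the community labels and block probability matrix B are taken as given rather than estimated by spectral clustering, and scoring only pairs among held-out nodes is an assumption, since the README does not say which pairs the package scores.

```python
import numpy as np

def split_nodes(n, f, rng):
    """Randomly partition node indices 0..n-1 into f folds of near-equal size."""
    return np.array_split(rng.permutation(n), f)

def heldout_losses(A, labels, B, test_idx, eps=1e-10):
    """Bernoulli negative log-likelihood and MSE over held-out node pairs.

    A: (n, n) symmetric 0/1 adjacency matrix
    labels: (n,) integer community ids
    B: (K, K) block edge probabilities
    test_idx: indices of held-out nodes
    """
    sub = A[np.ix_(test_idx, test_idx)]
    z = labels[test_idx]
    # Predicted edge probability for each held-out pair, clipped away from 0 and 1
    # so the logarithms stay finite.
    P = np.clip(B[np.ix_(z, z)], eps, 1 - eps)
    iu = np.triu_indices(len(test_idx), k=1)  # count each unordered pair once
    a, p = sub[iu], P[iu]
    nll = -np.sum(a * np.log(p) + (1 - a) * np.log(1 - p))
    mse = np.mean((a - p) ** 2)
    return nll, mse

# Tiny hand-checkable case: every pair has predicted probability 0.5.
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 0, 1],
              [0, 1, 1, 0]])
labels = np.array([0, 0, 1, 1])
B = np.full((2, 2), 0.5)
nll, mse = heldout_losses(A, labels, B, np.arange(4))
folds = split_nodes(10, 3, np.random.default_rng(0))
print(nll, mse, [len(fold) for fold in folds])
```

With all probabilities at 0.5, each of the six pairs contributes log(2) to the NLL and 0.25 to the squared error, which makes the output easy to check by hand.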

Installation

# From PyPI
pip install incv-community-detection

# Or install from GitHub
pip install git+https://github.com/ivylinzhang97/incv-community-detection.git

# With optional network plotting support
pip install "incv-community-detection[network]"

Quick start

Simulate a network and select K with INCV

import numpy as np
from incv import community_sim, nscv_f_fold

rng = np.random.default_rng(42)
membership, A = community_sim(k=3, n=300, n1=60, p=0.3, q=0.1, rng=rng)

# Run 10-fold INCV
result = nscv_f_fold(A, k_vec=list(range(2, 8)), f=10, rng=rng)
print(f"Selected K (NLL): {result['k_loss']}")
print(f"Selected K (MSE): {result['k_mse']}")
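
For readers who want to see what a simulated network of this kind looks like without installing the package, a planted-partition graph can be sketched in plain NumPy. This is not community_sim: planted_partition is a hypothetical helper, it assumes p is the within-community and q the between-community edge probability (the usual convention, not confirmed by this README), it uses balanced communities, and it ignores the n1 argument.

```python
import numpy as np

def planted_partition(n, k, p, q, rng):
    """Sample labels and a symmetric 0/1 adjacency matrix with k balanced communities."""
    labels = np.arange(n) % k                     # balanced assignment, for illustration
    same = labels[:, None] == labels[None, :]     # True where two nodes share a community
    probs = np.where(same, p, q)                  # per-pair edge probability
    # Sample each unordered pair once (strict upper triangle), then mirror it.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    A = (upper | upper.T).astype(int)
    return labels, A

rng = np.random.default_rng(0)
labels, A = planted_partition(n=300, k=3, p=0.3, q=0.1, rng=rng)

same = labels[:, None] == labels[None, :]
off_diag = ~np.eye(300, dtype=bool)
within = A[same & off_diag].mean()    # empirical within-community density
between = A[~same].mean()             # empirical between-community density
print(f"within: {within:.3f}, between: {between:.3f}")
```

The two printed densities should land near p and q respectively, which is a quick sanity check on any SBM generator.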

Compare with ECV and NCV

# Reuses A and rng from the INCV example above
from incv import ecv_block, ncv_select

ecv = ecv_block(A, max_K=6, B=5, rng=rng)
print(f"ECV model: {ecv['l2_model']}")

ncv = ncv_select(A, max_K=6, cv=3, rng=rng)
print(f"NCV model: {ncv['l2_model']}")
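
The difference between edge holdout and node holdout is in what gets hidden: ECV masks individual node pairs, while node-based methods set aside whole nodes. A minimal sketch of an edge holdout split follows. The helper holdout_pairs and the frac parameter are hypothetical, and how ecv_block fits and scores models on such a split is not covered here.

```python
import numpy as np

def holdout_pairs(A, frac, rng):
    """Hide a random fraction of node pairs from a symmetric adjacency matrix.

    Returns the training matrix (held-out entries set to 0) and a symmetric
    boolean mask marking the held-out pairs.
    """
    n = A.shape[0]
    rows, cols = np.triu_indices(n, k=1)          # all unordered pairs
    m = rows.size
    picked = rng.choice(m, size=int(round(frac * m)), replace=False)
    mask = np.zeros((n, n), dtype=bool)
    mask[rows[picked], cols[picked]] = True
    mask = mask | mask.T                          # keep the mask symmetric
    A_train = np.where(mask, 0, A)
    return A_train, mask

# Ring graph on 6 nodes as a small deterministic input.
n = 6
A = np.zeros((n, n), dtype=int)
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

A_train, mask = holdout_pairs(A, frac=0.2, rng=np.random.default_rng(0))
print(mask.sum() // 2, "of", n * (n - 1) // 2, "pairs held out")
```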

Plot CV loss curves

# Reuses result from the INCV example; the K values must match the k_vec passed there
from incv import plot_cv_loss

plot_cv_loss(
    list(range(2, 8)), result["cv_loss"], result["cv_mse"],
    k_best_loss=result["k_loss"], k_best_mse=result["k_mse"],
    save_path="cv_loss.png",
)

Package structure

Module             Contents
incv.core          community_sim(), community_sim_sbm(), sbm_spectral_clustering(), sbm_prob(), nscv_f_fold(), nscv_random_split()
incv.competitors   ecv_block(), ncv_select()
incv.simulations   sim_folds(), sim_community(), sim_compare()
incv.applications  load_trade_data(), load_senate_data()
incv.plotting      plot_cv_loss(), plot_network(), plot_fold_comparison(), plot_community_comparison()

Dependencies

Required: numpy, scipy, scikit-learn, pandas, matplotlib

Optional (for network plots): networkx
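
Because networkx is optional, a script that draws network plots may want to check for it before calling the plotting functions. A small sketch using only the standard library follows; has_networkx is a hypothetical helper, and which plotting functions actually need networkx is not stated in this README.

```python
from importlib.util import find_spec

def has_networkx():
    """Return True if networkx can be imported in the current environment."""
    return find_spec("networkx") is not None

if not has_networkx():
    # The extras name comes from the installation instructions above.
    print('networkx not found; install with: pip install "incv-community-detection[network]"')
```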

License

MIT

Citation

If you use this package in your research, please cite the paper on Inductive Node-Splitting Cross-Validation in Networks.
