Inductive Node-Splitting Cross-Validation for community detection in networks
incv-community-detection
Inductive Node-Splitting Cross-Validation (INCV) for community detection in network data.
Overview
This package implements Inductive Node-Splitting Cross-Validation (INCV) for selecting the number of communities in Stochastic Block Models (SBMs). It also provides two competing methods, Edge Cross-Validation (ECV) and Node Cross-Validation (NCV), so that their selections can be compared with INCV's on the same network.
Key features
- INCV (f-fold and random split): Node-level cross-validation that splits nodes into folds, fits spectral clustering on training nodes, infers held-out node communities, and evaluates via negative log-likelihood and MSE.
- ECV: Edge holdout cross-validation for blockmodel selection.
- NCV: Node holdout cross-validation for blockmodel selection.
- Network simulation: Generators for SBM and planted-partition models.
- Real-data applications: Data loaders for International Trade and 108th U.S. Senate datasets.
- Visualization: CV loss curve plots and network graphs colored by community.
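The evaluation step named in the INCV feature above (held-out negative log-likelihood and MSE under an SBM) can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the helper names `simulate_planted_partition`, `block_probs`, and `heldout_loss` are hypothetical, the true labels are used in place of the spectral-clustering and inference steps, and scoring only pairs among held-out nodes is an assumption.

```python
import math
import random


def simulate_planted_partition(n, k, p, q, rng):
    """Return balanced labels and a symmetric 0/1 adjacency matrix (list of lists)."""
    labels = [i % k for i in range(n)]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                A[i][j] = A[j][i] = 1
    return labels, A


def block_probs(A, labels, nodes, k):
    """Estimate the k x k edge-probability matrix from pairs among `nodes`."""
    edges = [[0] * k for _ in range(k)]
    pairs = [[0] * k for _ in range(k)]
    for a, i in enumerate(nodes):
        for j in nodes[a + 1:]:
            r, s = labels[i], labels[j]
            edges[r][s] += A[i][j]
            pairs[r][s] += 1
            if r != s:  # keep the estimate symmetric
                edges[s][r] += A[i][j]
                pairs[s][r] += 1
    return [
        [edges[r][s] / pairs[r][s] if pairs[r][s] else 0.0 for s in range(k)]
        for r in range(k)
    ]


def heldout_loss(A, labels, test, B, eps=1e-10):
    """Average Bernoulli NLL and squared error over pairs of held-out nodes."""
    nll = mse = 0.0
    count = 0
    for a, i in enumerate(test):
        for j in test[a + 1:]:
            prob = min(max(B[labels[i]][labels[j]], eps), 1 - eps)
            edge = A[i][j]
            nll -= edge * math.log(prob) + (1 - edge) * math.log(1 - prob)
            mse += (edge - prob) ** 2
            count += 1
    return nll / count, mse / count


rng = random.Random(0)
labels, A = simulate_planted_partition(n=150, k=3, p=0.3, q=0.1, rng=rng)

# Split nodes: 50 held out, 100 for training.
nodes = list(range(150))
rng.shuffle(nodes)
test, train = nodes[:50], nodes[50:]

B_hat = block_probs(A, labels, train, k=3)
nll, mse = heldout_loss(A, labels, test, B_hat)
print(f"held-out NLL = {nll:.3f}, MSE = {mse:.3f}")
```

In a full node-splitting procedure the labels would be estimated on the training nodes and inferred for the held-out nodes for each candidate K, and the losses would be averaged over folds.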
Installation
```bash
# From PyPI
pip install incv-community-detection

# Or install from GitHub
pip install git+https://github.com/ivylinzhang97/incv-community-detection.git

# With optional network plotting support
pip install "incv-community-detection[network]"
```
Quick start
Simulate a network and select K with INCV
```python
import numpy as np
from incv import community_sim, nscv_f_fold

rng = np.random.default_rng(42)
membership, A = community_sim(k=3, n=300, n1=60, p=0.3, q=0.1, rng=rng)

# Run 10-fold INCV
result = nscv_f_fold(A, k_vec=list(range(2, 8)), f=10, rng=rng)
print(f"Selected K (NLL): {result['k_loss']}")
print(f"Selected K (MSE): {result['k_mse']}")
```
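In a typical cross-validation setup, the selected K is the candidate with the smallest average held-out loss. The sketch below shows that rule on its own. It assumes, without confirmation from the package, that `k_loss` relates to `cv_loss` in this way; the package may aggregate folds or break ties differently. `select_k` is a hypothetical helper and the loss values are made up for illustration.

```python
def select_k(k_vec, cv_loss):
    """Return the candidate K with the smallest CV loss (first one wins ties)."""
    if len(k_vec) != len(cv_loss):
        raise ValueError("k_vec and cv_loss must have the same length")
    best = min(range(len(k_vec)), key=lambda i: cv_loss[i])
    return k_vec[best]


# Made-up loss values, for illustration only (not package output).
k_vec = list(range(2, 8))
toy_loss = [0.52, 0.41, 0.43, 0.44, 0.46, 0.47]
print(select_k(k_vec, toy_loss))  # prints 3
```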
Compare with ECV and NCV
```python
from incv import ecv_block, ncv_select

# Reuses A and rng from the INCV example above
ecv = ecv_block(A, max_K=6, B=5, rng=rng)
print(f"ECV model: {ecv['l2_model']}")

ncv = ncv_select(A, max_K=6, cv=3, rng=rng)
print(f"NCV model: {ncv['l2_model']}")
```
Plot CV loss curves
```python
from incv import plot_cv_loss

# Reuses result from the INCV example above; the first argument matches its k_vec
plot_cv_loss(
    list(range(2, 8)), result["cv_loss"], result["cv_mse"],
    k_best_loss=result["k_loss"], k_best_mse=result["k_mse"],
    save_path="cv_loss.png",
)
```
Package structure
| Module | Contents |
|---|---|
| incv.core | community_sim(), community_sim_sbm(), sbm_spectral_clustering(), sbm_prob(), nscv_f_fold(), nscv_random_split() |
| incv.competitors | ecv_block(), ncv_select() |
| incv.simulations | sim_folds(), sim_community(), sim_compare() |
| incv.applications | load_trade_data(), load_senate_data() |
| incv.plotting | plot_cv_loss(), plot_network(), plot_fold_comparison(), plot_community_comparison() |
Dependencies
Required: numpy, scipy, scikit-learn, pandas, matplotlib
Optional (for network plots): networkx
License
MIT
Citation
If you use this package in your research, please cite the paper on Inductive Node-Splitting Cross-Validation in Networks.
Download files
File details
Details for the file incv_community_detection-0.1.0.tar.gz.
File metadata
- Download URL: incv_community_detection-0.1.0.tar.gz
- Upload date:
- Size: 1.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 35ea00963843ba2ccbe21658c779b0e4b2272ed211a08aa362a4d31e373141a4 |
| MD5 | 527490c7298efca0090b86e51246cd76 |
| BLAKE2b-256 | f61f9a9bcf4d09737cd2f843806c1ade6ed62b174a9a2f2628ccf77c6c631008 |
File details
Details for the file incv_community_detection-0.1.0-py3-none-any.whl.
File metadata
- Download URL: incv_community_detection-0.1.0-py3-none-any.whl
- Upload date:
- Size: 1.5 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e6275c4cf62e477332f14ef87e132327596a591183ec431d81ef948654fd3aef |
| MD5 | e44eba8bb71816c5b60112be61ce2e20 |
| BLAKE2b-256 | 36cefb594e0a3c02a81b0de701bdf917b94273a515bb0a4c8ca5062d8b1c0c91 |