Python package for space-filling sampling of network nodes, designed to improve semi-supervised classification via label propagation.
NetworkSampler: smart network sampling
NetworkSampler is a Python package for sampling network nodes using space-filling designs. It works by minimizing the geodesic distance of the sample in the network, and maximizing the coverage over the graph.
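To make the idea of coverage concrete, here is a minimal pure-Python sketch that is independent of the package. It assumes one common way to quantify coverage: the largest geodesic distance from any node to its nearest sampled node (smaller is better). The helper names `bfs_distances` and `coverage_radius` are hypothetical, and the criterion NetworkSampler actually optimizes is defined in the publication and may differ.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic (shortest-path) distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def coverage_radius(adj, sample):
    """Largest distance from any node to its nearest sampled node.

    Assumes the graph is connected, so every node is reachable from every seed.
    """
    per_seed = [bfs_distances(adj, s) for s in sample]
    return max(min(d[node] for d in per_seed) for node in adj)


# Path graph 0 - 1 - ... - 8 as an adjacency dict.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 8] for i in range(9)}

print(coverage_radius(path, [0, 1, 2]))  # clustered sample: 6
print(coverage_radius(path, [1, 4, 7]))  # spread-out sample: 1
```

The spread-out sample leaves every node within one hop of a sampled node, while the clustered sample leaves the far end of the path six hops away.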
This package is particularly helpful for semi-supervised machine learning: you can select a sample of nodes to label manually, then extend the classification to the rest of the network via label propagation. The space-filling sampling strategy ensures that the selected seed nodes are well distributed across the network, leading to more accurate and reliable predictions than traditional sampling methods.
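As an illustration of how labels can spread from a few seed nodes, the sketch below implements a deliberately simplified, majority-vote form of label propagation in pure Python. It is not the algorithm used in the publication, and `propagate_labels` is a hypothetical helper, not part of NetworkSampler.

```python
from collections import Counter


def propagate_labels(adj, seeds):
    """Spread seed labels outward, one hop per round.

    Each unlabeled node takes the most common label among its already
    labeled neighbors. Labels are fixed once assigned (seeds included).
    """
    labels = dict(seeds)
    while len(labels) < len(adj):
        updates = {}
        for node in adj:
            if node in labels:
                continue
            votes = Counter(labels[n] for n in adj[node] if n in labels)
            if votes:
                # Most common label; ties broken alphabetically for determinism.
                updates[node] = min(votes, key=lambda lab: (-votes[lab], lab))
        if not updates:  # remaining nodes are unreachable from any seed
            break
        labels.update(updates)
    return labels


# Two 4-node cliques (0-3 and 4-7) joined by the single edge 3 - 4.
adj = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}

# One manually labeled seed per clique.
print(propagate_labels(adj, {0: "a", 7: "b"}))
```

With one seed in each clique, every node receives the label of its own clique, which is the situation a well-spread sample is meant to produce.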
For more details check the publication:
del Gobbo E., Fontanella L., Ippoliti L., Di Zio S., Fontanella S., Cucco A. (2026). A space-filling sampling approach for collective classification of social media data. Advances in Data Analysis and Classification. https://doi.org/10.1007/s11634-026-00670-z
- Website (including documentation): https://edgresearch.github.io/pylib_networksampler
- Source: https://github.com/edgresearch/pylib_networksampler
- Bug reports: https://github.com/edgresearch/pylib_networksampler/issues
- Pip Package: https://pypi.org/project/networksampler
Tutorials
A comprehensive tutorial is available in the official documentation.
If you prefer a hands-on approach, the repository includes a complete Jupyter Notebook that illustrates the library's features step by step. You can open and run it in your browser using Google Colab, or view the source file at notebooks/basic_tutorial.ipynb.
Simple example
Extract a sample that maximizes the coverage of a network:
import networksampler
import networkx as nx
# Generate a random graph with 1000 nodes (Gaussian random partition model)
# and build its dense adjacency matrix
G = nx.gaussian_random_partition_graph(1000, 500, 100, 0.20, 0.1)
A = nx.adjacency_matrix(G).todense()
# Extract the sample:
# nodes number = 10
# p = -4 and q = 4
# r = 0.1
result = networksampler.sa_sampling(A, 10, -4, 4, 0.1)
print(result)
# Example output (the graph is generated at random, so values will differ):
# (array([ 14, 135, 213, 256, 345, 560, 678, 690, 900, 967]),
# np.float64(7.727146854012883))
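A possible next step is to split the nodes into those to label by hand and those left for label propagation. The sketch below reuses the indices printed in the example and assumes they refer to rows of the adjacency matrix `A`, that is, to nodes 0 to 999. That interpretation and the variable names are assumptions, not documented behavior.

```python
# Node indices copied from the example output (first element of the result).
selected = [14, 135, 213, 256, 345, 560, 678, 690, 900, 967]
n_nodes = 1000  # size of the example graph

seed_set = set(selected)
to_label = sorted(seed_set)  # nodes to annotate manually
to_predict = [i for i in range(n_nodes) if i not in seed_set]  # left to propagation

print(len(to_label), len(to_predict))  # 10 990
```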
Install
Install the latest version of NetworkSampler:
$ pip install networksampler
Bugs
Please report any bugs you find on the issue tracker: https://github.com/edgresearch/pylib_networksampler/issues. Or, even better, fork the repository on GitHub and create a pull request (PR). We welcome all changes, big or small, that improve the library.
How to cite this package in publications?
If you use this package in your research, please cite the following paper:
del Gobbo E., Fontanella L., Ippoliti L., Di Zio S., Fontanella S., Cucco A. (2026). A space-filling sampling approach for collective classification of social media data. Advances in Data Analysis and Classification. https://doi.org/10.1007/s11634-026-00670-z
Here you can find the citation in BibTeX format:
@article{delgobbo2026spacefilling,
author = {del Gobbo, Emiliano and Fontanella, Lara and Ippoliti, Luigi and Di Zio, Simone and Fontanella, Sara and Cucco, Alex},
title = {A space-filling sampling approach for collective classification of social media data},
journal = {Advances in Data Analysis and Classification},
year = {2026},
doi = {10.1007/s11634-026-00670-z},
url = {https://doi.org/10.1007/s11634-026-00670-z}
}
Here is the citation in RIS format:
TY - JOUR
AU - del Gobbo, Emiliano
AU - Fontanella, Lara
AU - Ippoliti, Luigi
AU - Di Zio, Simone
AU - Fontanella, Sara
AU - Cucco, Alex
PY - 2026
TI - A space-filling sampling approach for collective classification of social media data
T2 - Advances in Data Analysis and Classification
DO - 10.1007/s11634-026-00670-z
UR - https://doi.org/10.1007/s11634-026-00670-z
ER -
License
The library is released under the GNU LESSER GENERAL PUBLIC LICENSE v3 (see LICENSE).
The artwork (library logo) is under a proprietary license and cannot be used in other projects.
If you fork the project, you MUST indicate that it is a derivative project and must not use the logo to identify it (see artworks/LICENSE.rst).
Copyright (C) 2021-2026 Emiliano del Gobbo
Emiliano del Gobbo emidelgo@gmail.com
File details
Details for the file networksampler-0.9.5.tar.gz.
File metadata
- Download URL: networksampler-0.9.5.tar.gz
- Upload date:
- Size: 477.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 84cd909d21ad8d21ce4bfa8f4d59122510d5aae429f93f2e309589c9781d2625 |
| MD5 | a7236263fa0ba271a7c8bbe58bf33945 |
| BLAKE2b-256 | 9927c52a569aceea66ccaf90818540c3fe9706014164cf3dd12c71c7d6c3305d |
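A downloaded archive can be checked against the SHA256 digest listed for the source distribution. The following is a minimal sketch using only the standard library. The helper name `sha256_of_file` is hypothetical, and the commented usage line assumes the archive was saved in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed for networksampler-0.9.5.tar.gz.
EXPECTED = "84cd909d21ad8d21ce4bfa8f4d59122510d5aae429f93f2e309589c9781d2625"

# Usage, assuming the archive is in the current directory:
# assert sha256_of_file("networksampler-0.9.5.tar.gz") == EXPECTED
```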
Provenance
The following attestation bundles were made for networksampler-0.9.5.tar.gz:
Publisher: publish-pypi.yml on edgresearch/pylib_networksampler
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: networksampler-0.9.5.tar.gz
- Subject digest: 84cd909d21ad8d21ce4bfa8f4d59122510d5aae429f93f2e309589c9781d2625
- Sigstore transparency entry: 1058677289
- Sigstore integration time:
- Permalink: edgresearch/pylib_networksampler@f47f51df11d03a0efca1a1b236ef18eac5586b16
- Branch / Tag: refs/tags/v0.9.5
- Owner: https://github.com/edgresearch
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@f47f51df11d03a0efca1a1b236ef18eac5586b16
- Trigger Event: release
File details
Details for the file networksampler-0.9.5-py3-none-any.whl.
File metadata
- Download URL: networksampler-0.9.5-py3-none-any.whl
- Upload date:
- Size: 10.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fe3d6f730f8c4e0ff2273fb8e3afa9e9f68b62c037b48f62d538efdeab5357d5 |
| MD5 | 9ae178b466c2b11f278c659ee2ddb3a8 |
| BLAKE2b-256 | 206a76b7103c955d60bf3b8c5d9abc2bfc3b6109e749f3f34b3224f4425a02c1 |
Provenance
The following attestation bundles were made for networksampler-0.9.5-py3-none-any.whl:
Publisher: publish-pypi.yml on edgresearch/pylib_networksampler
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: networksampler-0.9.5-py3-none-any.whl
- Subject digest: fe3d6f730f8c4e0ff2273fb8e3afa9e9f68b62c037b48f62d538efdeab5357d5
- Sigstore transparency entry: 1058677293
- Sigstore integration time:
- Permalink: edgresearch/pylib_networksampler@f47f51df11d03a0efca1a1b236ef18eac5586b16
- Branch / Tag: refs/tags/v0.9.5
- Owner: https://github.com/edgresearch
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@f47f51df11d03a0efca1a1b236ef18eac5586b16
- Trigger Event: release