
Python package for Graphical Sampling Method

Project description

graphical-sampling

graphical-sampling is a Python package for finite-population sampling, with a particular focus on graphical sampling designs, unequal inclusion probabilities, and spatially well-spread samples.

The package implements the Graphical Finite-Population Sampling (GFS) framework and its spatial extensions, including probability-balanced n-means clustering, nested spatial ordering, and intelligent search procedures for improving spatial spread while preserving prescribed first-order inclusion probabilities.

The package is designed for researchers and practitioners working in survey sampling, spatial statistics, environmental monitoring, ecological sampling, agricultural surveys, and related fields.


Main Features

  • Construct fixed-size sampling designs with prescribed first-order inclusion probabilities.

  • Represent sampling designs through the graphical/bar construction of GFS.

  • Draw samples from the resulting design.

  • Compute design properties such as:

    • first-order inclusion probabilities,
    • second-order inclusion probabilities,
    • entropy and relative entropy,
    • exact Narain–Horvitz–Thompson variance when the response variable is supplied.
  • Build probability-balanced spatial clusters using FIP-balanced n-means.

  • Create nested cluster-zone structures for spatial sampling.

  • Evaluate spatial spread using indices such as:

    • Moran-type spatial balance,
    • Voronoi-based spread,
    • Density Disparity Index,
    • local balance measures.
  • Improve sampling designs using intelligent search procedures such as Greedy Best-First Search.
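For a design small enough to enumerate, the design properties listed above have direct definitions. The following sketch uses only the standard library and a hypothetical four-unit design with made-up probabilities and response values. It does not call graphical-sampling, and the variable names are illustrative, not the package's API. Relative entropy is left out because its normalization is not specified here.

```python
from itertools import combinations
from math import log

# Hypothetical fixed-size design (n = 2) on units 0..3: sample -> probability.
design = {
    (0, 1): 0.4,
    (2, 3): 0.4,
    (0, 3): 0.1,
    (1, 2): 0.1,
}
y = [2.0, 4.0, 6.0, 8.0]  # hypothetical response values
N = len(y)

# First-order inclusion probability: total probability of samples containing i.
pi = [sum(p for s, p in design.items() if i in s) for i in range(N)]

# Second-order inclusion probability: total probability of samples containing i and j.
pi2 = {
    (i, j): sum(p for s, p in design.items() if i in s and j in s)
    for i, j in combinations(range(N), 2)
}

# Shannon entropy of the design.
entropy = -sum(p * log(p) for p in design.values() if p > 0)


def ht_total(sample):
    """Narain-Horvitz-Thompson estimate of the population total."""
    return sum(y[i] / pi[i] for i in sample)


# Exact variance: expected squared deviation of the estimator over all samples.
total = sum(y)
ht_variance = sum(p * (ht_total(s) - total) ** 2 for s, p in design.items())

print("pi:", pi)
print("pi2:", pi2)
print("entropy:", entropy)
print("NHT variance:", ht_variance)
```

Enumeration like this is only feasible for toy populations, but it is a convenient reference when checking values reported for a small design.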


Installation

Install the package from PyPI:

pip install graphical-sampling

or install the development version from GitHub:

pip install git+https://github.com/mehdimhb/graphical-sampling.git

Then import the package in Python:

import graphical_sampling

Depending on the installed version, the main classes can also be imported directly from their submodules, as the examples in the following sections do.


Basic Example

The following example constructs a finite population with spatial coordinates, unequal inclusion probabilities, and a response variable. It then builds a graphical sampling design and draws samples from it.

import numpy as np

from graphical_sampling.population import Population
from graphical_sampling.design import Design

# Reproducibility
rng = np.random.default_rng(123)

# Population size and sample size
N = 200
n = 20

# Spatial coordinates
coords = rng.random((N, 2))

# Unequal size measure, normalized internally to sum to n
weights = 0.5 + rng.random(N)

# Example response variable
y = coords[:, 0] + coords[:, 1] + rng.normal(scale=0.1, size=N)

# Create the finite population
pop = Population(
    coords=coords,
    inclusions=weights,
    variable=y,
    n=n
)

# Build a graphical sampling design
design = Design(population=pop)

# Draw five samples
samples = design.sample(num_samples=5)

print(samples)
print("Relative entropy:", design.relative_entropy)
print("NHT variance:", design.nht_variance)
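The comment "normalized internally to sum to n" in the example above can be read as proportional scaling of the size measures. The sketch below shows one common way to do this, with units whose scaled value would exceed 1 capped at 1 and the remaining mass redistributed. Whether the package caps, rejects, or otherwise handles such units is an assumption here, and `inclusion_probabilities` is a hypothetical helper, not a function of graphical-sampling.

```python
def inclusion_probabilities(sizes, n):
    """Scale positive size measures to probabilities summing to n, capped at 1."""
    if not 0 < n <= len(sizes):
        raise ValueError("n must be between 1 and the population size")
    if any(s <= 0 for s in sizes):
        raise ValueError("size measures must be positive")

    pi = [0.0] * len(sizes)
    free = set(range(len(sizes)))  # units not yet fixed at probability 1
    remaining = n                  # probability mass left for the free units

    while True:
        total = sum(sizes[i] for i in free)
        # Units whose proportional share would reach or exceed 1 are capped.
        over = {i for i in free if remaining * sizes[i] / total >= 1.0}
        if not over:
            break
        for i in over:
            pi[i] = 1.0
        free -= over
        remaining -= len(over)

    # The remaining units share the leftover mass proportionally to size.
    for i in free:
        pi[i] = remaining * sizes[i] / total
    return pi


plain = inclusion_probabilities([1.0, 2.0, 3.0, 4.0], n=2)   # no capping needed
capped = inclusion_probabilities([1.0, 1.0, 1.0, 7.0], n=2)  # last unit is capped
print(plain)
print(capped)
```

In the basic example the size measures lie between 0.5 and 1.5 with N = 200 and n = 20, so every scaled value is well below 1 and no capping would be triggered.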

Spatial Sampling with FIP-Balanced n-Means

The package also provides probability-balanced spatial clustering. This is useful when the aim is to form compact spatial clusters whose total inclusion probabilities are controlled exactly.

from graphical_sampling.population import Population
from graphical_sampling.design import Design
from graphical_sampling.order import Order
from graphical_sampling.clustering.fip_balanced_nmeans import FIPBalancedNMeans

# Fit FIP-balanced n-means clustering
# (reuses `pop` and `n` from the basic example above)
fbn = FIPBalancedNMeans(
    n=n,
    n_init=20,
    init_clust_method="expanded"
)

fbn.fit(population=pop)

# Optionally divide each cluster into internal zones
fbn.fit_zones(
    num_zones=(2, 2),
    mode="sweep_xy"
)

# Build a spatial order from the cluster-zone structure
order = Order.from_clusters(
    population=pop,
    clusters=fbn.clusters,
    zone_strategy="snake",
    point_strategy="snake"
)

# Construct the corresponding spatial graphical design
spatial_design = Design.from_order(pop, order)

print("Moran index:", spatial_design.moran)
print("Voronoi index:", spatial_design.voronoi)
print("Density disparity:", spatial_design.density_disparity)
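To make the Voronoi-based index concrete, the sketch below computes a spread measure in the style of Stevens and Olsen for a single sample: each population unit's inclusion probability is assigned to its nearest sampled unit, and the index is the mean squared deviation of those cell masses from 1. Values near 0 indicate a well-spread sample. Whether `spatial_design.voronoi` uses exactly this formula is an assumption, and `voronoi_spread` is a hypothetical helper written with the standard library only.

```python
from math import dist


def voronoi_spread(coords, pi, sample):
    """Mean squared deviation from 1 of the probability mass in each Voronoi cell."""
    mass = {i: 0.0 for i in sample}
    for k, point in enumerate(coords):
        # Assign unit k to its nearest sampled unit.
        # Ties go to the first sampled unit; a fuller version would split them.
        nearest = min(sample, key=lambda i: dist(point, coords[i]))
        mass[nearest] += pi[k]
    return sum((v - 1.0) ** 2 for v in mass.values()) / len(sample)


# Four units on a line, each with inclusion probability 0.5 (n = 2).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
pi = [0.5, 0.5, 0.5, 0.5]

spread_out = voronoi_spread(coords, pi, sample=[0, 3])  # sampled units far apart
clumped = voronoi_spread(coords, pi, sample=[0, 1])     # sampled units adjacent
print(spread_out, clumped)
```

The spread-out sample gives each sampled unit a cell mass of 1, while the clumped sample gives masses of 0.5 and 1.5, so the second index is larger.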

Intelligent Spatial Sampling

The package includes search tools for improving a sampling design while preserving design validity. These methods modify the graphical order or exchange probability mass in a controlled way, and therefore maintain the prescribed inclusion probabilities.
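The idea that reordering units need not change first-order inclusion probabilities can be illustrated with classical systematic unequal-probability sampling along an order. This is a swapped-in textbook technique used as a stand-in, not the package's GFS construction or its search procedure. The sketch lays units end to end as intervals of length π_i, selects the units hit by the points u, u + 1, u + 2, and so on, and checks over a fine grid of starting points u that two different orders give the same inclusion probabilities. All names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def systematic_sample(order, pi, u):
    """Select the units whose interval along `order` contains a point u + k."""
    sample = []
    lo = Fraction(0)
    for i in order:
        hi = lo + pi[i]
        # The number of integers k with lo <= u + k < hi is ceil(hi - u) - ceil(lo - u).
        if ceil(hi - u) > ceil(lo - u):
            sample.append(i)
        lo = hi
    return sample


def inclusion_frequencies(order, pi, grid=1000):
    """Share of equally spaced starting points u in (0, 1) that select each unit."""
    counts = [0] * len(pi)
    for j in range(grid):
        u = Fraction(2 * j + 1, 2 * grid)  # midpoints avoid interval boundaries
        for i in systematic_sample(order, pi, u):
            counts[i] += 1
    return [Fraction(c, grid) for c in counts]


# Hypothetical inclusion probabilities summing to n = 2.
pi = [Fraction(k, 10) for k in (2, 4, 6, 8)]

freq_a = inclusion_frequencies([0, 1, 2, 3], pi)
freq_b = inclusion_frequencies([3, 1, 0, 2], pi)
print(freq_a == pi, freq_b == pi)
```

The order changes which units tend to appear together, and therefore the second-order probabilities and the spatial spread, which is what an order-based search can exploit.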

A typical workflow is:

  1. Create a Population.
  2. Build an initial design using GFS or FIP-balanced n-means clustering.
  3. Choose a criterion, such as a spatial spread index or a weighted combination of indices.
  4. Run an intelligent search algorithm to improve the design.
  5. Use the optimized design for sampling and design-based inference.
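Steps 3 and 4 of the workflow above can be sketched with a generic Greedy Best-First Search over orderings. The search always expands the candidate with the lowest criterion value and keeps the best order seen within a fixed budget. The criterion used here, the total distance between consecutive units, is a hypothetical stand-in for a spatial spread index, and none of the function names belong to graphical-sampling; the package's own search operates on designs and its own indices.

```python
import heapq
from math import dist


def path_length(order, coords):
    """Hypothetical stand-in criterion: total distance between consecutive units."""
    return sum(dist(coords[a], coords[b]) for a, b in zip(order, order[1:]))


def neighbours(order):
    """All orders reachable by swapping two adjacent positions."""
    for k in range(len(order) - 1):
        swapped = list(order)
        swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
        yield tuple(swapped)


def greedy_best_first(start, criterion, max_expansions=200):
    """Expand the lowest-criterion order first; return the best order found."""
    start = tuple(start)
    best, best_score = start, criterion(start)
    frontier = [(best_score, start)]  # min-heap keyed on the criterion
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            break
        _, state = heapq.heappop(frontier)
        for candidate in neighbours(state):
            if candidate in seen:
                continue
            seen.add(candidate)
            score = criterion(candidate)
            heapq.heappush(frontier, (score, candidate))
            if score < best_score:
                best, best_score = candidate, score
    return best, best_score


# Four units on a line; the starting order zigzags between them.
coords = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
start = (0, 1, 2, 3)
start_score = path_length(start, coords)
best, best_score = greedy_best_first(start, lambda o: path_length(o, coords))
print(start_score, best, best_score)
```

With four units the search space has only 24 orders, so the budget covers all of them. On realistic populations the budget and the choice of neighbourhood determine how much of the space is explored.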

Citation

If you use graphical-sampling, please cite the software package. If you use the spatial clustering or intelligent spatial sampling methods, please also cite the corresponding methodological paper.

Software citation

@software{graphical_sampling_2025,
  author = {Panahbehagh, Bardia and Mohebbi, Mehdi and HosseiniNasab, Amir Mohammad and Hosseini Moghadam, Mehdi},
  title = {graphical-sampling: A Python package for graphical finite-population and spatial sampling},
  year = {2025},
  url = {https://github.com/mehdimhb/graphical-sampling},
  note = {Python package}
}

Methodological papers

For the graphical finite-population sampling framework, cite:

@article{panahbehagh2026geometric,
  author = {Panahbehagh, Bardia},
  title = {Graphical Finite-Population Sampling},
  year = {2026},
  note = {Manuscript}
}

For the spatial sampling design, cite:

@article{panahbehagh2026intelligent,
  author = {Panahbehagh, Bardia and Mohebbi, Mehdi},
  title = {Intelligent n-Means Spatial Sampling},
  year = {2026},
  note = {Manuscript}
}

For the spatial spread measure, cite:

@article{panahbehagh2026spread,
  author = {Panahbehagh, Bardia and Mohebbi, Mehdi and HosseiniNasab, Amir Mohammad},
  title = {Measuring Spatial Spread via n-Means Balanced Clustering},
  year = {2026},
  note = {Manuscript}
}

Please replace the manuscript entries with the final journal citations once the papers are published.


Maintainers

  • Bardia Panahbehagh
  • Mehdi Mohebbi
  • Amir Mohammad HosseiniNasab
  • Mehdi Hosseini Moghadam

License

Check the license information in the repository before redistributing the package.

Download files


Source Distribution

graphical_sampling-1.0.0.tar.gz (265.8 kB)

Uploaded Source

Built Distribution


graphical_sampling-1.0.0-py3-none-any.whl (82.3 kB)

Uploaded Python 3

File details

Details for the file graphical_sampling-1.0.0.tar.gz.

File metadata

  • Download URL: graphical_sampling-1.0.0.tar.gz
  • Upload date:
  • Size: 265.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for graphical_sampling-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       6b0818f3cd427a0da2d0e7f9247746d6c375751e86854e5c5bc6e04f53f65b55
MD5          180fe16ca0a483bda73ee630bc512000
BLAKE2b-256  110c7d845d739a1d16f27d98e3c0ea2075e7d291a9e33d11c1c9e6640b19c3a1


File details

Details for the file graphical_sampling-1.0.0-py3-none-any.whl.

File hashes

Hashes for graphical_sampling-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       594e4ae5c8be38e44f134fb744d9c754d85005ba54d079d61090e9d886130fdc
MD5          15b744be48e3a285eb2e757b6457e689
BLAKE2b-256  7c30acd3488becf0d622dc599ab17d62b3051b52c06721ee07ed9f604548a632

