Spatial Point Pattern Test (SPPT) for Aggregated Data — bootstrap-based spatial distribution comparison
sppt — Spatial Point Pattern Test for Aggregated Data
A Python implementation of the Spatial Point Pattern Test (SPPT) for aggregated count data. It uses bootstrap resampling to compare spatial distributions between variables and computes S-Index metrics that quantify how much the spatial patterns overlap.
Based on the original R package sppt.aggregated.data by Martin A. Andresen. This Python port faithfully reimplements the statistical methods, algorithms, and outputs of the R version.
Features
- Bootstrap resampling with sparse-matrix acceleration (scipy.sparse + numpy)
- S-Index & Robust S-Index for quantifying spatial pattern overlap
- Bivariate comparison (base vs. test variable) with directional change detection
- Percentage or count mode: compare spatial distributions or absolute values
- Fixed base option: bootstrap only the test variable when the base is known
- Automatic choropleth maps via matplotlib + geopandas
- Multiple export formats: Shapefile, GeoPackage, CSV, TXT, Pickle
- Google Colab compatible: works out of the box in cloud notebooks
Installation
pip install sppt
For development:
git clone https://github.com/yunusserhat/sppt.git
cd sppt
pip install -e ".[dev]"
Quick Start
import geopandas as gpd
from sppt import sppt
# Load spatial data
data = gpd.read_file("your_data.shp")
# Compare two variables across spatial units
result = sppt(
    data=data,
    group_col="DAUID",                       # spatial unit identifier
    count_col=["Crime_2020", "Crime_2021"],  # [base, test]
    B=200,                                   # bootstrap samples
    check_overlap=True,                      # compute S-Index
    seed=42,                                 # reproducibility
)
# Access results
print(result.s_index) # e.g. 0.7380
print(result.robust_s_index) # e.g. 0.7289
print(result.data.head()) # DataFrame with CI bounds + overlap columns
How It Works
Algorithm
- Expand aggregated counts to individual events (uncount)
- Build a sparse one-hot matrix (n × G) for group membership
- Draw B multinomial bootstrap samples
- Aggregate via matrix multiply: group_counts = onehot.T @ W
- Convert to percentages (optional) and extract quantile-based confidence intervals
- Compare intervals between variables to detect significant spatial changes
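The steps above can be sketched in a few lines of NumPy and SciPy. This is an illustrative sketch, not the package's actual code: the function name `bootstrap_group_intervals` and its arguments are hypothetical, and it assumes each bootstrap sample redraws all n events with equal probability. Only `onehot`, `W`, and `group_counts` mirror names used in the list.

```python
import numpy as np
from scipy import sparse


def bootstrap_group_intervals(counts, B=200, conf_level=0.95, seed=None):
    """Percentile bootstrap bounds for each unit's share of events (in %)."""
    counts = np.asarray(counts, dtype=np.int64)
    G = counts.size
    n = int(counts.sum())
    rng = np.random.default_rng(seed)

    # Step 1: expand counts to one entry per event, labelled by group index.
    event_group = np.repeat(np.arange(G), counts)

    # Step 2: sparse one-hot membership matrix of shape (n, G).
    onehot = sparse.csr_matrix(
        (np.ones(n), (np.arange(n), event_group)), shape=(n, G)
    )

    # Step 3: column b of W says how often each event is drawn in sample b
    # (assumed here: multinomial with equal probabilities, n draws per sample).
    W = rng.multinomial(n, np.full(n, 1.0 / n), size=B).T  # shape (n, B)

    # Step 4: aggregate resampled events per group, shape (G, B).
    group_counts = np.asarray(onehot.T @ W)

    # Step 5: convert to percentages and take quantile-based bounds.
    pct = 100.0 * group_counts / n
    alpha = 1.0 - conf_level
    lower = np.quantile(pct, alpha / 2, axis=1)
    upper = np.quantile(pct, 1.0 - alpha / 2, axis=1)
    return lower, upper


# Three spatial units with 5, 0 and 15 events.
lower, upper = bootstrap_group_intervals([5, 0, 15], B=200, seed=42)
print(lower, upper)
```

The matrix product replaces a per-sample group-by: every column of `W` is one bootstrap sample, so a single multiplication yields all B sets of group counts at once.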
S-Index Interpretation
| S-Index | Meaning |
|---|---|
| 1.0 | Perfect overlap — no spatial pattern change |
| 0.5 | Half the areas show significant change |
| 0.0 | Complete spatial difference |
The Robust S-Index excludes spatial units where all variables are zero.
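Read together with the table, the S-Index is the share of spatial units whose intervals overlap, and the Robust S-Index is the same share taken over units that are not zero in every variable. A minimal sketch, assuming per-unit 0/1 overlap flags are already available; `s_indices` and its inputs are hypothetical names, not part of the package API.

```python
def s_indices(overlap_flags, all_zero_flags):
    """Return (S-Index, Robust S-Index) from per-unit flags."""
    # S-Index: fraction of all units whose intervals overlap.
    s_index = sum(overlap_flags) / len(overlap_flags)

    # Robust S-Index: drop units where every variable is zero, then
    # take the same fraction over the remaining units.
    kept = [o for o, z in zip(overlap_flags, all_zero_flags) if not z]
    robust_s_index = sum(kept) / len(kept)
    return s_index, robust_s_index


overlap = [1, 1, 0, 1, 0, 1]
all_zero = [False, True, False, False, False, False]
s, robust = s_indices(overlap, all_zero)
print(round(s, 4), round(robust, 4))  # 0.6667 0.6
```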
SIndex_Bivariate (per spatial unit)
| Value | Meaning |
|---|---|
| -1 | Base > Test (decline) |
| 0 | No significant difference |
| +1 | Test > Base (increase) |
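The coding in this table can be illustrated with a small helper that compares two confidence intervals. This is a sketch under assumptions: `classify_unit` is a hypothetical function, and it treats intervals that merely touch as overlapping, which the text does not confirm.

```python
def classify_unit(base_ci, test_ci):
    """Return -1, 0 or 1 for one spatial unit given (lower, upper) bounds."""
    base_lower, base_upper = base_ci
    test_lower, test_upper = test_ci
    if base_lower > test_upper:
        return -1  # base interval lies entirely above the test interval
    if test_lower > base_upper:
        return 1   # test interval lies entirely above the base interval
    return 0       # intervals overlap: no significant difference


print(classify_unit((2.0, 3.0), (0.5, 1.5)))  # -1
print(classify_unit((0.5, 1.5), (1.0, 2.0)))  # 0
print(classify_unit((0.1, 0.4), (0.6, 0.9)))  # 1
```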
Parameters
| Parameter | Default | Description |
|---|---|---|
| `data` | (required) | GeoDataFrame or DataFrame with count data |
| `group_col` | `"group"` | Column identifying spatial units |
| `count_col` | (required) | Column name(s) with counts. Pass `["base", "test"]` for bivariate |
| `B` | `200` | Number of bootstrap samples |
| `seed` | `None` | Random seed for reproducibility |
| `conf_level` | `0.95` | Confidence level for intervals |
| `check_overlap` | `False` | Compute overlap + S-Index statistics |
| `fix_base` | `False` | Skip bootstrapping the base (first) variable |
| `use_percentages` | `True` | Compare spatial distributions (%) vs. raw counts |
| `create_maps` | `True` | Generate choropleth map for bivariate case |
| `export_maps` | `False` | Save map to disk |
| `export_dir` | `None` | Directory for map export |
| `map_dpi` | `300` | Resolution for exported maps |
| `export_results` | `False` | Save results to disk |
| `export_format` | `"shp"` | Format: `"shp"`, `"gpkg"`, `"csv"`, `"txt"`, `"pickle"` |
| `export_results_dir` | `None` | Directory for results export |
Examples
Example 1: Vancouver Crime Data
import geopandas as gpd
from sppt import sppt
data = gpd.read_file("Vancouver_DAs_Crime_2021.shp")
data = data.to_crs(epsg=26910)
result = sppt(
    data=data,
    group_col="DAUID",
    count_col=["TFV", "TOV"],  # Total Family Violence vs Total Other Violence
    B=200,
    check_overlap=True,
    create_maps=True,
    seed=171717,
)
Output:
========================================
Spatial Pattern Overlap Statistics
Using: Percentages (spatial distribution)
========================================
S-Index: 0.7380
Robust S-Index: 0.7289
----------------------------------------
Total observations: 1019
Observations with overlap: 752
Observations with non-zero counts: 985
========================================
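The two indices in this output can be reproduced from the three reported totals. The short check below assumes that every unit with all-zero counts was counted among the overlapping units; that assumption is not stated in the output, but it is consistent with the printed values.

```python
total_units = 1019
units_with_overlap = 752
units_nonzero = 985

# S-Index: overlapping units as a share of all units.
s_index = units_with_overlap / total_units

# Assumption: all-zero units overlap trivially, so they are removed from
# both the numerator and the denominator for the robust version.
all_zero_units = total_units - units_nonzero
robust_s_index = (units_with_overlap - all_zero_units) / units_nonzero

print(f"{s_index:.4f}")         # 0.7380
print(f"{robust_s_index:.4f}")  # 0.7289
```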
Example 2: Fixed Base Variable
result = sppt(
    data=data,
    group_col="DAUID",
    count_col=["Census_Official", "Survey_Estimate"],
    B=200,
    fix_base=True,  # don't bootstrap the census data
    check_overlap=True,
    seed=42,
)
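The text does not spell out how a fixed base is compared with the test variable. One plausible reading, stated here as an assumption rather than the package's documented behavior, is that the observed base value acts as a zero-width interval and is checked against the bootstrapped test interval. `classify_against_fixed_base` is a hypothetical helper used only for illustration.

```python
def classify_against_fixed_base(base_value, test_ci):
    """Compare an observed base value with a (lower, upper) test interval."""
    test_lower, test_upper = test_ci
    if base_value > test_upper:
        return -1  # base lies above the whole test interval
    if base_value < test_lower:
        return 1   # base lies below the whole test interval
    return 0       # base falls inside the test interval


print(classify_against_fixed_base(4.2, (2.0, 3.5)))  # -1
print(classify_against_fixed_base(2.8, (2.0, 3.5)))  # 0
```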
Example 3: Count Mode
result = sppt(
    data=data,
    group_col="DAUID",
    count_col=["Crime_2020", "Crime_2021"],
    B=200,
    use_percentages=False,  # compare absolute counts
    check_overlap=True,
    seed=42,
)
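The difference between the two modes shows up when overall volume changes but the spatial distribution does not. In the toy example below (made-up numbers, with a hypothetical helper `to_percentages`), every unit's count doubles, so the raw counts differ everywhere while the percentage shares are identical.

```python
def to_percentages(counts):
    """Convert per-unit counts to each unit's share of the total, in %."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


crime_2020 = [10, 30, 60]
crime_2021 = [20, 60, 120]  # every unit doubled

shares_2020 = to_percentages(crime_2020)
shares_2021 = to_percentages(crime_2021)
print(shares_2020)  # [10.0, 30.0, 60.0]
print(shares_2021)  # [10.0, 30.0, 60.0]
```

Percentage mode would see no change in this pattern, whereas count mode compares the absolute values and could flag every unit.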
Example 4: Export Results
result = sppt(
    data=data,
    group_col="DAUID",
    count_col=["TFV", "TOV"],
    B=500,
    check_overlap=True,
    export_results=True,
    export_format="gpkg",  # GeoPackage
    export_results_dir="output/",
    export_maps=True,
    export_dir="output/maps/",
    map_dpi=600,  # publication quality
    seed=171717,
)
Interactive Notebooks
| Notebook | Description |
|---|---|
| Quickstart | Basic usage with Vancouver crime data |
| Advanced Examples | All modes, export, publication maps |
Sample Data
The package includes the Vancouver Dissemination Areas Crime 2021 dataset (1,019 polygons) for testing:
from sppt import load_sample_data
data = load_sample_data()
print(data.columns)
# ['DAUID', 'DGUID', 'LANDAREA', 'PRUID', 'BNEC', 'BNER',
# 'MISCHIEF', 'TFV', 'THEFT', 'TOB', 'TOV', 'geometry']
Output Columns
After running sppt(), your data gains these columns:
| Column | Description |
|---|---|
| `{var}_L` | Lower bound of confidence interval |
| `{var}_U` | Upper bound of confidence interval |
| `intervals_overlap` | 1 if CIs overlap, 0 otherwise |
| `SIndex_Bivariate` | -1 (base > test), 0 (overlap), 1 (test > base) |
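These columns lend themselves to ordinary pandas filtering. The sketch below uses a hand-built stand-in for the result table with made-up values, and it assumes that `{var}` is replaced by the count column names (here `TFV` and `TOV`); with real output the same operations would apply to `result.data`.

```python
import pandas as pd

# Stand-in for the result table; values are invented for illustration.
result_data = pd.DataFrame({
    "DAUID": ["A", "B", "C", "D"],
    "TFV_L": [1.0, 3.0, 0.5, 0.0],
    "TFV_U": [2.0, 4.0, 1.0, 0.8],
    "TOV_L": [1.5, 1.0, 2.0, 0.2],
    "TOV_U": [2.5, 2.0, 3.0, 1.1],
    "intervals_overlap": [1, 0, 0, 1],
    "SIndex_Bivariate": [0, -1, 1, 0],
})

# Units whose intervals do not overlap, i.e. significant changes.
changed = result_data[result_data["intervals_overlap"] == 0]
print(changed[["DAUID", "SIndex_Bivariate"]])

# How many units fall into each direction category.
direction_counts = result_data["SIndex_Bivariate"].value_counts().to_dict()
print(direction_counts)
```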
Citation
If you use this package in your research, please cite both the Python package and the original R implementation:
@software{bicakci2026sppt,
author = {Bıçakçı, Yunus Serhat},
title = {sppt: Spatial Point Pattern Test for Aggregated Data (Python)},
year = {2026},
url = {https://github.com/yunusserhat/sppt},
note = {Python implementation based on the R package by Martin A. Andresen}
}
@software{andresen2025sppt,
author = {Andresen, Martin A.},
title = {sppt.aggregated.data: Spatial Point Pattern Test for Aggregated Data (R)},
year = {2025},
url = {https://github.com/martin-a-andresen/sppt.aggregated.data}
}
Acknowledgements
This package is a faithful Python reimplementation of the R package sppt.aggregated.data created by Martin A. Andresen. The statistical methodology, bootstrap algorithm, S-Index calculations, and output structure are directly based on his original work.
License