Sub-National Geospatial Data Archive: Geoprocessing Toolkit (Python)

pysungeo

Sub-National Geospatial Data Archive: Geoprocessing Toolkit for Python

Python 3.10+ | License: GPL-2.0

pysungeo is a Python port of the SUNGEO R package, providing tools for integrating geospatial datasets that use different geographic boundary systems. It solves the change-of-support problem — transferring data between misaligned spatial units (e.g., electoral precincts to administrative districts to hexagonal grids).

Key Features

  • Polygon-to-polygon transfer — Area-weighted interpolation with pycnophylactic (mass-preserving) correction
  • Point-to-polygon interpolation — Simple aggregation, Voronoi tessellation, and Ordinary/Universal Kriging
  • Line-to-polygon metrics — Road length, density, and distance calculations within polygons
  • Nesting diagnostics — 12 metrics measuring how well one boundary set nests within another
  • Raster conversion — Polygon/point to raster and back, with round-trip fidelity
  • Spatial statistics — Getis-Ord Gi* hot spot analysis
  • Geocoding — Address-to-coordinate conversion via OpenStreetMap Nominatim
  • SUNGEO API access — Download sub-national data for 180+ countries directly

Installation

pip install pysungeo

From source

git clone https://github.com/zhukov/pysungeo.git
cd pysungeo
pip install -e ".[dev]"

Quick Start

import geopandas as gpd
from sungeo.poly2poly_ap import poly2poly_ap
from sungeo.hot_spot import hot_spot

# Load source and destination boundary sets
precincts = gpd.read_file("precincts.gpkg")
districts = gpd.read_file("districts.gpkg")

# Transfer turnout data from precincts to districts (area-weighted)
result = poly2poly_ap(
    poly_from=precincts,
    poly_to=districts,
    poly_to_id="DISTRICT_ID",
    varz="turnout",
)

# Identify spatial clusters
hotspots = hot_spot(insert=result, variable="turnout_aw")
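The last step above relies on the Getis-Ord Gi* statistic. As a rough illustration of what that statistic measures, the sketch below computes Gi* z-scores in plain Python for a chain of seven units with binary contiguity weights. The function getis_ord_gi_star and the toy data are hypothetical and not part of pysungeo; the weights and options used by hot_spot may differ.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Gi* z-scores with binary weights (hypothetical helper, not pysungeo's API)."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation, as used in the standard Gi* formula
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        members = set(neighbors[i]) | {i}  # the "star": unit i counts as its own neighbor
        w_sum = len(members)               # sum of binary weights (equals the sum of squares)
        local_sum = sum(values[j] for j in members)
        denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append((local_sum - mean * w_sum) / denom)
    return scores

# Toy data: seven units in a row, with high values clustered at the right end
values = [1, 1, 1, 1, 9, 10, 9]
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(values)]
             for i in range(len(values))}

scores = getis_ord_gi_star(values, neighbors)
hottest = scores.index(max(scores))  # index of the unit with the largest z-score
```

Large positive scores mark units surrounded by high values (hot spots), and large negative scores mark cold spots. In this toy chain only unit 5 exceeds the conventional 1.96 threshold.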

Available Functions

Spatial Interpolation

| Function | Description |
|---|---|
| poly2poly_ap | Area-weighted polygon-to-polygon transfer |
| point2poly_simp | Simple point-in-polygon aggregation |
| point2poly_tess | Voronoi tessellation interpolation |
| point2poly_krige | Ordinary and Universal Kriging |
| line2poly | Line length, density, and distance within polygons |

Spatial Analysis

| Function | Description |
|---|---|
| nesting | 12 metrics for boundary set compatibility |
| hot_spot | Getis-Ord Gi* local spatial clustering |
| sf2raster | Polygon/point ↔ raster conversion |

Utilities

| Function | Description |
|---|---|
| utm_select | Auto-select optimal projected CRS |
| fix_geom | Repair invalid geometries |
| df2sf | DataFrame with coordinates → GeoDataFrame |
| update_bbox | Refresh GeoDataFrame bounding box |
| smart_round | Round with significant digit preservation |
| make_ticker | Date-to-ID mapping table |
| merge_list | Recursive outer-join merge |
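To give a sense of what automatic CRS selection involves, here is a minimal sketch of the standard rule for picking a WGS 84 / UTM zone from a longitude and latitude. The helper utm_epsg_for is hypothetical; utm_select may apply a different rule (for example, working from a layer's centroid or falling back to another projection for large extents), and the sketch ignores the special zones around Norway and Svalbard.

```python
import math

def utm_epsg_for(lon: float, lat: float) -> int:
    """EPSG code of the WGS 84 / UTM zone containing (lon, lat).

    Hypothetical helper for illustration; not pysungeo's utm_select.
    """
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1  # zones are 6 degrees wide
    zone = min(max(zone, 1), 60)                     # lon == 180 would otherwise give zone 61
    base = 32600 if lat >= 0 else 32700              # 326xx = northern, 327xx = southern hemisphere
    return base + zone

berlin = utm_epsg_for(13.4, 52.5)          # 32633 (UTM zone 33N)
buenos_aires = utm_epsg_for(-58.4, -34.6)  # 32721 (UTM zone 21S)
```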

Data Access

| Function | Description |
|---|---|
| get_data | Download from the SUNGEO API |
| get_info | Browse the SUNGEO data catalog |
| geocode_osm | Geocode addresses via Nominatim |
| geocode_osm_batch | Batch geocoding with rate limiting |
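Rate limiting matters for batch geocoding because the public Nominatim service asks clients to send no more than one request per second. The sketch below shows one simple way to space out calls. The helper rate_limited_map and its parameters are hypothetical, and this is not a description of how geocode_osm_batch is implemented.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

def rate_limited_map(
    items: Iterable[T],
    fetch: Callable[[T], R],
    min_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[R]:
    """Call fetch(item) for each item, at least min_interval seconds apart.

    Hypothetical helper; sleep and clock are injectable so the spacing
    logic can be checked without actually waiting.
    """
    results: List[R] = []
    last_call = None
    for item in items:
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()  # time at which this request starts
        results.append(fetch(item))
    return results

# Dry run with a stand-in for the geocoder and a frozen clock
recorded_sleeps = []
demo = rate_limited_map(
    ["berlin", "munich", "hamburg"],
    str.upper,                     # stand-in for a real geocoding call
    sleep=recorded_sleeps.append,  # record waits instead of sleeping
    clock=lambda: 0.0,
)
```

In real use, fetch would wrap the geocoding call and the default sleep and clock would be left in place.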

Examples

Transfer data between boundary sets

from sungeo.poly2poly_ap import poly2poly_ap

# Area-weighted transfer of election turnout from precincts to a hex grid
# (precincts and hex_grid are GeoDataFrames, e.g. loaded with gpd.read_file)
result = poly2poly_ap(
    poly_from=precincts,
    poly_to=hex_grid,
    poly_to_id="HEX_ID",
    varz="turnout",
)
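The idea behind area weighting can be shown without any GIS library. The sketch below redistributes vote counts from three rectangular "precincts" to two rectangular "districts" in proportion to overlapping area. Rectangles stand in for real polygons, and the helper names are hypothetical; this illustrates the general technique for a count (extensive) variable, not pysungeo's implementation.

```python
def rect_area(r):
    """Area of a rectangle given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])

def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)

def area_weighted_sum(sources, targets):
    """Split each source count across targets by the share of source area they cover."""
    totals = {name: 0.0 for name in targets}
    for rect, count in sources:
        for name, target_rect in targets.items():
            share = overlap_area(rect, target_rect) / rect_area(rect)
            totals[name] += count * share
    return totals

# Three precincts in a row, each 2 units wide, with their vote counts
precinct_votes = [
    ((0.0, 0.0, 2.0, 1.0), 100.0),
    ((2.0, 0.0, 4.0, 1.0), 60.0),   # straddles the district boundary at x = 3
    ((4.0, 0.0, 6.0, 1.0), 40.0),
]
district_rects = {"west": (0.0, 0.0, 3.0, 1.0), "east": (3.0, 0.0, 6.0, 1.0)}

votes = area_weighted_sum(precinct_votes, district_rects)
# votes == {"west": 130.0, "east": 70.0}
```

Because the districts fully cover the precincts, the district totals add up to the original 200 votes, which is the mass-preserving property mentioned under Key Features.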

Kriging interpolation from weather stations to districts

from sungeo.point2poly_krige import point2poly_krige

# weather_stations (points) and districts (polygons) are GeoDataFrames
result = point2poly_krige(
    pointz=weather_stations,
    polyz=districts,
    yvarz="temperature",
)
# Result includes temperature.pred, temperature.var, temperature.stdev
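For readers unfamiliar with where the prediction, variance, and standard deviation come from, the following sketch solves the ordinary kriging system with NumPy for a single target location. It assumes an exponential variogram with made-up parameters rather than a fitted one, and it predicts at one point instead of over a polygon. The function names are hypothetical and the code is not taken from pysungeo, which lists pykrige among its requirements.

```python
import numpy as np

def exp_variogram(h, sill=1.0, rng=10.0):
    """Exponential semivariogram with zero nugget (parameters assumed, not fitted)."""
    return sill * (1.0 - np.exp(-h / rng))

def ordinary_krige(coords, values, target, variogram=exp_variogram):
    """Ordinary kriging prediction and variance at one target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    # Pairwise distances between observation points
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Variogram matrix bordered by the constraint that weights sum to 1
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(dists)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(A, b)
    weights, mu = solution[:n], solution[n]  # mu is the Lagrange multiplier
    pred = float(weights @ values)
    var = float(weights @ b[:n] + mu)
    return pred, var, weights

# Four stations on the corners of a square, predicting at its center
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
temps = [10.0, 12.0, 14.0, 16.0]

pred, var, weights = ordinary_krige(stations, temps, (5.0, 5.0))
stdev = max(var, 0.0) ** 0.5  # guard against tiny negative values from rounding
```

By symmetry each station receives a weight of 0.25 here, so the prediction is the plain average of the four readings. Predicting exactly at a station returns that station's value with a variance of zero, since the nugget is zero.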

Check boundary compatibility

from sungeo.nesting import nesting

metrics = nesting(
    poly_from=precincts,
    poly_to=districts,
    metrix="all",
)
# metrics["rn"] close to 1.0 = good nesting
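As an illustration of why values near 1.0 signal good nesting, the sketch below computes a relative nesting score for rectangular units: for each source unit, sum the squared shares of its area that fall in each destination unit, then average over source units. This formula is an assumption about what "rn" measures and should be checked against the package documentation; the helper names are hypothetical and rectangles stand in for real polygons.

```python
def rect_area(r):
    """Area of a rectangle given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])

def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)

def relative_nesting(sources, targets):
    """Mean over source units of the sum of squared area shares (assumed formula)."""
    total = 0.0
    for src in sources:
        shares = [overlap_area(src, tgt) / rect_area(src) for tgt in targets]
        total += sum(share ** 2 for share in shares)
    return total / len(sources)

precinct_rects = [(0.0, 0.0, 2.0, 1.0), (2.0, 0.0, 4.0, 1.0), (4.0, 0.0, 6.0, 1.0)]
split_districts = [(0.0, 0.0, 3.0, 1.0), (3.0, 0.0, 6.0, 1.0)]  # cuts the middle precinct in half
single_district = [(0.0, 0.0, 6.0, 1.0)]                        # contains every precinct

rn_split = relative_nesting(precinct_rects, split_districts)   # (1 + 0.5 + 1) / 3
rn_nested = relative_nesting(precinct_rects, single_district)  # 1.0
```

A source unit that sits entirely inside one destination unit contributes 1, while a unit split evenly in two contributes only 0.5, so more boundary crossings pull the score down.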

Download SUNGEO data

from sungeo.get_data import get_data

df = get_data(
    country_names="Germany",
    topics="Demographics:Population:GHS",
    year_min=2000,
    year_max=2020,
)

Requirements

  • Python ≥ 3.10
  • geopandas ≥ 0.14
  • shapely ≥ 2.0
  • numpy ≥ 1.24
  • pandas ≥ 2.0
  • scipy ≥ 1.10
  • rasterio ≥ 1.3
  • rasterstats ≥ 0.19
  • pyproj ≥ 3.5
  • pykrige ≥ 1.7
  • esda ≥ 2.5
  • libpysal ≥ 4.9
  • requests ≥ 2.31

Testing

pip install -e ".[dev]"
pytest

The test suite has 533 passing tests covering all 20 functions, with results cross-validated against the R package.

Citation

If you use this package in published research, please cite:

Zhukov, Yuri M., Jason S. Byers, Marty Davidson, and Ye Chan Kim. 2025. "pysungeo: Sub-National Geospatial Data Archive — Geoprocessing Toolkit for Python." https://github.com/zhukov/pysungeo

And the original R package:

Zhukov, Yuri M., Jason S. Byers, and Marty Davidson. 2024. "SUNGEO: Sub-National Geospatial Data Archive: Geoprocessing Toolkit." R package. https://github.com/zhukov/SUNGEO

License

GPL-2.0. See LICENSE for details.
