Sub-National Geospatial Data Archive: Geoprocessing Toolkit (Python)

pysungeo

Python 3.10+ | License: GPL-2.0

pysungeo is a Python port of the SUNGEO R package. It provides tools for integrating geospatial datasets that use different geographic boundary systems, addressing the change-of-support problem: transferring data between misaligned spatial units (e.g., from electoral precincts to administrative districts or hexagonal grids).

Key Features

  • Polygon-to-polygon transfer — Area-weighted interpolation with pycnophylactic (mass-preserving) correction
  • Point-to-polygon interpolation — Simple aggregation, Voronoi tessellation, and Ordinary/Universal Kriging
  • Line-to-polygon metrics — Road length, density, and distance calculations within polygons
  • Nesting diagnostics — 12 metrics measuring how well one boundary set nests within another
  • Raster conversion — Polygon/point to raster and back, with round-trip fidelity
  • Spatial statistics — Getis-Ord Gi* hot spot analysis
  • Geocoding — Address-to-coordinate conversion via OpenStreetMap Nominatim
  • SUNGEO API access — Download sub-national data for 180+ countries directly

Installation

pip install pysungeo

From source

git clone https://github.com/zhukov/pysungeo.git
cd pysungeo
pip install -e ".[dev]"

Quick Start

import geopandas as gpd
from sungeo.poly2poly_ap import poly2poly_ap
from sungeo.hot_spot import hot_spot

# Load source and destination boundary sets
precincts = gpd.read_file("precincts.gpkg")
districts = gpd.read_file("districts.gpkg")

# Transfer turnout data from precincts to districts (area-weighted)
result = poly2poly_ap(
    poly_from=precincts,
    poly_to=districts,
    poly_to_id="DISTRICT_ID",
    varz="turnout",
)

# Identify spatial clusters
hotspots = hot_spot(insert=result, variable="turnout_aw")
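
To make the hot spot step more concrete, the following is a minimal sketch of the standard Getis-Ord Gi* z-score with binary neighbor weights. It is not pysungeo's implementation: the function name `gi_star`, the toy values, and the neighbor lists are all assumptions made for illustration, and `hot_spot` may build its weights differently.

```python
import math

def gi_star(values, neighbors):
    """Getis-Ord Gi* z-scores with binary weights (each unit is its own neighbor)."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the variable
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        members = set(neighbors[i]) | {i}   # Gi* includes the focal unit
        w_sum = len(members)                # sum of binary weights
        local_sum = sum(values[j] for j in members)
        # With binary weights, the sum of squared weights equals w_sum
        denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append((local_sum - mean * w_sum) / denom)
    return scores

# Hypothetical toy data: six units along a line, high values at one end
turnout = [90.0, 85.0, 80.0, 40.0, 35.0, 30.0]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
z = gi_star(turnout, chain)
```

Positive scores mark units surrounded by high values and negative scores mark units surrounded by low values. Here the first unit scores positive and the last scores negative.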

Available Functions

Spatial Interpolation

Function          Description
poly2poly_ap      Area-weighted polygon-to-polygon transfer
point2poly_simp   Simple point-in-polygon aggregation
point2poly_tess   Voronoi tessellation interpolation
point2poly_krige  Ordinary and Universal Kriging
line2poly         Line length, density, and distance within polygons
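
As an illustration of what simple point-in-polygon aggregation involves, here is a dependency-free sketch that uses a ray-casting test and averages point values per polygon. The helper names `point_in_polygon` and `aggregate_points` and the toy geometries are hypothetical. This is not how `point2poly_simp` is implemented, and it ignores edge cases such as points lying exactly on a boundary.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; `ring` is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def aggregate_points(points, polygons):
    """Mean of point values per polygon; None where a polygon holds no points."""
    out = {}
    for poly_id, ring in polygons.items():
        vals = [v for (x, y, v) in points if point_in_polygon(x, y, ring)]
        out[poly_id] = sum(vals) / len(vals) if vals else None
    return out

# Hypothetical toy data: three square polygons and three valued points
polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "C": [(4, 0), (6, 0), (6, 2), (4, 2)],
}
points = [(0.5, 0.5, 10.0), (1.5, 1.0, 20.0), (3.0, 1.0, 7.0)]
means = aggregate_points(points, polygons)
```

Polygon "C" receives no points and is left as None. Empty polygons of this kind are one reason to consider the tessellation or kriging methods instead.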

Spatial Analysis

Function   Description
nesting    12 metrics for boundary set compatibility
hot_spot   Getis-Ord Gi* local spatial clustering
sf2raster  Polygon/point ↔ raster conversion

Utilities

Function     Description
utm_select   Auto-select optimal projected CRS
fix_geom     Repair invalid geometries
df2sf        DataFrame with coordinates → GeoDataFrame
update_bbox  Refresh GeoDataFrame bounding box
smart_round  Round with significant digit preservation
make_ticker  Date-to-ID mapping table
merge_list   Recursive outer-join merge
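
For readers unfamiliar with projected CRS selection, the sketch below shows the standard rule for mapping a longitude/latitude pair to a WGS 84 / UTM zone EPSG code. The function name `utm_epsg` is hypothetical, and `utm_select` may use different logic (for example, working from a whole layer rather than a single coordinate). The sketch also ignores the special zones around Norway and Svalbard.

```python
import math

def utm_epsg(lon, lat):
    """EPSG code of the WGS 84 / UTM zone containing (lon, lat)."""
    # Zones are 6 degrees wide, numbered 1..60 eastward from 180 degrees west
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    base = 32600 if lat >= 0 else 32700   # northern vs. southern hemisphere
    return base + zone

berlin = utm_epsg(13.4, 52.5)     # zone 33 north
sydney = utm_epsg(151.2, -33.9)   # zone 56 south
```

Area and distance calculations are only meaningful in a projected CRS, which is why a step like this usually precedes any area-weighted transfer.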

Data Access

Function           Description
get_data           Download from the SUNGEO API
get_info           Browse the SUNGEO data catalog
geocode_osm        Geocode addresses via Nominatim
geocode_osm_batch  Batch geocoding with rate limiting
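
Rate limiting matters because the public Nominatim service asks clients to send at most one request per second. The following sketch shows one way such throttling can work. The names `geocode_batch` and `geocode_one` and the `clock`/`sleep` parameters are hypothetical and are not the signature of `geocode_osm_batch`. The stand-in geocoder and fake clock exist only so the example runs offline.

```python
import time

def geocode_batch(addresses, geocode_one, min_interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call `geocode_one` per address, spacing calls at least `min_interval` seconds apart."""
    results = []
    last_call = None
    for address in addresses:
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(geocode_one(address))
    return results

# Stand-in geocoder and fake clock so the example runs offline and instantly
fake_now = [0.0]
pauses = []

def fake_sleep(seconds):
    pauses.append(seconds)
    fake_now[0] += seconds

def fake_geocoder(address):
    return {"address": address, "lat": None, "lon": None}

coded = geocode_batch(["Berlin", "Munich", "Hamburg"], fake_geocoder,
                      clock=lambda: fake_now[0], sleep=fake_sleep)
```

With the default arguments the function uses the real clock and `time.sleep`, so a batch of N addresses takes at least N - 1 seconds.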

Examples

Transfer data between boundary sets

from sungeo.poly2poly_ap import poly2poly_ap

# Area-weighted transfer of election turnout from precincts to hex grid
result = poly2poly_ap(
    poly_from=precincts,
    poly_to=hex_grid,
    poly_to_id="HEX_ID",
    varz="turnout",
)
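
The idea behind area weighting can be sketched without any geospatial libraries. The example below uses axis-aligned boxes in place of real polygons and splits each source count across destinations in proportion to overlapping area. The helper names and the toy layout are hypothetical. It handles only count-type (extensive) variables and is not the algorithm inside `poly2poly_ap`.

```python
def overlap_area(a, b):
    """Area of intersection of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def box_area(a):
    """Area of one axis-aligned box."""
    return (a[2] - a[0]) * (a[3] - a[1])

def area_weighted_sum(sources, targets):
    """Split each source count across targets in proportion to overlapping area."""
    out = {tid: 0.0 for tid in targets}
    for box, count in sources:
        for tid, tbox in targets.items():
            out[tid] += count * overlap_area(box, tbox) / box_area(box)
    return out

# Hypothetical layout: three precincts in a row, two districts splitting the middle one
precinct_votes = [((0, 0, 2, 1), 100.0), ((2, 0, 4, 1), 60.0), ((4, 0, 6, 1), 40.0)]
district_boxes = {"west": (0, 0, 3, 1), "east": (3, 0, 6, 1)}
votes = area_weighted_sum(precinct_votes, district_boxes)
```

Because the districts fully cover the precincts, the destination totals add up to the source total of 200. This is the mass-preserving property for count variables.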

Kriging interpolation from weather stations to districts

from sungeo.point2poly_krige import point2poly_krige

result = point2poly_krige(
    pointz=weather_stations,
    polyz=districts,
    yvarz="temperature",
)
# Result includes temperature.pred, temperature.var, temperature.stdev

Check boundary compatibility

from sungeo.nesting import nesting

metrics = nesting(
    poly_from=precincts,
    poly_to=districts,
    metrix="all",
)
# metrics["rn"] close to 1.0 = good nesting
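
To give a feel for why values near 1.0 indicate good nesting, here is a sketch of one formulation of relative nesting: the average, over source units, of the sum of squared shares of each source unit's area falling in each destination unit. Treating this as the definition of `rn` is an assumption; the package may compute it differently. The helper names and the box geometries are hypothetical.

```python
def overlap_area(a, b):
    """Area of intersection of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def relative_nesting(sources, targets):
    """Mean over source boxes of the sum of squared area shares across target boxes."""
    total = 0.0
    for s in sources:
        s_area = (s[2] - s[0]) * (s[3] - s[1])
        total += sum((overlap_area(s, t) / s_area) ** 2 for t in targets)
    return total / len(sources)

# Two small units inside one large unit: perfectly nested
nested = relative_nesting([(0, 0, 1, 1), (1, 0, 2, 1)], [(0, 0, 2, 1)])
# One large unit split evenly between two small units: poorly nested
split = relative_nesting([(0, 0, 2, 1)], [(0, 0, 1, 1), (1, 0, 2, 1)])
```

Under this formulation, `nested` is 1.0 and `split` is 0.5. The more destination units a source unit is spread across, the lower the score.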

Download SUNGEO data

from sungeo.get_data import get_data

df = get_data(
    country_names="Germany",
    topics="Demographics:Population:GHS",
    year_min=2000,
    year_max=2020,
)

Requirements

  • Python ≥ 3.10
  • geopandas ≥ 0.14
  • shapely ≥ 2.0
  • numpy ≥ 1.24
  • pandas ≥ 2.0
  • scipy ≥ 1.10
  • rasterio ≥ 1.3
  • rasterstats ≥ 0.19
  • pyproj ≥ 3.5
  • pykrige ≥ 1.7
  • esda ≥ 2.5
  • libpysal ≥ 4.9
  • requests ≥ 2.31

Testing

pip install -e ".[dev]"
pytest

The test suite has 533 passing tests covering all 20 functions, with results cross-validated against the R package.

Citation

If you use this package in published research, please cite:

Zhukov, Yuri M., Jason S. Byers, Marty Davidson, and Ye Chan Kim. 2025. "pysungeo: Sub-National Geospatial Data Archive — Geoprocessing Toolkit for Python." https://github.com/zhukov/pysungeo

And the original R package:

Zhukov, Yuri M., Jason S. Byers, and Marty Davidson. 2024. "SUNGEO: Sub-National Geospatial Data Archive: Geoprocessing Toolkit." R package. https://github.com/zhukov/SUNGEO

License

GPL-2.0. See LICENSE for details.
