Preparing data for regression discontinuity design
geoRDDprep
geoRDDprep is a Python package designed to streamline the data preparation process for Geographical Regression Discontinuity Design (GeoRDD). It provides efficient tools for spatial joins, polygon-to-line conversions, and implementing the Turner et al. (2014) algorithm for assigning points to boundaries.
Features
- points_in_polygon: Efficiently assign points to polygons (e.g., addresses to school districts).
- turner: Assign points to LineStrings based on orthogonal distance criteria (Turner et al., 2014).
- poly_to_line: Convert Polygon geometries to LineStrings for boundary analysis.
- drop_tiny_lines: Filter out small, noisy line segments to improve analysis quality.
- remove_sliver: Clean up sliver polygons using Voronoi diagrams.
- remove_overlaps: Remove overlapping segments between line datasets.
Installation
You can install the package directly from the source:
pip install .
Or, if you are developing:
pip install -e .
Usage
1. Assign Points to Polygons
Map addresses or other points to their respective administrative regions.
import geopandas as gpd
from geoRDDprep import points_in_polygon
# Load your data
points = gpd.read_file("addresses.geojson")
districts = gpd.read_file("districts.geojson")
# Assign points to districts
# The resulting GeoDataFrame will have columns from 'districts' suffixed with '_district'
result = points_in_polygon(points, districts, suffix_name="_district")
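For readers who want to see what a point-to-polygon assignment involves geometrically, here is a minimal standard-library sketch. It is not the package's implementation: the helper names `point_in_ring` and `assign_points` are hypothetical, polygons are simplified to single rings without holes, and a ray-casting test stands in for whatever spatial join geoRDDprep performs internally.

```python
from typing import Optional, Sequence

Point = tuple[float, float]
Ring = Sequence[Point]


def point_in_ring(point: Point, ring: Ring) -> bool:
    """Ray-casting test: count crossings of a rightward ray with the ring edges."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can cross the ray.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_points(points: Sequence[Point], districts: dict[str, Ring]) -> list[Optional[str]]:
    """For each point, return the name of the first district containing it, else None."""
    labels = []
    for p in points:
        match = next(
            (name for name, ring in districts.items() if point_in_ring(p, ring)),
            None,
        )
        labels.append(match)
    return labels


# Hypothetical toy data: two adjacent rectangular districts and three points.
districts = {
    "west": [(0, 0), (5, 0), (5, 10), (0, 10)],
    "east": [(5, 0), (10, 0), (10, 10), (5, 10)],
}
points = [(2.0, 3.0), (7.5, 8.0), (12.0, 1.0)]
labels = assign_points(points, districts)
```

In this toy data the third point lies outside both districts and is left unassigned. How points_in_polygon itself treats unmatched points is not stated here, so check the result for missing values before using it.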
2. Prepare Boundaries (Polygons to Lines)
Convert polygon boundaries into lines for distance analysis.
from geoRDDprep import poly_to_line, drop_tiny_lines
# Convert polygons to lines
lines = poly_to_line(districts)
# Clean up noise by dropping very short lines (e.g., < 500 meters)
clean_lines = drop_tiny_lines(lines, method='length', meters=500)
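The idea behind this step can be illustrated without any GIS dependencies. The sketch below is an assumption-laden illustration, not the package's code: `ring_to_segments` and `drop_short_segments` are hypothetical helpers, coordinates are assumed to already be in meters (a projected coordinate system), and each polygon edge is treated as its own line, which may differ from how poly_to_line splits boundaries.

```python
import math


def ring_to_segments(ring):
    """Split a closed polygon ring into its consecutive edge segments."""
    n = len(ring)
    return [(ring[i], ring[(i + 1) % n]) for i in range(n)]


def segment_length(segment):
    """Euclidean length of a two-point segment."""
    (x1, y1), (x2, y2) = segment
    return math.hypot(x2 - x1, y2 - y1)


def drop_short_segments(segments, min_length):
    """Keep only segments at least min_length long (same units as the coordinates)."""
    return [s for s in segments if segment_length(s) >= min_length]


# Hypothetical district outline in meters; the clipped corner yields a ~141 m edge.
district = [(0, 0), (2000, 0), (2000, 1500), (100, 1500), (0, 1400)]
segments = ring_to_segments(district)
kept = drop_short_segments(segments, min_length=500)
```

A length threshold in meters is only meaningful when the geometries use a projected coordinate system. Whether drop_tiny_lines reprojects its input is not stated here, so it is safest to project the data before calling it.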
3. Turner Algorithm
Assign points to boundaries based on distance and orthogonality.
from geoRDDprep import turner
# Match points to the nearest boundary within 15 meters
# 'turner_pass' column will be True if the point satisfies the criteria
matched_data = turner(points, clean_lines, orth_distance=15)
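To make "distance and orthogonality" concrete, the following sketch shows one plausible reading of such a criterion for a single straight segment: a point passes when its perpendicular projection lands strictly inside the segment and the perpendicular distance is within the threshold. This is an assumption for illustration only. The function `orthogonal_match` is hypothetical, and the exact rules in Turner et al. (2014) and in the package's turner function may differ.

```python
import math


def orthogonal_match(point, segment, max_distance):
    """Return (passes, distance) for a point against one straight segment.

    passes is True when the perpendicular foot falls strictly between the
    segment's endpoints and the distance is at most max_distance.
    distance is measured to the nearest point on the segment.
    """
    px, py = point
    (ax, ay), (bx, by) = segment
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: no direction, so no orthogonal projection exists.
        return False, math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along the segment (0 = start, 1 = end).
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t_clamped = min(1.0, max(0.0, t))
    nearest_x, nearest_y = ax + t_clamped * dx, ay + t_clamped * dy
    distance = math.hypot(px - nearest_x, py - nearest_y)
    passes = 0.0 < t < 1.0 and distance <= max_distance
    return passes, distance


# Hypothetical boundary segment, 100 m long, with coordinates in meters.
boundary = ((0.0, 0.0), (100.0, 0.0))
near_pass, near_dist = orthogonal_match((50.0, 10.0), boundary, max_distance=15)
far_pass, far_dist = orthogonal_match((50.0, 40.0), boundary, max_distance=15)
end_pass, end_dist = orthogonal_match((-20.0, 5.0), boundary, max_distance=15)
```

The first point passes, the second is too far from the boundary, and the third fails because its projection falls beyond the end of the segment, so it cannot be matched orthogonally at any distance.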
Requirements
- geopandas
- shapely
- numpy
- pandas
- scipy
License
Download files
File details
Details for the file georddprep-0.1.1.tar.gz.
File metadata
- Download URL: georddprep-0.1.1.tar.gz
- Upload date:
- Size: 8.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dfd4f1c1c6cde1065f6aab188f1488da4e17f5c6982059c9e179c3666b5b02ef |
| MD5 | 38787b6d2924fc7cd36b9bd40c5035bd |
| BLAKE2b-256 | 6cde7dafd1d6cf78e50bf1fbad07243483e27f39ed2f9faa8b49249c6e03a5fc |
File details
Details for the file georddprep-0.1.1-py3-none-any.whl.
File metadata
- Download URL: georddprep-0.1.1-py3-none-any.whl
- Upload date:
- Size: 7.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 77157ca5c2287a6aa0004db19654dbfa33a90c80fbf20e65da66519175e25610 |
| MD5 | 4312af2cef6126e249fd958f8597a6db |
| BLAKE2b-256 | d1c3f2ad1e6209dac203df18b5fb9cbd7b061e1ca900fb74d1b5565a8b1f1fcb |