Preparing data for regression discontinuity design

🌍 geoRDDprep

PyPI version | License: MIT | Python 3.6+

Streamline your Geographical Regression Discontinuity Design (GeoRDD) workflow.

geoRDDprep is a Python toolkit that takes the pain out of spatial data preparation. Whether you are an economist, political scientist, or data analyst, it helps you assign points to polygons and boundaries, clean up messy polygons, and apply distance-based matching rules to your data.

🚀 Why geoRDDprep?

  • ⚡️ Fast & Efficient: Optimized spatial joins and geometric operations using geopandas and shapely.
  • 📐 Turner Algorithm Ready: Out-of-the-box implementation of the orthogonal distance criterion from Turner et al. (2014).
  • 🧹 Data Cleaning: Automatically remove sliver polygons and tiny, noisy line segments that mess up your analysis.
  • 🛠️ Easy Integration: Works seamlessly with your existing pandas and geopandas workflows.

📦 Installation

Install directly from PyPI:

pip install geoRDDprep

🛠️ Usage Examples

1. Assign Addresses to Districts

Map each point to the administrative region that contains it, even when the point dataset is large.

import geopandas as gpd
from geoRDDprep import points_in_polygon

# Load your data
points = gpd.read_file("addresses.geojson")
districts = gpd.read_file("school_districts.geojson")

# Spatial join: tag each point with the district that contains it
result = points_in_polygon(points, districts, suffix_name="_district")

print(result.head())
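
For intuition about what a point-in-polygon assignment does, here is a minimal pure-Python sketch of the classic ray-casting test. It is not the implementation behind `points_in_polygon`; the helper names (`point_in_ring`, `assign_points`) and the toy coordinates are hypothetical, and real data would go through geopandas instead.

```python
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]


def point_in_ring(point: Point, ring: Ring) -> bool:
    """Ray-casting test: True if the point lies inside the closed ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_points(
    points: Dict[str, Point], districts: Dict[str, Ring]
) -> Dict[str, Optional[str]]:
    """Map each point id to the first district containing it, or None."""
    return {
        pid: next(
            (name for name, ring in districts.items() if point_in_ring(pt, ring)),
            None,
        )
        for pid, pt in points.items()
    }


# Hypothetical toy data: two adjacent rectangular districts
districts = {
    "west": [(0, 0), (5, 0), (5, 10), (0, 10)],
    "east": [(5, 0), (10, 0), (10, 10), (5, 10)],
}
points = {"a": (2, 3), "b": (7, 8), "c": (12, 1)}

assigned = assign_points(points, districts)
print(assigned)
```

Point `c` lies outside both districts, so it maps to `None`. Points that fall exactly on a shared edge are a corner case this sketch does not treat specially.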

2. The Turner Algorithm (2014)

Assign points to boundaries only if they lie within a strict orthogonal distance of the boundary, which is crucial for a valid RDD analysis.

from geoRDDprep import poly_to_line, drop_tiny_lines, turner

# Reuses `points` and `districts` loaded in example 1

# 1. Convert polygons to boundary lines
lines = poly_to_line(districts)

# 2. Clean up noise (drop lines shorter than 500 m)
clean_lines = drop_tiny_lines(lines, method='length', meters=500)

# 3. Match points to boundaries (orthogonal distance of at most 15 m)
matched_data = turner(points, clean_lines, orth_distance=15)

# Check which points passed the test
print(matched_data['turner_pass'].value_counts())
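
The geometry behind the steps above can be sketched in plain Python: drop short boundary segments, then keep a point only if a perpendicular dropped from it lands on a remaining segment within the threshold. This is an illustration under assumptions, not the package's `turner` function and not necessarily the exact rule in Turner et al. (2014). The helper names and the sample coordinates (taken to be in metres in a projected CRS) are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def segment_length(segment: Segment) -> float:
    """Euclidean length of a straight boundary segment."""
    (ax, ay), (bx, by) = segment
    return math.hypot(bx - ax, by - ay)


def orthogonal_distance(point: Point, segment: Segment) -> Optional[float]:
    """Perpendicular distance from point to segment.

    Returns None when the perpendicular foot falls outside the segment,
    i.e. no orthogonal line from the point hits this segment.
    """
    px, py = point
    (ax, ay), (bx, by) = segment
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return None  # degenerate segment with zero length
    # Position of the perpendicular foot along the segment (0 = start, 1 = end)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    if t < 0 or t > 1:
        return None
    foot_x, foot_y = ax + t * dx, ay + t * dy
    return math.hypot(px - foot_x, py - foot_y)


def within_orthogonal_distance(
    point: Point, segments: List[Segment], max_distance: float
) -> bool:
    """True if any segment is orthogonally reachable within max_distance."""
    for seg in segments:
        d = orthogonal_distance(point, seg)
        if d is not None and d <= max_distance:
            return True
    return False


# Hypothetical boundary segments; the second one is shorter than 500 m
segments = [((0, 0), (1000, 0)), ((1000, 0), (1000, 120))]
long_segments = [s for s in segments if segment_length(s) >= 500]

points = {"near": (500, 10), "far": (500, 40), "past_end": (1200, 5)}
flags = {
    name: within_orthogonal_distance(p, long_segments, 15)
    for name, p in points.items()
}
print(flags)  # {'near': True, 'far': False, 'past_end': False}
```

The `past_end` point is only 5 m off the line's extension, but its perpendicular foot falls beyond the segment's end, so it does not count as an orthogonal match in this sketch.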

3. Clean Messy Polygons

Got "slivers" or gaps in your map? Fix them automatically.

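from geoRDDprep import remove_sliver

# `messy_polygons` and `boundary_clip` are assumed here to be GeoDataFrames
# you have already loaded (for example with gpd.read_file)

# Merge slivers into their largest neighbors
clean_polygons = remove_sliver(messy_polygons, boundary_clip)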
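
To make the idea of merging slivers into neighbors concrete, here is a small pure-Python sketch that flags polygons below an area threshold and picks, for each, the largest-area neighbor sharing an edge. It is not how `remove_sliver` works internally. Reading "largest" as largest area, the `min_area` threshold, and the requirement that neighbors share identical vertices are all assumptions, and the actual geometric union is left out.

```python
from typing import Dict, FrozenSet, Optional, Sequence, Set, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]


def ring_area(ring: Ring) -> float:
    """Shoelace formula: absolute area of a simple closed ring."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def ring_edges(ring: Ring) -> Set[FrozenSet[Point]]:
    """Undirected edges of the ring, so shared borders compare equal."""
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}


def merge_targets(
    polygons: Dict[str, Ring], min_area: float
) -> Dict[str, Optional[str]]:
    """For each sliver (area < min_area), name the largest edge-sharing neighbor."""
    areas = {name: ring_area(ring) for name, ring in polygons.items()}
    edges = {name: ring_edges(ring) for name, ring in polygons.items()}
    targets: Dict[str, Optional[str]] = {}
    for name, area in areas.items():
        if area >= min_area:
            continue  # not a sliver
        neighbors = [
            other
            for other in polygons
            if other != name
            and areas[other] >= min_area  # do not merge into another sliver
            and edges[name] & edges[other]  # at least one shared edge
        ]
        targets[name] = max(neighbors, key=lambda o: areas[o]) if neighbors else None
    return targets


# Hypothetical toy map: a thin sliver "S" wedged between polygons "A" and "B"
polygons = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "S": [(10, 0), (10.2, 0), (10.2, 10), (10, 10)],
    "B": [(10.2, 0), (30, 0), (30, 10), (10.2, 10)],
}

targets = merge_targets(polygons, min_area=5.0)
print(targets)
```

Here `S` borders both `A` and `B`, and `B` has the larger area, so `B` is chosen as the merge target.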

🤝 Contributing

We love contributions!

  1. Fork the repo.
  2. Create a branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

📄 License

Distributed under the MIT License. See LICENSE for more information.


Built with ❤️ for the spatial data community.
