Preparing data for regression discontinuity design

🌍 geoRDDprep

License: MIT | Python 3.6+

Streamline your Geographical Regression Discontinuity Design (GeoRDD) workflow.

geoRDDprep is a Python toolkit that takes the pain out of spatial data preparation. Whether you are an economist, political scientist, or data analyst, it helps you assign points to boundaries, clean up messy polygons, and apply the orthogonal distance matching from Turner et al. (2014).

🚀 Why geoRDDprep?

  • ⚡️ Fast & Efficient: Optimized spatial joins and geometric operations using geopandas and shapely.
  • 📐 Turner Algorithm Ready: Out-of-the-box implementation of the orthogonal distance criteria from Turner et al. (2014).
  • 🧹 Data Cleaning: Automatically remove sliver polygons and tiny, noisy line segments that mess up your analysis.
  • 🛠️ Easy Integration: Works seamlessly with your existing pandas and geopandas workflows.

📦 Installation

Install directly from PyPI:

pip install geoRDDprep

🛠️ Usage Examples

1. Assign Addresses to Districts

Map each point to the administrative region that contains it. The spatial join is built to handle large point sets.

import geopandas as gpd
from geoRDDprep import points_in_polygon

# Load your data
points = gpd.read_file("addresses.geojson")
districts = gpd.read_file("school_districts.geojson")

# Spatially join each address to its district
result = points_in_polygon(points, districts, suffix_name="_district")

print(result.head())
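
To make the idea of point-to-polygon assignment concrete, here is a minimal pure-Python sketch using the classic ray-casting test. It is only an illustration of the concept and is not how geoRDDprep implements `points_in_polygon` (the package relies on geopandas and shapely). The names `contains` and `assign_points` and the toy coordinates are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order, first vertex not repeated


def contains(polygon: Polygon, point: Point) -> bool:
    """Ray-casting test: count the edges crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_points(
    points: Dict[str, Point], polygons: Dict[str, Polygon]
) -> Dict[str, Optional[str]]:
    """Map each point id to the first polygon containing it, or None."""
    assigned = {}
    for pid, pt in points.items():
        assigned[pid] = next(
            (name for name, poly in polygons.items() if contains(poly, pt)),
            None,
        )
    return assigned


# Hypothetical toy data: two adjacent square "districts"
toy_districts = {
    "west": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "east": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
toy_addresses = {"a": (0.25, 0.5), "b": (1.75, 0.5), "c": (3.0, 0.5)}

assignment = assign_points(toy_addresses, toy_districts)
print(assignment)
```

Points that fall outside every polygon map to `None` here; how the package itself reports unmatched points is not covered by this sketch.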

2. The Turner Algorithm (2014)

Assign points to boundaries only if they fall within a strict orthogonal distance, which is crucial for a valid RDD analysis.

from geoRDDprep import poly_to_line, drop_tiny_lines, turner

# 1. Convert polygons to boundary lines
lines = poly_to_line(districts)

# 2. Clean up noise (remove lines < 500m)
clean_lines = drop_tiny_lines(lines, method='length', meters=500)

# 3. Match points to boundaries (within 15m)
matched_data = turner(points, clean_lines, orth_distance=15)

# Check which points passed the test
print(matched_data['turner_pass'].value_counts())
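
For intuition, the sketch below shows one plausible reading of an orthogonal distance check: a point passes only if its perpendicular projection lands on the boundary segment and the perpendicular distance is within a threshold. This is an assumption made for illustration; it is not taken from the `turner` source code or from the exact rule in Turner et al. (2014). The function names and toy coordinates are hypothetical, and coordinates are assumed to be in meters in a projected CRS.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def orthogonal_distance(
    point: Point, seg_start: Point, seg_end: Point
) -> Optional[float]:
    """Perpendicular distance from point to the segment, or None when the
    foot of the perpendicular falls outside the segment."""
    px, py = point
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return None  # a degenerate segment has no direction
    # Position of the projection along the segment: 0 at start, 1 at end
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    if t < 0 or t > 1:
        return None
    foot_x, foot_y = ax + t * dx, ay + t * dy
    return math.hypot(px - foot_x, py - foot_y)


def passes(
    point: Point, seg_start: Point, seg_end: Point, max_distance: float
) -> bool:
    """True when the orthogonal distance exists and is within the limit."""
    d = orthogonal_distance(point, seg_start, seg_end)
    return d is not None and d <= max_distance


# Hypothetical boundary segment running 1 km along the x-axis
start, end = (0.0, 0.0), (1000.0, 0.0)

print(passes((500.0, 10.0), start, end, max_distance=15))   # close, projects onto segment
print(passes((500.0, 40.0), start, end, max_distance=15))   # too far away
print(passes((1200.0, 5.0), start, end, max_distance=15))   # projection misses the segment
```

The third point is near the line through the segment but beyond its end, so a plain nearest-distance test could accept it while this orthogonal test rejects it.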

3. Clean Messy Polygons

Got "slivers" or gaps in your map? Fix them automatically.

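from geoRDDprep import remove_sliver

# messy_polygons and boundary_clip are placeholders for inputs
# you have already loaded; they are not defined in this snippet

# Merge slivers into their largest neighbors
clean_polygons = remove_sliver(messy_polygons, boundary_clip)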
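
If you are unsure what counts as a sliver, one common heuristic is a compactness score such as Polsby-Popper, which is close to 0 for long, thin shapes. The sketch below is an illustration under that assumption only; the criterion `remove_sliver` actually uses is not documented here, and the function names and the 0.05 threshold are hypothetical.

```python
import math
from typing import List, Tuple

Ring = List[Tuple[float, float]]  # vertices in order, first vertex not repeated


def area(ring: Ring) -> float:
    """Unsigned polygon area via the shoelace formula."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def perimeter(ring: Ring) -> float:
    """Sum of edge lengths, including the closing edge."""
    n = len(ring)
    return sum(
        math.hypot(
            ring[(i + 1) % n][0] - ring[i][0],
            ring[(i + 1) % n][1] - ring[i][1],
        )
        for i in range(n)
    )


def compactness(ring: Ring) -> float:
    """Polsby-Popper score 4*pi*A / P**2: 1 for a circle, near 0 for a strip."""
    p = perimeter(ring)
    return 4 * math.pi * area(ring) / (p * p) if p > 0 else 0.0


def is_sliver(ring: Ring, min_compactness: float = 0.05) -> bool:
    """Flag shapes whose compactness falls below a hypothetical threshold."""
    return compactness(ring) < min_compactness


# Hypothetical shapes, coordinates in meters
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
strip = [(0, 0), (1000, 0), (1000, 1), (0, 1)]

print(is_sliver(square), is_sliver(strip))
```

A 100 m square scores about 0.79, while a 1000 m by 1 m strip scores well under 0.01, so only the strip is flagged.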

🤝 Contributing

We love contributions!

  1. Fork the repo.
  2. Create a branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

📄 License

Distributed under the MIT License. See LICENSE for more information.


Built with ❤️ for the spatial data community.
