Preparing data for regression discontinuity design

🌍 geoRDDprep

License: MIT · Python 3.6+

Streamline your Geographical Regression Discontinuity Design (GeoRDD) workflow.

geoRDDprep is a Python toolkit that takes the pain out of spatial data preparation. Whether you are an economist, political scientist, or data analyst, it helps you assign points to polygons, clean up sliver polygons and short boundary segments, and apply the orthogonal distance criterion from Turner et al. (2014).

🚀 Why geoRDDprep?

  • ⚡️ Fast & Efficient: Optimized spatial joins and geometric operations using geopandas and shapely.
  • 📐 Turner Algorithm Ready: Out-of-the-box implementation of the orthogonal distance criterion from Turner et al. (2014).
  • 🧹 Data Cleaning: Automatically remove sliver polygons and tiny, noisy line segments that mess up your analysis.
  • 🛠️ Easy Integration: Works seamlessly with your existing pandas and geopandas workflows.

📦 Installation

Install directly from PyPI:

pip install geoRDDprep

🛠️ Usage Examples

1. Assign Addresses to Districts

Map each point to the administrative region that contains it, even for large point datasets.

import geopandas as gpd
from geoRDDprep import points_in_polygon

# Load your data
points = gpd.read_file("addresses.geojson")
districts = gpd.read_file("school_districts.geojson")

# Assign each point to the district polygon that contains it
result = points_in_polygon(points, districts, suffix_name="_district")

print(result.head())
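
If you want to see what a point-in-polygon assignment does conceptually, here is a minimal pure-Python sketch using the classic ray-casting test. It is not the package's implementation (geoRDDprep relies on geopandas and shapely), and the names `point_in_ring` and `assign_points` are hypothetical helpers made up for this illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]  # polygon outline as (x, y) vertices, first vertex not repeated


def point_in_ring(pt: Point, ring: Ring) -> bool:
    """Ray casting: count how many edges a ray going right from pt crosses."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_points(points: List[Point], districts: Dict[str, Ring]) -> List[Optional[str]]:
    """For each point, return the name of the first district containing it, else None."""
    assigned = []
    for pt in points:
        match = next(
            (name for name, ring in districts.items() if point_in_ring(pt, ring)),
            None,
        )
        assigned.append(match)
    return assigned


# Toy data: two rectangular districts sharing the boundary y = 5
districts = {
    "north": [(0, 5), (10, 5), (10, 10), (0, 10)],
    "south": [(0, 0), (10, 0), (10, 5), (0, 5)],
}
points = [(2, 7), (8, 1), (20, 20)]

assigned = assign_points(points, districts)
print(assigned)
```

The third point lies outside both districts, so it gets no assignment. A real workflow would use a spatial index instead of this brute-force loop.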

2. The Turner Algorithm (2014)

Assign points to boundaries only if they lie within a strict orthogonal distance, which is crucial for a valid RDD analysis.

from geoRDDprep import poly_to_line, drop_tiny_lines, turner

# 1. Convert polygons to boundary lines (reuses `points` and `districts` from example 1)
lines = poly_to_line(districts)

# 2. Clean up noise (remove lines < 500m)
clean_lines = drop_tiny_lines(lines, method='length', meters=500)

# 3. Match points to boundaries (within 15m)
matched_data = turner(points, clean_lines, orth_distance=15)

# Check which points passed the test
print(matched_data['turner_pass'].value_counts())
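
To make steps 2 and 3 concrete, here is a rough pure-Python sketch of the geometry involved. It assumes one plausible reading of "orthogonal distance": the perpendicular distance from a point to a boundary segment, counted only when the foot of the perpendicular lands on the segment itself. This is an illustration under that assumption, not the package's `turner` implementation or necessarily the exact published procedure. The helper names (`segment_length`, `orthogonal_distance`, `passes`) are hypothetical, and coordinates are assumed to be in meters in a projected CRS.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def segment_length(seg: Segment) -> float:
    """Euclidean length of a straight boundary segment."""
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def orthogonal_distance(pt: Point, seg: Segment) -> Optional[float]:
    """Perpendicular distance from pt to seg, or None if the foot of the
    perpendicular falls outside the segment."""
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return None  # degenerate segment, no direction to project onto
    # Position of the perpendicular foot along the segment (0 = start, 1 = end)
    t = ((pt[0] - x1) * dx + (pt[1] - y1) * dy) / length_sq
    if not 0.0 <= t <= 1.0:
        return None
    foot = (x1 + t * dx, y1 + t * dy)
    return math.hypot(pt[0] - foot[0], pt[1] - foot[1])


def passes(pt: Point, segments: List[Segment], max_dist: float, min_length: float) -> bool:
    """True if some segment of at least min_length lies within max_dist orthogonally."""
    for seg in segments:
        if segment_length(seg) < min_length:
            continue  # skip short, noisy segments
        d = orthogonal_distance(pt, seg)
        if d is not None and d <= max_dist:
            return True
    return False


# One long boundary segment (1000 m) and one short one (100 m)
segments = [((0, 0), (1000, 0)), ((1000, 0), (1000, 100))]
pts = [(500, 10), (500, 40), (1200, 5), (1010, 50)]

flags = [passes(p, segments, max_dist=15, min_length=500) for p in pts]
print(flags)
```

Only the first point passes. The second is too far away, the third has no perpendicular foot on the long segment, and the fourth is only close to the short segment, which the length filter discards.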

3. Clean Messy Polygons

Got "slivers" or gaps in your map? Fix them automatically.

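from geoRDDprep import remove_sliver

# messy_polygons and boundary_clip are assumed to be GeoDataFrames you have
# already loaded (for example with gpd.read_file); they are not defined above.
# Merge slivers into their largest neighbors
clean_polygons = remove_sliver(messy_polygons, boundary_clip)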
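
Not sure what counts as a sliver? One common heuristic is a compactness score such as Polsby-Popper, 4πA / P², which is 1 for a circle and close to 0 for long, thin shapes. The sketch below uses it in pure Python. The criterion that `remove_sliver` actually applies is not documented here, so treat the helper names and the 0.05 threshold as assumptions for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]  # polygon outline as (x, y) vertices, first vertex not repeated


def ring_area(ring: Ring) -> float:
    """Polygon area via the shoelace formula."""
    n = len(ring)
    total = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ring_perimeter(ring: Ring) -> float:
    """Sum of edge lengths around the closed ring."""
    n = len(ring)
    return sum(
        math.hypot(ring[(i + 1) % n][0] - ring[i][0], ring[(i + 1) % n][1] - ring[i][1])
        for i in range(n)
    )


def compactness(ring: Ring) -> float:
    """Polsby-Popper score 4*pi*A / P**2 (1 for a circle, near 0 for a thin sliver)."""
    p = ring_perimeter(ring)
    return 4 * math.pi * ring_area(ring) / (p * p) if p > 0 else 0.0


def find_slivers(polygons: Dict[str, Ring], threshold: float = 0.05) -> List[str]:
    """Names of polygons whose compactness falls below the (assumed) threshold."""
    return [name for name, ring in polygons.items() if compactness(ring) < threshold]


polygons = {
    "square": [(0, 0), (100, 0), (100, 100), (0, 100)],
    "sliver": [(0, 0), (1000, 0), (1000, 1), (0, 1)],  # 1000 m long, 1 m wide
}

flagged = find_slivers(polygons)
print(flagged)
```

A square scores π/4 (about 0.785), while the 1000 m by 1 m strip scores roughly 0.003, so only the strip is flagged.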

🤝 Contributing

We love contributions!

  1. Fork the repo.
  2. Create a branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

📄 License

Distributed under the MIT License. See LICENSE for more information.


Built with ❤️ for the spatial data community.
