Preparing data for regression discontinuity design
🌍 geoRDDprep
Streamline your Geographical Regression Discontinuity Design (GeoRDD) workflow.
geoRDDprep is a powerful Python toolkit designed to take the pain out of spatial data preparation. Whether you are an economist, political scientist, or data analyst, this package helps you assign points to boundaries, clean up messy polygons, and implement rigorous spatial algorithms with ease.
🚀 Why geoRDDprep?
- ⚡️ Fast & Efficient: Optimized spatial joins and geometric operations using geopandas and shapely.
- 📐 Turner Algorithm Ready: Out-of-the-box implementation of the orthogonal distance criteria from Turner et al. (2014).
- 🧹 Data Cleaning: Automatically remove sliver polygons and tiny, noisy line segments that mess up your analysis.
- 🛠️ Easy Integration: Works seamlessly with your existing pandas and geopandas workflows.
📦 Installation
Install directly from PyPI:
pip install geoRDDprep
🛠️ Usage Examples
1. Assign Addresses to Districts
Map large point datasets to the administrative regions that contain them.
import geopandas as gpd
from geoRDDprep import points_in_polygon
# Load your data
points = gpd.read_file("addresses.geojson")
districts = gpd.read_file("school_districts.geojson")
# 🪄 Magic happens here
result = points_in_polygon(points, districts, suffix_name="_district")
print(result.head())
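For intuition about what a point-to-polygon assignment does, here is a dependency-free sketch of the classic ray-casting test. It is not the package's implementation (geoRDDprep relies on geopandas and shapely); the helper names `point_in_ring` and `assign_districts` and the toy coordinates are made up for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]


def point_in_ring(pt: Point, ring: Ring) -> bool:
    """Ray casting: cast a horizontal ray to the right of pt and count edge crossings."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means "inside"
    return inside


def assign_districts(pts: Sequence[Point], rings: Dict[str, Ring]) -> List[Optional[str]]:
    """Return the first district containing each point, or None if there is no match."""
    return [
        next((name for name, ring in rings.items() if point_in_ring(pt, ring)), None)
        for pt in pts
    ]


# Hypothetical toy data: two stacked rectangles and three points.
toy_districts = {
    "north": [(0, 1), (2, 1), (2, 2), (0, 2)],
    "south": [(0, 0), (2, 0), (2, 1), (0, 1)],
}
toy_points = [(0.5, 1.5), (1.0, 0.25), (5.0, 5.0)]

assigned = assign_districts(toy_points, toy_districts)
print(assigned)  # ['north', 'south', None]
```

A spatial join over an indexed GeoDataFrame produces the same kind of assignment without looping over every polygon for every point, which is why a library-backed join is preferable for real data.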
2. The Turner Algorithm (2014)
Assign points to boundaries only if they are within a strict orthogonal distance, which is crucial for a valid RDD analysis.
from geoRDDprep import poly_to_line, drop_tiny_lines, turner
# 1. Convert polygons to boundary lines
lines = poly_to_line(districts)
# 2. Clean up noise (remove lines < 500m)
clean_lines = drop_tiny_lines(lines, method='length', meters=500)
# 3. Match points to boundaries (within 15m)
matched_data = turner(points, clean_lines, orth_distance=15)
# Check which points passed the test
print(matched_data['turner_pass'].value_counts())
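As a rough illustration of what an orthogonal distance test can look like, the sketch below projects a point onto a single boundary segment and accepts it only if the perpendicular foot lands on the segment and the distance is within a threshold. This is one plausible reading of the criterion, not a description of how `turner` works internally; the function names and coordinates are hypothetical, and coordinates are assumed to be in a projected CRS measured in meters.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def orthogonal_distance(pt: Point, seg_start: Point, seg_end: Point) -> Optional[float]:
    """Perpendicular distance from pt to the segment.

    Returns None when the foot of the perpendicular falls outside the segment,
    i.e. when no orthogonal line from pt actually hits the boundary piece.
    """
    px, py = pt
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return None  # degenerate segment has no direction to project onto
    # Position of the perpendicular foot along the segment (0 = start, 1 = end).
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    if not 0.0 <= t <= 1.0:
        return None
    foot_x, foot_y = ax + t * dx, ay + t * dy
    return math.hypot(px - foot_x, py - foot_y)


def passes_orthogonal_test(pt: Point, segment: Segment, max_distance: float) -> bool:
    """True if pt projects onto the segment and lies within max_distance of it."""
    d = orthogonal_distance(pt, segment[0], segment[1])
    return d is not None and d <= max_distance


# Hypothetical boundary segment, 100 m long, lying on the x-axis.
boundary = ((0.0, 0.0), (100.0, 0.0))

near = passes_orthogonal_test((50.0, 10.0), boundary, max_distance=15.0)
far = passes_orthogonal_test((50.0, 40.0), boundary, max_distance=15.0)
beyond_end = passes_orthogonal_test((120.0, 5.0), boundary, max_distance=15.0)
print(near, far, beyond_end)  # True False False
```

The third point is close to the boundary's endpoint but its perpendicular misses the segment, so a plain nearest-distance filter would keep it while an orthogonal test would not.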
3. Clean Messy Polygons
Got "slivers" or gaps in your map? Fix them automatically.
from geoRDDprep import remove_sliver
# messy_polygons and boundary_clip are assumed to be GeoDataFrames you have already loaded
# Merge slivers into their largest neighbors
clean_polygons = remove_sliver(messy_polygons, boundary_clip)
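The README does not say how slivers are detected, so the following is only a sketch of one common heuristic: flag polygons whose Polsby-Popper compactness (4πA / P²) falls below a threshold. The function names and the 0.05 cutoff are assumptions for illustration and may differ from what `remove_sliver` does; merging flagged slivers into neighbors is not shown.

```python
import math
from typing import Sequence, Tuple

Ring = Sequence[Tuple[float, float]]


def ring_area_perimeter(ring: Ring) -> Tuple[float, float]:
    """Area (shoelace formula) and perimeter of a simple polygon ring."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def is_sliver(ring: Ring, min_compactness: float = 0.05) -> bool:
    """Flag long, thin polygons: compactness is 1 for a circle and near 0 for a sliver."""
    area, perimeter = ring_area_perimeter(ring)
    if perimeter == 0:
        return True  # a degenerate ring carries no usable area
    compactness = 4 * math.pi * area / perimeter ** 2
    return compactness < min_compactness


# Hypothetical shapes: a 10 x 10 square and a 1000 x 1 strip.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
strip = [(0, 0), (1000, 0), (1000, 1), (0, 1)]

print(is_sliver(square), is_sliver(strip))  # False True
```

A shape-based measure like this catches thin strips even when their area is large, which a pure area cutoff would miss.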
🤝 Contributing
We love contributions!
- Fork the repo.
- Create a branch (git checkout -b feature/AmazingFeature).
- Commit your changes (git commit -m 'Add AmazingFeature').
- Push to the branch (git push origin feature/AmazingFeature).
- Open a Pull Request.
📄 License
Distributed under the MIT License. See LICENSE for more information.
Built with ❤️ for the spatial data community.
File details
Details for the file georddprep-0.1.2.tar.gz.
File metadata
- Download URL: georddprep-0.1.2.tar.gz
- Upload date:
- Size: 8.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 400f296994d52870ab711cdfa18bafa61848baf046975e3b3471801a60d0f688 |
| MD5 | 89bafbecaee115a69451001e54633ca1 |
| BLAKE2b-256 | 6944059deaba3330dcc2fa087d9dd626e87a411c5ef7c32f60f25ad6866915a1 |
File details
Details for the file georddprep-0.1.2-py3-none-any.whl.
File metadata
- Download URL: georddprep-0.1.2-py3-none-any.whl
- Upload date:
- Size: 8.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c2e28ea49d3c87a4bb8e9622cacaddc7ae4e8089fafe58d49c3bd75199cba65e |
| MD5 | 0ed0994f3c25771284e585980e2d6826 |
| BLAKE2b-256 | f60d44416cd4f5d71062c8e2418f053c59e229e3432ed3b657586df81d7adbd6 |