An easy-to-use Python package that annotate location data with semantic labels from OpenStreetMap
Project description
location_annotation_with_openstreetmap
Easy-to-use python package for annotating location data with OpenStreetMap Point-of-interest tags
The SemanticAnnotator class aims to annotate location data using geofabrik database created by geofabrik_database.py.
There are three annotation methods:
1. annotate_single_point(lat, lon): annotate single point with semantic labels from OpenStreetMap database.
- pro: return distances to all POI types
- con: time-consuming (~3 hours/point). Method 3 is recommended for batch of points.
2. annotate_single_shape(lat_list, lon_list): annotate single shape (e.g., bounding box, polygon) with semantic labels from OpenStreetMap database.
- pro: most accurate method (~30 min/shape)
- con: need a set of points define the query shape
3. annotate_batch_points(dataframe, latitude_colname, longitude_colname): annotate a batch of points (usually centroids of places) with semantic labels from OpenStreetMap database.
- pro: fastest method. Fit for annotating many centroids of places simultaneously.
- con: just return the label of the nearest POI and the distance.
This package integrates the geodf and dist functions from the GPS2space package (https://gps2space.readthedocs.io/en/latest/).
Dependencies
- geopandas
- fiona
- gps2space
- tqdm
- Python >= 3.7
Intall package
pip install osm_annotation
Import package
from osm_annotation import geofabrik_database, semantic_annotation
Build a local geofabrik POI database
# database_folder_path is the path to a local folder where you want to build the database. The disk should have at least 150 GB.
geofabrik_database.build(database_folder_path)
Annotate location data
Initialization
semantic_annotator = SemanticAnnotator(database_folder_path)
Example of Method 1
# coordinates of Fenway Park in Boston
centroid_latitude = 42.34653831212525
centroid_longitude = -71.09724395926423
semantic_annotator.annotate_single_point(centroid_latitude, centroid_longitude)
Returned result is a json file with
- matched_labels: semantic label matched with the query point
- min_distance: the distance from the query point to the matched POI, in meters
- distances_to_pois: distance to other types of POIs
{'matched_labels': 'recreational;outdoor;pitch (polygon)', 'min_distance': 6.169761255410299, 'distances_to_pois': {'busines;busines;company (polygon)': 326501.593160387, 'busines;busines;convention_center (polygon)': 2682942.101031607, 'busines;busines;factory (polygon)': 3612.955363467255, 'busines;busines;industrial (polygon)': 486.88772114370636, 'busines;busines;office (polygon)': 932.2980449124797, 'commercial;food;bakery (point)': 738.1374807550822, ... }
Example of Method 2
# bounding box (NW corner,NE corner, SE corner, SW corner) of Museum of Fine Arts in Boston
lat_list = [42.33969558839377, 42.34039653732734, 42.339235761638996, 42.33847311473655]
lon_list = [-71.09563225696323, -71.09348529667446, -71.09270768730832, -71.0948470612041]
semantic_annotator.annotate_single_shape(lat_list, lon_list)
Returned result is a json file with
- matched_labels: semantic labels matched with the query shape
- point_labels: semantic labels of point POI matched with the query shape
- poly_labels: semantic labels of polygon POI matched with the query shape
- matched_geometries: geometries of POIs matched with the query shape
{'matched_labels': ['commercial;food;cafe (point)', 'commercial;shopping;shop (point)', 'commercial;leisure;museum (polygon)', 'service;transportation;parking (polygon)', 'recreational;outdoor;nature (polygon)'], 'point_labels': ['commercial;food;cafe (point)', 'commercial;shopping;shop (point)'], 'poly_labels': ['commercial;leisure;museum (polygon)', 'recreational;outdoor;nature (polygon)', 'service;transportation;parking (polygon)'], 'matched_geometries': [<shapely.geometry.point.Point at 0x2b8b25338410>, <shapely.geometry.polygon.Polygon at 0x2b8b25353b10>, <shapely.geometry.point.Point at 0x2b8b253415d0>, <shapely.geometry.polygon.Polygon at 0x2b8b250b0910>, <shapely.geometry.polygon.Polygon at 0x2b8b25331690>]}
Example of Method 3
# library, cafe, gym, and train station around the Northeastern University campus
locations = [[42.33833,-71.08795], # library
[42.33909,-71.08758], # cafe
[42.34033,-71.09038], # gym
[42.33661, -71.08944]] # train station
location_dataframe = pd.DataFrame(data = locations, columns = ['latitude', 'longitude'])
semantic_annotator.annotate_batch_points(dataframe = location_dataframe, latitude_colname = 'latitude', longitude_colname = 'longitude')
Returned result is a dataframe with
- matched_labels: semantic labels matched with the query points
- min_distance: the distance from the query point to the matched POI, in meters
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for osm_annotation-0.1.3-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0691cf43f20d425383c6bdd785224f78311dbc23ac60a94e92684781fb911f55 |
|
MD5 | 5b7f34f2e596df42406ae397d74e8b96 |
|
BLAKE2b-256 | afdc2e20c303c6356f78cd327f103aa1ba08b568a23d8f9bc1b08e073543c877 |