An easy-to-use Python package that annotates location data with semantic labels from OpenStreetMap
location_annotation_with_openstreetmap
This is an easy-to-use Python package for annotating location data with OpenStreetMap point-of-interest (POI) tags. It gives researchers an automated way to add a layer of contextual information to location data at large scale. For example, the input can be pairs of coordinates (lat, lon), and the output is the type of place at each input location, such as "gym", "restaurant", "university", or "office". Annotation with this package incurs no cost because it downloads and uses free POI data from Geofabrik, which reflect daily changes in OpenStreetMap. For questions about this package, please leave an issue or contact the author.
Two general steps
1. Download and create a Geofabrik POI database on the local system
2. Annotate location data using the POI database
Contact
- Author: Jixin Li @ mHealth research group
- For questions about this package, please leave an issue.
Citation
- This repository is part of a journal paper submission that is currently under review. Full citation information will come soon. In the meantime, please cite as below:
Jixin Li. 2023. location_annotation_with_openstreetmap (osm_annotation). https://bitbucket.org/mhealthresearchgroup/osm_annotation/src/main/
- This package also integrates the geodf and dist functions from the GPS2space package (https://gps2space.readthedocs.io/en/latest/).
Dependencies
- geopandas
- fiona
- gps2space
- tqdm
- Python >= 3.7
Demo
A demo of how to use this package can be found in the Jupyter notebook.
Install package
pip install osm_annotation
Import package
from osm_annotation import geofabrik_database, semantic_annotation
Build a local Geofabrik POI database
# database_folder_path is the path to a local folder where you want to build the database. The disk should have at least 150 GB of free space.
geofabrik_database.build(database_folder_path)
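Because the build step needs a large amount of disk space, it may be worth checking the target drive before starting. The sketch below uses only the Python standard library. The helper name `has_enough_space` is hypothetical and not part of osm_annotation, and the 150 GB default is simply taken from the note above, assuming decimal gigabytes.

```python
import shutil

GB = 1000 ** 3  # decimal gigabytes; an assumption about the unit meant above


def has_enough_space(folder_path, required_gb=150):
    """Return True if the drive holding folder_path has at least required_gb free."""
    free_bytes = shutil.disk_usage(folder_path).free
    return free_bytes >= required_gb * GB


# "." stands in for database_folder_path; the folder must already exist.
print(has_enough_space("."))
```

If the check returns False, pointing database_folder_path at a larger drive before calling the build step avoids a failed download partway through.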
Annotate location data
There are three annotation method options:
Method | Description | Time | Pro | Con |
---|---|---|---|---|
annotate_single_point(lat, lon) | annotates a single point | ~3 hours/point | returns distances to all POI types | time-consuming; Method 3 is recommended for batches of points |
annotate_single_shape(lat_list, lon_list) | annotates a single shape (e.g., bounding box, polygon) | ~30 min/shape | most accurate method | needs a set of points that define the query shape |
annotate_batch_points(dataframe, latitude_colname, longitude_colname) | annotates a batch of points (usually centroids of places) | ~3 hours/batch of points | fastest method; suited to annotating many centroids of places simultaneously | returns only the label of the nearest POI and its distance |
Initialization
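# Assumption: SemanticAnnotator is exposed by the semantic_annotation module imported above.
semantic_annotator = semantic_annotation.SemanticAnnotator(database_folder_path)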
Example of Method 1
# coordinates of Fenway Park in Boston
centroid_latitude = 42.34653831212525
centroid_longitude = -71.09724395926423
semantic_annotator.annotate_single_point(centroid_latitude, centroid_longitude)
The returned result is a JSON-like structure with:
- matched_labels: semantic label matched with the query point
- min_distance: the distance from the query point to the matched POI, in meters
- distances_to_pois: distances to other types of POIs
{
'matched_labels': 'recreational;outdoor;pitch (polygon)',
'min_distance': 6.169761255410299,
'distances_to_pois': {
'busines;busines;company (polygon)': 326501.593160387,
'busines;busines;convention_center (polygon)': 2682942.101031607,
'busines;busines;factory (polygon)': 3612.955363467255,
'busines;busines;industrial (polygon)': 486.88772114370636,
'busines;busines;office (polygon)': 932.2980449124797,
'commercial;food;bakery (point)': 738.1374807550822,
...
}
}
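The distances_to_pois mapping can be long, so a little post-processing helps when reading it. The sketch below ranks POI types by distance and splits each label into its parts. It assumes the labels follow the pattern seen in the example output above (three semicolon-separated fields followed by a geometry kind in parentheses). The field names `category`, `subcategory`, `poi_type`, and `geometry`, and both helper functions, are hypothetical and not part of the package.

```python
import re

# Pattern for labels such as 'commercial;food;bakery (point)'.
LABEL_PATTERN = re.compile(
    r"^(?P<category>[^;]+);(?P<subcategory>[^;]+);(?P<poi_type>[^;]+?) \((?P<geometry>\w+)\)$"
)


def parse_label(label):
    """Split a label into four named parts (the names are illustrative)."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unexpected label format: {label!r}")
    return match.groupdict()


def nearest_poi_types(distances_to_pois, k=3):
    """Return the k (label, distance) pairs with the smallest distances."""
    return sorted(distances_to_pois.items(), key=lambda item: item[1])[:k]


# Subset of the example output shown above.
distances_to_pois = {
    'busines;busines;company (polygon)': 326501.593160387,
    'busines;busines;factory (polygon)': 3612.955363467255,
    'busines;busines;industrial (polygon)': 486.88772114370636,
    'busines;busines;office (polygon)': 932.2980449124797,
    'commercial;food;bakery (point)': 738.1374807550822,
}

for label, distance in nearest_poi_types(distances_to_pois, k=3):
    parts = parse_label(label)
    print(f"{parts['poi_type']:<12} {parts['geometry']:<8} {distance:8.1f} m")
```

Raising an error on an unexpected label, rather than silently skipping it, makes it obvious if the real label format differs from the assumed one.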
Example of Method 2
# bounding box (NW corner, NE corner, SE corner, SW corner) of the Museum of Fine Arts in Boston
lat_list = [42.33969558839377, 42.34039653732734, 42.339235761638996, 42.33847311473655]
lon_list = [-71.09563225696323, -71.09348529667446, -71.09270768730832, -71.0948470612041]
semantic_annotator.annotate_single_shape(lat_list, lon_list)
The returned result is a JSON-like structure with:
- matched_labels: semantic labels matched with the query shape
- point_labels: semantic labels of point POI matched with the query shape
- poly_labels: semantic labels of polygon POI matched with the query shape
- matched_geometries: geometries of POIs matched with the query shape
{
'matched_labels':
['commercial;food;cafe (point)',
'commercial;shopping;shop (point)',
'commercial;leisure;museum (polygon)',
'service;transportation;parking (polygon)',
'recreational;outdoor;nature (polygon)'],
'point_labels':
['commercial;food;cafe (point)',
'commercial;shopping;shop (point)'],
'poly_labels':
['commercial;leisure;museum (polygon)',
'recreational;outdoor;nature (polygon)',
'service;transportation;parking (polygon)'],
'matched_geometries':
[<shapely.geometry.point.Point at 0x2b8b25338410>,
<shapely.geometry.polygon.Polygon at 0x2b8b25353b10>,
<shapely.geometry.point.Point at 0x2b8b253415d0>,
<shapely.geometry.polygon.Polygon at 0x2b8b250b0910>,
<shapely.geometry.polygon.Polygon at 0x2b8b25331690>]
}
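Method 2 takes the corners of the query shape as two parallel lists. When only the extent of a place is known, an axis-aligned bounding box can be turned into those lists as sketched below. The helper `bounding_box_corners` is hypothetical and not part of the package. The NW, NE, SE, SW ordering mirrors the comment in the example above, and whether annotate_single_shape requires that order is an assumption. The sketch does not handle boxes that cross the antimeridian.

```python
def bounding_box_corners(min_lat, max_lat, min_lon, max_lon):
    """Return (lat_list, lon_list) for the box corners in NW, NE, SE, SW order."""
    if min_lat > max_lat or min_lon > max_lon:
        raise ValueError("minimum values must not exceed maximum values")
    lat_list = [max_lat, max_lat, min_lat, min_lat]
    lon_list = [min_lon, max_lon, max_lon, min_lon]
    return lat_list, lon_list


# Axis-aligned box enclosing the four Museum of Fine Arts corners used above.
lat_list, lon_list = bounding_box_corners(
    min_lat=42.33847311473655,
    max_lat=42.34039653732734,
    min_lon=-71.09563225696323,
    max_lon=-71.09270768730832,
)
print(lat_list)
print(lon_list)
```

An axis-aligned box is larger than the rotated quadrilateral in the example, so it may match more POIs; a hand-traced outline stays closer to the actual footprint.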
Example of Method 3
import pandas as pd
# library, cafe, gym, and train station around the Northeastern University campus
locations = [[42.33833,-71.08795], # library
[42.33909,-71.08758], # cafe
[42.34033,-71.09038], # gym
[42.33661, -71.08944]] # train station
location_dataframe = pd.DataFrame(data = locations, columns = ['latitude', 'longitude'])
semantic_annotator.annotate_batch_points(dataframe = location_dataframe, latitude_colname = 'latitude', longitude_colname = 'longitude')
The returned result is a DataFrame with:
- matched_labels: semantic labels matched with the query points
- min_distance: the distance from the query point to the matched POI, in meters
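Since min_distance is reported in meters, a rough independent check of scale can be useful when reviewing results. The sketch below computes the great-circle (haversine) distance between two of the query points from the Method 3 example. It is only an illustration: how osm_annotation computes distances is not described here, and `haversine_m` is a hypothetical helper, not a package function.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Library and cafe coordinates from the Method 3 example.
library = (42.33833, -71.08795)
cafe = (42.33909, -71.08758)
print(round(haversine_m(*library, *cafe)), "m")
```

Values from a projected coordinate system can differ slightly from great-circle distances, so small discrepancies against min_distance are expected.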