An easy-to-use Python package that annotates location data with semantic labels from OpenStreetMap

location_annotation_with_openstreetmap

This is an easy-to-use Python package for annotating location data with OpenStreetMap point-of-interest (POI) tags. It gives researchers a way to add a layer of contextual information to location data automatically and at large scale. For example, the input can be pairs of coordinates (lat, lon), and the output is the type of place at each input location, such as "gym", "restaurant", "university", or "office". Annotation with this package incurs no cost because it downloads and uses free POI data from Geofabrik, which reflects daily changes in OpenStreetMap. For questions about this package, please open an issue or contact the author.

Two general steps

1. Download the Geofabrik data and create a POI database on the local system
2. Annotate location data using the POI database
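
To make the second step concrete, the sketch below labels a coordinate with the nearest entry of a tiny hand-made POI list, using the haversine distance. It is only an illustration of the idea, not the package's implementation: `TOY_POIS`, `haversine_m`, and `nearest_poi` are hypothetical names, and the three coordinates are the ones used in the Method 3 example of this README.

```python
import math

# Hypothetical, hand-made POI list for illustration only. The package itself
# builds its database from Geofabrik extracts instead.
TOY_POIS = [
    ("library", 42.33833, -71.08795),
    ("cafe", 42.33909, -71.08758),
    ("gym", 42.34033, -71.09038),
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    radius = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def nearest_poi(lat, lon, pois=TOY_POIS):
    """Return (label, distance_in_meters) of the closest POI in `pois`."""
    label, plat, plon = min(
        pois, key=lambda p: haversine_m(lat, lon, p[1], p[2])
    )
    return label, haversine_m(lat, lon, plat, plon)
```

A linear scan like this is fine for a handful of POIs; a real POI database would presumably need a spatial index to stay fast.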

Contact

Citation

  • This repository is part of a journal paper submission that is currently under review. Full citation information will follow soon. In the meantime, please cite as below:

Jixin Li. 2023. location_annotation_with_openstreetmap (osm_annotation). https://bitbucket.org/mhealthresearchgroup/osm_annotation/src/main/

Dependencies

  • geopandas
  • fiona
  • gps2space
  • tqdm
  • Python >= 3.7

Demo

A demo of how to use this package can be found in the Jupyter notebook.

Install package

pip install osm_annotation

Import package

from osm_annotation import geofabrik_database, semantic_annotation

Build a local geofabrik POI database

# database_folder_path is the path to a local folder where the database will be built. The disk should have at least 150 GB of free space.
geofabrik_database.build(database_folder_path)
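
Since the build needs a large amount of disk space, it may be worth checking the free space before starting. The helper below is a minimal sketch using only the standard library; `has_enough_space` is a hypothetical name and not part of osm_annotation, and it assumes that "150 GB" means 150 × 10^9 bytes.

```python
import shutil

REQUIRED_GB = 150  # figure quoted in this README


def has_enough_space(folder, required_gb=REQUIRED_GB):
    """Return True if the disk holding `folder` has `required_gb` GB free.

    Assumes decimal gigabytes (1 GB = 10**9 bytes).
    """
    free_bytes = shutil.disk_usage(folder).free
    return free_bytes >= required_gb * 10 ** 9


# Possible use before building (not executed here):
# if has_enough_space(database_folder_path):
#     geofabrik_database.build(database_folder_path)
```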

Annotate location data

There are three annotation methods:

| No. | Method | Description | Time | Pro | Con |
|-----|--------|-------------|------|-----|-----|
| 1 | annotate_single_point(lat, lon) | Annotates a single point | ~3 hours/point | Returns distances to all POI types | Time-consuming; Method 3 is recommended for a batch of points |
| 2 | annotate_single_shape(lat_list, lon_list) | Annotates a single shape (e.g., bounding box, polygon) | ~30 min/shape | Most accurate method | Needs a set of points that define the query shape |
| 3 | annotate_batch_points(dataframe, latitude_colname, longitude_colname) | Annotates a batch of points (usually centroids of places) | ~3 hours/batch of points | Fastest method; suited to annotating many centroids of places simultaneously | Returns only the label of the nearest POI and the distance |
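
Method 2 needs a set of points that define the query shape. When only a handful of GPS fixes for a place is available, one option is to query their axis-aligned bounding box. The sketch below is an assumption-laden helper, not part of the package: `bounding_box` is a hypothetical name, the NW, NE, SE, SW corner order follows the Method 2 example in this README, and boxes crossing the antimeridian are not handled.

```python
def bounding_box(points):
    """Return (lat_list, lon_list) for the box enclosing `points`.

    `points` is an iterable of (lat, lon) pairs. Corners are ordered
    NW, NE, SE, SW.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one point is required")
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    north, south = max(lats), min(lats)
    east, west = max(lons), min(lons)
    lat_list = [north, north, south, south]
    lon_list = [west, east, east, west]
    return lat_list, lon_list
```

The two returned lists have the same shape as the `lat_list` and `lon_list` arguments of `annotate_single_shape`.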

Initialization

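# SemanticAnnotator is not imported by name above; it is presumably exposed by the semantic_annotation module.
semantic_annotator = SemanticAnnotator(database_folder_path)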

Example of Method 1

# coordinates of Fenway Park in Boston
centroid_latitude = 42.34653831212525
centroid_longitude = -71.09724395926423
semantic_annotator.annotate_single_point(centroid_latitude, centroid_longitude)

The returned result is a JSON-style dictionary with:

  • matched_labels: semantic label matched with the query point
  • min_distance: the distance from the query point to the matched POI, in meters
  • distances_to_pois: distances to the other types of POIs, in meters

   {
    'matched_labels': 'recreational;outdoor;pitch (polygon)',
    'min_distance': 6.169761255410299,
    'distances_to_pois': {
        'busines;busines;company (polygon)': 326501.593160387,
        'busines;busines;convention_center (polygon)': 2682942.101031607,
        'busines;busines;factory (polygon)': 3612.955363467255,
        'busines;busines;industrial (polygon)': 486.88772114370636,
        'busines;busines;office (polygon)': 932.2980449124797,
        'commercial;food;bakery (point)': 738.1374807550822,
        ...
    }
   }
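
A result like the one above can be post-processed with plain Python. The sketch below ranks the POI types by distance and splits a label into its parts. Both helpers are hypothetical, and the "category;subcategory;type (geometry)" reading of the label is inferred from the sample output rather than documented. `sample_result` copies only the excerpt shown above, labels verbatim.

```python
# Excerpt of the Method 1 result shown above (truncated in the README).
sample_result = {
    'matched_labels': 'recreational;outdoor;pitch (polygon)',
    'min_distance': 6.169761255410299,
    'distances_to_pois': {
        'busines;busines;company (polygon)': 326501.593160387,
        'busines;busines;convention_center (polygon)': 2682942.101031607,
        'busines;busines;factory (polygon)': 3612.955363467255,
        'busines;busines;industrial (polygon)': 486.88772114370636,
        'busines;busines;office (polygon)': 932.2980449124797,
        'commercial;food;bakery (point)': 738.1374807550822,
    },
}


def closest_poi_types(result, k=3):
    """Return the k (label, distance) pairs with the smallest distances."""
    ranked = sorted(result['distances_to_pois'].items(),
                    key=lambda item: item[1])
    return ranked[:k]


def split_label(label):
    """Split 'a;b;c (geometry)' into (a, b, c, geometry).

    Assumes exactly three ';'-separated parts followed by ' (geometry)'.
    """
    path, _, geometry = label.rpartition(' (')
    category, subcategory, poi_type = path.split(';')
    return category, subcategory, poi_type, geometry.rstrip(')')
```

For instance, `closest_poi_types(sample_result, k=2)` picks the industrial and bakery entries from the excerpt.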

Example of Method 2

# bounding box (NW corner, NE corner, SE corner, SW corner) of the Museum of Fine Arts in Boston
lat_list = [42.33969558839377, 42.34039653732734, 42.339235761638996, 42.33847311473655]
lon_list = [-71.09563225696323, -71.09348529667446, -71.09270768730832, -71.0948470612041]
semantic_annotator.annotate_single_shape(lat_list, lon_list)

The returned result is a JSON-style dictionary with:

  • matched_labels: semantic labels matched with the query shape
  • point_labels: semantic labels of point POI matched with the query shape
  • poly_labels: semantic labels of polygon POI matched with the query shape
  • matched_geometries: geometries of POIs matched with the query shape
   {
    'matched_labels': 
       ['commercial;food;cafe (point)',
       'commercial;shopping;shop (point)',
       'commercial;leisure;museum (polygon)',
       'service;transportation;parking (polygon)',
       'recreational;outdoor;nature (polygon)'],
    'point_labels': 
       ['commercial;food;cafe (point)',
       'commercial;shopping;shop (point)'],
    'poly_labels': 
       ['commercial;leisure;museum (polygon)',
       'recreational;outdoor;nature (polygon)',
       'service;transportation;parking (polygon)'],
    'matched_geometries': 
        [<shapely.geometry.point.Point at 0x2b8b25338410>,
        <shapely.geometry.polygon.Polygon at 0x2b8b25353b10>,
        <shapely.geometry.point.Point at 0x2b8b253415d0>,
        <shapely.geometry.polygon.Polygon at 0x2b8b250b0910>,
        <shapely.geometry.polygon.Polygon at 0x2b8b25331690>]
   }
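
Because a shape can match several POIs, it can be handy to group the matched labels by their leading category. The helper below is a small sketch with a hypothetical name (`group_by_category`); it assumes the text before the first ';' is the top-level category, as the sample output suggests.

```python
from collections import defaultdict

# 'matched_labels' copied from the Method 2 result shown above.
matched_labels = [
    'commercial;food;cafe (point)',
    'commercial;shopping;shop (point)',
    'commercial;leisure;museum (polygon)',
    'service;transportation;parking (polygon)',
    'recreational;outdoor;nature (polygon)',
]


def group_by_category(labels):
    """Map each top-level category to the labels that start with it."""
    grouped = defaultdict(list)
    for label in labels:
        category = label.split(';', 1)[0]
        grouped[category].append(label)
    return dict(grouped)


groups = group_by_category(matched_labels)
```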

Example of Method 3

# library, cafe, gym, and train station around the Northeastern University campus
import pandas as pd

locations = [[42.33833, -71.08795],  # library
             [42.33909, -71.08758],  # cafe
             [42.34033, -71.09038],  # gym
             [42.33661, -71.08944]]  # train station
location_dataframe = pd.DataFrame(data=locations, columns=['latitude', 'longitude'])
semantic_annotator.annotate_batch_points(dataframe=location_dataframe, latitude_colname='latitude', longitude_colname='longitude')

The returned result is a DataFrame with:

  • matched_labels: semantic labels matched with the query points
  • min_distance: the distance from the query point to the matched POI, in meters

[Figure: Batch_result]
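
Given that a batch run is quoted at about 3 hours, it may be worth checking the input coordinates before starting one. The sketch below flags rows whose latitude or longitude is out of range (NaN values are flagged too). `invalid_rows` is a hypothetical helper, not part of the package, and it works on the plain `locations` list rather than on the DataFrame.

```python
# The four query points from the Method 3 example above.
locations = [[42.33833, -71.08795],  # library
             [42.33909, -71.08758],  # cafe
             [42.34033, -71.09038],  # gym
             [42.33661, -71.08944]]  # train station


def invalid_rows(rows):
    """Return the indices of [lat, lon] rows that are out of range.

    Valid latitudes lie in [-90, 90] and valid longitudes in [-180, 180].
    NaN fails both comparisons and is therefore reported as invalid.
    """
    bad = []
    for index, (lat, lon) in enumerate(rows):
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            bad.append(index)
    return bad
```

Note that this range check cannot catch every swapped latitude/longitude pair, since many swapped values still fall inside both ranges.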
