Skip to main content

A Neuro-net ToPonym Recognition model

Project description

NeuroTPR

Overall description

NeuroTPR is a toponym recognition model designed for extracting locations from social media messages. It is based on a general Bidirectional Long Short-Term Memory network (BiLSTM) with a number of additional features, such as double layers of character embeddings, GloVe word embeddings, and contextualized word embeddings ELMo.

The goal of this model is to improve the accuracy of toponym recognition from social media messages that have various language irregularities, such as informal sentence structures, inconsistent upper and lower cases (e.g., “there is a HUGE fire near camino and springbrook rd”), name abbreviations (e.g., “bsu” for “Boise State University”), and misspellings. We tested NeuroTPR in the application context of disaster response based on a dataset of tweets from Hurricane Harvey in 2017.

More details can be found in our paper: Wang, J., Hu, Y., & Joseph, K. (2020): NeuroTPR: A Neuro-net ToPonym Recognition model for extracting locations from social media messages. Transactions in GIS, 24(3), 719-735.


Figure 1. The overall architecture of NeuroTPR

Use the pretrained NeuroTPR model

Using the pretrained NeuroTPR model for toponym recognition will need the following steps:

  1. Setup the virtual environment: Please create a new virtual environment using Anaconda and install the dependent packages using the following commands (please run them in the same order below):
   conda create -n NeuroTPR python=3.6
   conda activate NeuroTPR
   conda install keras -c conda-forge
   pip install git+https://www.github.com/keras-team/keras-contrib.git
   pip install neurotpr
  1. Download the pretrained model, and unzip it to a folder that you would prefer.

  2. Use NeuroTPR to recognize toponyms from text. A snippet of example code is below:

from neurotpr import geoparse

geoparse.load_model("the folder path of the pretrained model; note that the path should end with /")
result = geoparse.topo_recog("Buffalo is a city in New York State.")
print(result)

The input of the "topo_recog" function is a string, and the output is a list of JSON objects containing the recognized toponyms and their start and end indexes in the input string.

Combine NeuroTPR with a geolocation service

NeuroTPR is a toponym recognition model, which means that it will not assign geographic coordinates to the recognized toponyms. If you would like to add coordinates to the recognized toponyms, you could use the geocoding function from GeoPandas, Google Place API, or other services. Note that these services are not doing place name disambiguation for you, since they don't know the contexts under which these toponyms are mentioned. However, it would be fine to use one of these services if the toponyms in your text are not highly ambiguous.

Project dependencies:

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

neurotpr-0.0.8.tar.gz (13.0 kB view hashes)

Uploaded Source

Built Distribution

neurotpr-0.0.8-py3-none-any.whl (27.1 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page