stedsans is a package for performing geospatial analyses on text.

stedsans

This repository is for an exam project for the course Spatial Analytics at Aarhus University during the spring of 2021.

It is made by Jakob Grøhn Damgaard and Malte Højmark-Bertelsen.

The purpose is to build a PyPI package that can plot a map of any location mentioned in a Danish sentence. To do so, we employ the Natural Language Processing (NLP) technique Named Entity Recognition (NER).

NER is the task of finding words in a text that constitute specific entities and tagging them with labels. The most common entity types are person names (PER), locations (LOC) and organizations (ORG) (Ruder, 2019). The named entities are tagged using a scheme called BIO-tagging, in which each word is marked as the beginning (B) of an entity, inside (I) an entity, or other (O), meaning that the word is not part of any of the defined entities. The resulting tags are listed in Table 1.

Table 1:

NER-tag   Meaning
B-PER     Beginning of person name
I-PER     Inside a person name
B-LOC     Beginning of location
I-LOC     Inside a location
B-ORG     Beginning of organization
I-ORG     Inside an organization
O         Other
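
To make the tagging scheme concrete, the sketch below groups BIO-tagged tokens into entities and keeps only the locations. It is an illustration of the scheme, not the package's own code: the function name `extract_entities`, the example sentence and its tags are all hypothetical and hand-written, not output from an NER model.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, label) pairs.

    Hypothetical helper for illustration; not part of the stedsans API.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always closes any open entity and starts a new one.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag extends the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue an open entity
            # (assumption: such stray tags are dropped).
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None

    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities


# Hand-tagged example sentence (assumed tags, for illustration only).
tokens = ["Jakob", "bor", "i", "Aarhus", "C", "og", "arbejder", "for", "Aarhus", "Universitet"]
tags = ["B-PER", "O", "O", "B-LOC", "I-LOC", "O", "O", "O", "B-ORG", "I-ORG"]

entities = extract_entities(tokens, tags)
locations = [text for text, label in entities if label == "LOC"]
print(locations)
```

Only the entities labelled LOC would be relevant for plotting on a map; the PER and ORG entities are recognised but filtered out in the last step.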

Instructions

To use the code locally, start by cloning the repository and installing Anaconda for your OS. Then create a conda environment and install the requirements.

# From the directory of this repository
conda create -n [env_name] python=3.9  # Create conda environment
conda activate [env_name]  # Activate conda environment
pip install -r requirements.txt  # Install required packages

Afterwards, install geopandas using the pre-built binaries from Anaconda:

conda install geopandas
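
Once both installation steps have run, a quick check from Python can confirm that geopandas is importable in the active environment. This is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["geopandas"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("geopandas is available")
```

If geopandas is reported as not installed, the most likely cause is that the conda environment was not activated before running the install commands.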

Usage

For a usage example, see the Google Colab demo (Open In Colab).

References

Ruder, S. (2019). Neural transfer learning for natural language processing (Doctoral dissertation, NUI Galway).


Contact

For help or further information, feel free to reach out to Jakob Grøhn Damgaard at bokajgd@gmail.com or Malte Højmark-Bertelsen at hjb@kmd.dk.
