stedsans is a package for performing geospatial analyses on text.

Project description

stedsans

stedsans is a Danish and English geoparsing toolkit that uses Transformer-based models (Vaswani et al., 2017), including the Danish Ælæctra and the English BERT fine-tuned for Named Entity Recognition, to allow for efficient and intuitive geospatial analysis of text.

Demonstration of stedsans

For a demonstration of the current tools, we strongly recommend using the Google Colab notebook:

Open In Colab

Installation

stedsans is developed and tested on Python 3.7+.

Windows

Because stedsans requires geopandas, whose dependencies can be difficult to install on Windows, it is recommended to first install Anaconda, which provides pre-built binaries for geopandas.

After installing Anaconda, create a conda environment, then install stedsans with pip and geopandas with conda.

pip install stedsans
conda install geopandas==0.9.0

MacOS and Linux

On MacOS and Linux, stedsans and geopandas can be installed directly with pip.

pip install stedsans
pip install geopandas==0.9.0
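
After installing, it can help to confirm that the interpreter and the two packages match what the steps above expect. The following is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of stedsans, and it only checks that each package can be located, not that it works.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 7), packages=("stedsans", "geopandas")):
    """Report whether the Python version and required packages look usable.

    Hypothetical helper, not part of stedsans: it only checks that each
    package can be found, without importing it.
    """
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'not ok'}")
```

If geopandas is reported as not found on Windows, the conda-based installation described above is the route the authors recommend.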

Usage

To use stedsans, start by importing it.

from stedsans import stedsans

We can then initialize a stedsans instance on a pre-defined example text, and print the extracted entities.

>>> example_text = "Hello my name is Malte and i live in Aarhus C. I love watching Randers FC, a football team from Randers, beat Brøndby IF, a football team from the devils island of Sjælland."

>>> my_stedsans = stedsans(example_text, language="english")

>>> my_stedsans.print_entities()

[   ('Aarhus C', 'LOC'),
    ('Randers FC', 'ORG'),
    ('Randers', 'LOC'),
    ('Brøndby IF', 'ORG'),
    ('Sjælland', 'LOC')]
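
The printed entities are (text, label) pairs, so they are easy to split by label. The sketch below copies the output above into a literal list, since this description does not say whether the same list is available as a return value; the helper `entities_with_label` is hypothetical and not part of stedsans.

```python
# Entities copied from the printed output above: (text, label) tuples.
entities = [
    ("Aarhus C", "LOC"),
    ("Randers FC", "ORG"),
    ("Randers", "LOC"),
    ("Brøndby IF", "ORG"),
    ("Sjælland", "LOC"),
]


def entities_with_label(entities, label):
    """Return the entity texts whose label equals `label`, in original order."""
    return [text for text, tag in entities if tag == label]


locations = entities_with_label(entities, "LOC")
organisations = entities_with_label(entities, "ORG")

print(locations)      # ['Aarhus C', 'Randers', 'Sjælland']
print(organisations)  # ['Randers FC', 'Brøndby IF']
```

Only the LOC entries are place names; the football clubs are tagged ORG even though their names contain place names.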

These locations can then be visualised using the plot_locations() function, and we can restrict the plotted locations to those in Denmark. This results in an interactive map.

>>> my_stedsans.plot_locations(limit="country", limit_area="Denmark")

You can also plot an interactive heatmap with plot_heatmap().

>>> my_stedsans.plot_heatmap(limit="country", limit_area="Denmark")

stedsans provides many other features and tools, which are demonstrated thoroughly in the Google Colab notebook (if it wasn't obvious already, we really want you to use the Colab):

Open In Colab

References

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 30 (pp. 5998–6008). Curran Associates, Inc. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf


Contact

For help or further information, feel free to contact either of the main developers:

Malte Højmark-Bertelsen
hjb@kmd.dk

Twitter: MalteHB | LinkedIn: MalteHB



Jakob Grøhn Damgaard
bokajgd@gmail.com

Twitter: Jakob Grøhn Damgaard | LinkedIn: Jakob Grøhn Damgaard

