stedsans is a package capable of doing geospatial analyses from text.
stedsans
stedsans is a Danish and English geoparsing toolkit that uses Transformer-based models (Vaswani et al., 2017), including the Danish Ælæctra and the English BERT fine-tuned for Named Entity Recognition, to allow for efficient and intuitive geospatial analysis of text.
Demonstration of stedsans
For a demonstration of the current tools, we strongly recommend using the Google Colab notebook.
Installation
stedsans is developed and tested on Python 3.7+.
Windows
stedsans requires the package geopandas, whose dependencies can cause several issues on Windows. It is therefore recommended to first install Anaconda, which provides pre-built binaries for geopandas.
After installing Anaconda, create a conda environment and then install stedsans using pip and geopandas using conda.
pip install stedsans
conda install geopandas==0.9.0
MacOS and Linux
On MacOS and Linux, stedsans and geopandas can be installed directly using pip.
pip install stedsans
pip install geopandas==0.9.0
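After installing on either platform, it can be useful to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of stedsans itself: the helper name installed_version is hypothetical, and it relies on importlib.metadata, which is in the standard library from Python 3.8 onwards (on Python 3.7 the importlib-metadata backport would be needed instead).

```python
from importlib import metadata  # standard library from Python 3.8 onwards


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing.

    Hypothetical helper for illustration; it is not provided by stedsans.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages named in the installation steps above.
for name in ("stedsans", "geopandas"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

If geopandas is reported at a version other than 0.9.0, the pinned install command above was probably run in a different environment.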
Usage
To use stedsans, start by importing it.
from stedsans import stedsans
We can then initialize a stedsans instance on a pre-defined example text, and print the extracted entities.
>>> example_text = "Hello my name is Malte and i live in Aarhus C. I love watching Randers FC, a football team from Randers, beat Brøndby IF, a football team from the devils island of Sjælland."
>>> my_stedsans = stedsans(example_text, language="english")
>>> my_stedsans.print_entities()
[ ('Aarhus C', 'LOC'),
('Randers FC', 'ORG'),
('Randers', 'LOC'),
('Brøndby IF', 'ORG'),
('Sjælland', 'LOC')]
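The printed output is a list of (text, label) pairs. As a rough sketch of how such pairs could be processed further, the snippet below groups them by label using plain Python. The pairs are copied by hand from the output above, because this README does not say whether print_entities() also returns the list; group_by_label is a hypothetical helper, not a stedsans function.

```python
from collections import defaultdict

# Entity pairs copied by hand from the example output above (an assumption:
# print_entities() is only known to print them, not to return them).
entities = [
    ("Aarhus C", "LOC"),
    ("Randers FC", "ORG"),
    ("Randers", "LOC"),
    ("Brøndby IF", "ORG"),
    ("Sjælland", "LOC"),
]


def group_by_label(pairs):
    """Group entity strings by their NER label, keeping first-seen order."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


grouped = group_by_label(entities)
print(grouped.get("LOC", []))  # prints ['Aarhus C', 'Randers', 'Sjælland']
```

Only the entries labelled LOC are place names; Randers FC and Brøndby IF are tagged ORG even though they contain a town name.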
These locations can then be visualised using the plot_locations() function, which can be told to keep only locations within Denmark. This results in an interactive map.
>>> my_stedsans.plot_locations(limit="country", limit_area="Denmark")
You can also plot an interactive heatmap with plot_heatmap().
>>> my_stedsans.plot_heatmap(limit="country", limit_area="Denmark")
stedsans provides many other features and tools, all of which are demonstrated thoroughly in the Google Colab notebook (if it wasn't obvious already, we really want you to use the Colab).
References
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 30 (pp. 5998–6008). Curran Associates, Inc. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
Contact
For help or further information feel free to connect with either of the main developers:
Malte Højmark-Bertelsen
hjb@kmd.dk
Jakob Grøhn Damgaard
bokajgd@gmail.com