PERDIDO Geoparser Python library
Installation
To install the latest stable version, you can use:
pip install --upgrade perdido
Quick start
Geoparsing
Import
from perdido.geoparser import Geoparser
Run geoparser
text = "J'ai rendez-vous proche de la place Bellecour, de la place des Célestins, au sud de la fontaine des Jacobins et près du pont Bonaparte."
geoparser = Geoparser(version='Standard')
doc = geoparser(text)
- The version parameter can take two values: Standard (default) and Encyclopedie.
Get tokens
- Access token attributes:
for token in doc:
    print(f'{token.text}\tlemma: {token.lemma}\tpos: {token.pos}')
- Get the IOB format:
for token in doc:
    print(token.iob_format())
- Get a TSV-IOB format:
for token in doc:
    print(token.tsv_format())
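IOB-tagged tokens are usually regrouped into entity spans before further processing. The sketch below shows one way to do that on hand-written (token, tag) pairs. The exact strings returned by token.iob_format() are not shown here, so the tag spelling ("B-place", "I-place", "O") and the helper name group_iob are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def group_iob(tagged: List[Tuple[str, str]]) -> List[Tuple[str, Optional[str]]]:
    """Merge (token, IOB tag) pairs into (entity text, label) spans."""
    spans: List[Tuple[str, Optional[str]]] = []
    tokens: List[str] = []
    label: Optional[str] = None
    for token, tag in tagged:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_new:
            # Close the span in progress, then open a new one.
            # An I- tag with no matching B- is tolerated and opens a span too.
            if tokens:
                spans.append((" ".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-"):
            # Continuation of the current span.
            tokens.append(token)
        else:
            # "O" (outside): close any span in progress.
            if tokens:
                spans.append((" ".join(tokens), label))
            tokens, label = [], None
    if tokens:
        spans.append((" ".join(tokens), label))
    return spans


# Hand-written sample, not actual Perdido output.
sample = [
    ("près", "O"),
    ("du", "O"),
    ("pont", "B-place"),
    ("Bonaparte", "I-place"),
    (".", "O"),
]
print(group_iob(sample))
```

Tokens are joined with a single space, which is a simplification: the original spacing of the text is not preserved.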
Print the XML-TEI output
print(doc.tei)
Print the XML-TEI output with XML syntax highlighting
from display_xml import XML
XML(doc.tei, style='lovelace')
Print the GeoJSON output
print(doc.geojson)
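A GeoJSON result can be post-processed with the standard library alone. The sketch below assumes the output is a standard FeatureCollection with Point geometries; the sample data, the "name" property and the helper point_features are hypothetical, and whether doc.geojson is a dict or a JSON string is not stated here, so the example starts from an already parsed dict.

```python
import json

# Hand-written sample in standard GeoJSON form, not actual Perdido output.
# The coordinates and the "name" property are illustrative.
sample_geojson = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [4.8320, 45.7578]},
      "properties": {"name": "place Bellecour"}
    }
  ]
}
""")


def point_features(collection):
    """Return (name, lat, lng) for every Point feature in a FeatureCollection."""
    rows = []
    for feature in collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue
        # GeoJSON stores positions as [longitude, latitude].
        lng, lat = geometry["coordinates"][:2]
        name = (feature.get("properties") or {}).get("name")
        rows.append((name, lat, lng))
    return rows


print(point_features(sample_geojson))
```

Note the coordinate order: GeoJSON puts longitude first, whereas the toponym attributes used elsewhere in this page are named lat and lng.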
Get the list of named entities
for entity in doc.named_entities:
    print(f'entity: {entity.text}\ttag: {entity.tag}')
    if entity.tag == 'place':
        for t in entity.toponym_candidates:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')
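A place entity may come with several toponym candidates, so calling code often has to keep just one. The sketch below is one possible selection rule (prefer candidates from a given list of sources, otherwise fall back to the first one). It is not part of Perdido: the Candidate class is a stand-in that only mirrors the lat, lng and source attributes used above, and the source names are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass
class Candidate:
    """Stand-in object exposing lat, lng and source (hypothetical class)."""
    lat: float
    lng: float
    source: str


def pick_candidate(
    candidates: Iterable[Candidate],
    preferred_sources: Sequence[str],
) -> Optional[Candidate]:
    """Return the first candidate from the most preferred source.

    Falls back to the first candidate when no preferred source matches,
    and to None when there are no candidates at all.
    """
    pool = list(candidates)
    for source in preferred_sources:
        for candidate in pool:
            if candidate.source == source:
                return candidate
    return pool[0] if pool else None


# Placeholder data: the source names are not real Perdido values.
candidates = [
    Candidate(lat=45.76, lng=4.83, source="source_a"),
    Candidate(lat=45.75, lng=4.84, source="source_b"),
]
best = pick_candidate(candidates, preferred_sources=["source_b", "source_a"])
print(best)
```

Because the function only reads the source attribute, it should also accept the objects yielded by entity.toponym_candidates, assuming they expose that attribute as in the loop above.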
Get the list of nested named entities
for nested_entity in doc.nested_named_entities:
    print(f'entity: {nested_entity.text}\ttag: {nested_entity.tag}')
    if nested_entity.tag == 'place':
        for t in nested_entity.toponym_candidates:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')
Show named entities and nested named entities using the displaCy library from spaCy
from spacy import displacy
displacy.render(doc.to_spacy_doc(), style="ent", jupyter=True)
displacy.render(doc.to_spacy_doc(), style="span", jupyter=True)
Display the map (using folium library)
doc.get_folium_map()
Saving results
doc.to_xml('filename.xml')
doc.to_geojson('filename.geojson')
doc.to_iob('filename.tsv')
doc.to_csv('filename.csv')
Geocoding
Import
from perdido.geocoder import Geocoder
Geocode a single place name
geocoder = Geocoder()
doc = geocoder('Lyon')
Geocode a list of place names
geocoder = Geocoder()
doc = geocoder(['Lyon', 'la place des Célestins', 'la fontaine des Jacobins'])
Get the GeoJSON result
print(doc.geojson)
Get the list of toponym candidates
for t in doc.toponyms:
    print(f'lat: {t.lat}\tlng: {t.lng}\tsource {t.source}\tsourceName {t.source_name}')
Get the toponym candidates as a GeoDataFrame
print(doc.to_geodataframe())
Perdido Geoparser REST APIs
http://choucas.univ-pau.fr/docs#
Example: call REST API in Python
import requests
url = 'http://choucas.univ-pau.fr/PERDIDO/api/'
service = 'geoparsing'
data = {'content': 'Je visite la ville de Lyon, Annecy et le Mont-Blanc.'}
parameters = {'api_key': 'demo'}
r = requests.post(url+service, params=parameters, json=data)
print(r.text)
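The same call can be prepared with the standard library only, which is convenient when requests is not installed. This is a sketch assuming the endpoint accepts the same JSON body and api_key query parameter as in the example above; the helper names build_request and send are hypothetical, and the response is assumed to be UTF-8 text.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://choucas.univ-pau.fr/PERDIDO/api/"


def build_request(service, content, api_key="demo"):
    """Prepare a POST request for a service without sending it."""
    query = urllib.parse.urlencode({"api_key": api_key})
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}{service}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and return the response body as text.

    Raises urllib.error.HTTPError or urllib.error.URLError on failure.
    Decoding as UTF-8 is an assumption about the service.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


req = build_request(
    "geoparsing",
    "Je visite la ville de Lyon, Annecy et le Mont-Blanc.",
)
print(req.full_url)
# To actually call the service: print(send(req))
```

Building the request separately from sending it makes the URL and body easy to inspect, and the explicit timeout avoids hanging if the service is unreachable.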
Acknowledgements
Perdido is an active project still under development.
This work was partially supported by the following projects:
- GEODE (2020-2024): LabEx ASLAN (ANR-10-LABX-0081)
- GeoDISCO (2019-2020): MSH Lyon St-Etienne (ANR-16-IDEX-0005)
- CHOUCAS (2017-2022): ANR (ANR-16-CE23-0018)
- PERDIDO (2012-2015): CDAPP and IGN