PERDIDO Geoparser Python library
http://erig.univ-pau.fr/PERDIDO/
Installation
To install the latest stable version, you can use:
pip install --upgrade perdido
Quick start
Geoparsing
Import
from perdido.geoparser import Geoparser
Run geoparser
geoparser = Geoparser(lang='fr')
doc = geoparser('Je visite la ville de Lyon, Annecy et Chamonix.')
Get tokens
for token in doc:
    print(f'{token.text}\tlemma: {token.lemma}\tpos: {token.pos}')
Print the XML-TEI output
print(doc.tei)
Print the GeoJSON output
print(doc.geojson)
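As a minimal sketch, assuming `doc.geojson` is a standard GeoJSON FeatureCollection held as a Python dict (this structure and the `name` property are assumptions, not confirmed by this README), point coordinates could be extracted as shown here. The sample data is hand-written for illustration and is not real Perdido output; `point_features` is a hypothetical helper.

```python
# Hypothetical sample shaped like a GeoJSON FeatureCollection (RFC 7946).
# It is NOT real Perdido output; the 'name' property is an assumption.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.8357, 45.7640]},
            "properties": {"name": "Lyon"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [6.1294, 45.8992]},
            "properties": {"name": "Annecy"},
        },
    ],
}


def point_features(geojson):
    """Return (name, latitude, longitude) for every Point feature."""
    rows = []
    for feature in geojson.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # skip lines, polygons and features without geometry
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is lon, lat
        name = (feature.get("properties") or {}).get("name")
        rows.append((name, lat, lon))
    return rows


for name, lat, lon in point_features(sample):
    print(f"{name}\tlat: {lat}\tlon: {lon}")
```

If the assumption holds, `point_features(doc.geojson)` would work the same way. Note that GeoJSON stores coordinates as longitude first, then latitude.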
Get the list of named entities
for entity in doc.ne:
    print(f'entity: {entity.text}\ttag: {entity.tag}')
    if entity.tag == 'place':
        for t in entity.toponyms:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')
Get the list of nested named entities
for nestedEntity in doc.nne:
    print(f'entity: {nestedEntity.text}\ttag: {nestedEntity.tag}')
    if nestedEntity.tag == 'place':
        for t in nestedEntity.toponyms:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')
Show named entities and nested named entities using the displacy library from spaCy
from spacy import displacy
displacy.render(doc.to_spacy_doc(), style="ent")
displacy.render(doc.to_spacy_doc(), style="span")
Geocoding
Import
from perdido.geocoder import Geocoder
Geocode a single place name
geocoder = Geocoder()
doc = geocoder('Lyon')
Geocode a list of place names
geocoder = Geocoder()
doc = geocoder(['Lyon', 'Annecy', 'Chamonix'])
Get the GeoJSON result
print(doc.geojson)
Get the list of toponym candidates
for t in doc.toponyms:
    print(f'lat: {t.lat}\tlng: {t.lng}\tsource {t.source}\tsourceName {t.source_name}')
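When several sources return candidates for the same place name, it can help to group them by source. The sketch below does this with a hypothetical stand-in class, `Candidate`, that only mirrors the attributes printed above (`lat`, `lng`, `source`, `source_name`); it is not a Perdido class, and the source names and coordinates are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in exposing the attributes used above."""
    lat: float
    lng: float
    source: str
    source_name: str


def group_by_source(toponyms):
    """Map each source to the list of (lat, lng) pairs it returned."""
    grouped = defaultdict(list)
    for t in toponyms:
        grouped[t.source].append((t.lat, t.lng))
    return dict(grouped)


# Made-up candidates; with Perdido you would pass doc.toponyms instead.
candidates = [
    Candidate(45.76, 4.84, "source_a", "Lyon"),
    Candidate(45.75, 4.85, "source_b", "Lyon"),
    Candidate(45.90, 6.13, "source_a", "Annecy"),
]
print(group_by_source(candidates))
```

Since `group_by_source` only reads attributes, it should accept any iterable of objects that expose `lat`, `lng` and `source`.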
Perdido Geoparser REST APIs
http://choucas.univ-pau.fr/docs#
Example: call REST API in Python
import requests
url = 'http://choucas.univ-pau.fr/PERDIDO/api/'
service = 'geoparsing'
content = 'Je visite la ville de Lyon, Annecy et le Mont-Blanc.'
parameters = {'api_key': 'demo', 'content': content}
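# Send the parameters in the query string; fail fast on network or HTTP errors
r = requests.post(url + service, params=parameters, timeout=30)
r.raise_for_status()
print(r.text)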
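If the requests package is not available, a similar call can be sketched with the standard library alone. The helper names `build_request` and `call_service` are hypothetical, and the 30-second timeout is an arbitrary choice. The request is built separately from being sent so that the URL can be inspected without touching the network.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = 'http://choucas.univ-pau.fr/PERDIDO/api/'


def build_request(service, content, api_key='demo'):
    """Build (but do not send) a POST request with query-string parameters."""
    query = urlencode({'api_key': api_key, 'content': content})
    return Request(f'{BASE_URL}{service}?{query}', method='POST')


def call_service(service, content, api_key='demo', timeout=30):
    """Send the request and return the response body as text."""
    request = build_request(service, content, api_key)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode('utf-8')


# Inspect the request without contacting the server
req = build_request('geoparsing', 'Je visite Lyon.')
print(req.get_method(), req.full_url)
```

Calling `call_service('geoparsing', content)` would then perform the actual request. Whether the server treats it exactly like the requests-based call above has not been verified here.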
Acknowledgements
Perdido is an active project still under development.
This work was partially supported by the following projects:
- GEODE (2020-2024): LabEx ASLAN (ANR-10-LABX-0081)
- GeoDISCO (2019-2020): MSH Lyon St-Etienne (ANR-16-IDEX-0005)
- CHOUCAS (2017-2022): ANR (ANR-16-CE23-0018)
- PERDIDO (2012-2015): CDAPP and IGN