Skip to main content

PERDIDO Geoparser python library

Project description

Perdido Geoparser Python library

PyPI PyPI - License PyPI - Python Version

http://erig.univ-pau.fr/PERDIDO/

Installation

To install the latest stable version, you can use:

pip install --upgrade perdido

Quick start

Geoparsing

Binder Open In Colab

Import

from perdido.geoparser import Geoparser

Run geoparser

geoparser = Geoparser(lang='fr')
doc = geoparser('Je visite la ville de Lyon, Annecy et Chamonix.')

Get tokens

for token in doc:
    print(f'{token.text}\tlemma: {token.lemma}\tpos: {token.pos}')

Print the XML-TEI output

print(doc.tei)

Print the GeoJSON output

print(doc.geojson)

Get the list of named entities

for entity in doc.ne:
    print(f'entity: {entity.text}\ttag: {entity.tag}')
    if entity.tag == 'place':
        for t in entity.toponyms:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')

Get the list of nested named entities

for nestedEntity in doc.nne:
    print(f'entity: {nestedEntity.text}\ttag: {nestedEntity.tag}')
    if nestedEntity.tag == 'place':
        for t in nestedEntity.toponyms:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')

Shows named entities and nested named entities using the displacy library from spaCy

displacy.render(doc.to_spacy_doc(), style="ent")
displacy.render(doc.to_spacy_doc(), style="span")

Geocoding

Binder Open In Colab

Import

from perdido.geocoder import Geocoder

Geocode a single place name

geocoder = Geocoder()
doc = geocoder('Lyon')

Geocode a list of place names

geocoder = Geocoder()
doc = geocoder(['Lyon', 'Annecy', 'Chamonix'])

Get the geojson result

print(doc.geojson)

Get the list of toponym candidates

for t in doc.toponyms: 
    print(f'lat: {t.lat}\tlng: {t.lng}\tsource {t.source}\tsourceName {t.source_name}')

Perdido Geoparser REST APIs

http://choucas.univ-pau.fr/docs#

Example: call REST API in Python

import requests

url = 'http://choucas.univ-pau.fr/PERDIDO/api/'
service = 'geoparsing'
content = 'Je visite la ville de Lyon, Annecy et le Mont-Blanc.'
parameters = {'api_key': 'demo', 'content': content}

r = requests.post(url+service, params=parameters)

print(r.text)

Acknowledgements

Perdido is an active project still under developpement.

This work was partially supported by the following projects:

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

perdido-0.1.11.tar.gz (89.3 kB view details)

Uploaded Source

Built Distribution

perdido-0.1.11-py3-none-any.whl (89.3 kB view details)

Uploaded Python 3

File details

Details for the file perdido-0.1.11.tar.gz.

File metadata

  • Download URL: perdido-0.1.11.tar.gz
  • Upload date:
  • Size: 89.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.4

File hashes

Hashes for perdido-0.1.11.tar.gz
Algorithm Hash digest
SHA256 d67d817a9fb58856772a4233dd1d1622f3f3aa726b0b57c3e053852019954d13
MD5 262930598a7c2c36bdfae713a14192c7
BLAKE2b-256 98c455060d5f26bd4daf642f4a18f0bd28b0be8551f2f95cd9a87226382901bc

See more details on using hashes here.

File details

Details for the file perdido-0.1.11-py3-none-any.whl.

File metadata

  • Download URL: perdido-0.1.11-py3-none-any.whl
  • Upload date:
  • Size: 89.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.4

File hashes

Hashes for perdido-0.1.11-py3-none-any.whl
Algorithm Hash digest
SHA256 77a5e7a9b5476c0a26ee6adb0d19101c62fe3389974a894c665e93d8accae8a9
MD5 7246f530db761799db3d61723608b5ed
BLAKE2b-256 cc0fae130b099d380b8be7c14dc9477ddeb285a0285a8d95980c3756f09ea54f

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page