
PERDIDO Geoparser Python library


http://erig.univ-pau.fr/PERDIDO/

Installation

To install the latest stable version, you can use:

pip install --upgrade perdido
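To confirm that the installation succeeded, one option is to query the installed package metadata. The sketch below uses only the Python standard library; the helper name `installed_version` is illustrative and is not part of Perdido.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if perdido is installed, otherwise None.
print(installed_version("perdido"))
```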

Quick start

Geoparsing


Import

from perdido.geoparser import Geoparser

Run geoparser

geoparser = Geoparser(lang='fr')
doc = geoparser('Je visite la ville de Lyon, Annecy et Chamonix.')

Get tokens

for token in doc:
    print(f'{token.text}\tlemma: {token.lemma}\tpos: {token.pos}')

Print the XML-TEI output

print(doc.tei)

Print the GeoJSON output

print(doc.geojson)
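Once the GeoJSON output is available, point coordinates can be pulled out with the standard `json` module. This is a minimal sketch that assumes the output is a standard GeoJSON FeatureCollection; whether `doc.geojson` is a dict or a JSON string, and whether features carry a `name` property, are assumptions, so the hypothetical helper `point_coordinates` accepts both forms and tolerates a missing name.

```python
import json


def point_coordinates(geojson):
    """Return (name, longitude, latitude) for each Point feature.

    Accepts a dict or a JSON string, because the exact type returned by
    doc.geojson is an assumption here.
    """
    data = json.loads(geojson) if isinstance(geojson, str) else geojson
    points = []
    for feature in data.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # skip lines, polygons and features without geometry
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is lon, lat
        name = (feature.get("properties") or {}).get("name")  # assumed key
        points.append((name, lon, lat))
    return points


# Hand-written sample for illustration, not real Perdido output.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.83, 45.76]},
            "properties": {"name": "Lyon"},
        },
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [[4.83, 45.76], [6.13, 45.9]],
            },
            "properties": {"name": "route"},
        },
    ],
}
print(point_coordinates(sample))
```

With a real document, the call would be `point_coordinates(doc.geojson)`.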

Get the list of named entities

for entity in doc.named_entities:
    print(f'entity: {entity.text}\ttag: {entity.tag}')
    if entity.tag == 'place':
        for t in entity.toponyms:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')
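The loop above can also be turned into a list of plain records, which is convenient for building a table or a data frame. The following is a sketch only: `place_rows` is a hypothetical helper, and the stand-in objects (including the `other` tag and the `example` source) are invented so the snippet runs without Perdido. It relies only on the attributes shown above (`text`, `tag`, `toponyms`, `lat`, `lng`, `source`).

```python
from types import SimpleNamespace


def place_rows(entities):
    """Flatten place entities into one dict per toponym candidate."""
    rows = []
    for entity in entities:
        if entity.tag != "place":
            continue
        for t in entity.toponyms:
            rows.append(
                {"text": entity.text, "lat": t.lat, "lng": t.lng, "source": t.source}
            )
    return rows


# Stand-in objects for illustration; with Perdido, pass doc.named_entities.
fake_entities = [
    SimpleNamespace(
        text="Lyon",
        tag="place",
        toponyms=[SimpleNamespace(lat=45.76, lng=4.83, source="example")],
    ),
    SimpleNamespace(text="Marie", tag="other", toponyms=[]),
]
print(place_rows(fake_entities))
```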

Get the list of nested named entities

for nestedEntity in doc.nested_named_entities:
    print(f'entity: {nestedEntity.text}\ttag: {nestedEntity.tag}')
    if nestedEntity.tag == 'place':
        for t in nestedEntity.toponyms:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')

Show named entities and nested named entities using the displacy library from spaCy

from spacy import displacy

displacy.render(doc.to_spacy_doc(), style="ent", jupyter=True)
displacy.render(doc.to_spacy_doc(), style="span", jupyter=True)

Saving results

doc.to_xml('filename.xml')
doc.to_geojson('filename.geojson')
doc.to_csv('filename.csv')

Geocoding


Import

from perdido.geocoder import Geocoder

Geocode a single place name

geocoder = Geocoder()
doc = geocoder('Lyon')

Geocode a list of place names

geocoder = Geocoder()
doc = geocoder(['Lyon', 'Annecy', 'Chamonix'])

Get the GeoJSON result

print(doc.geojson)

Get the list of toponym candidates

for t in doc.toponyms:
    print(f'lat: {t.lat}\tlng: {t.lng}\tsource {t.source}\tsourceName {t.source_name}')
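When a place name yields many candidates, a quick summary of where they come from can help decide which source to trust. The sketch below counts candidates per `source_name` with `collections.Counter`. The helper `candidates_per_source` and the source names in the stand-in data are hypothetical; only the `source_name` attribute is taken from the example above.

```python
from collections import Counter
from types import SimpleNamespace


def candidates_per_source(toponyms):
    """Count toponym candidates grouped by their source_name attribute."""
    return Counter(t.source_name for t in toponyms)


# Stand-in candidates for illustration; with Perdido, pass doc.toponyms.
fake_toponyms = [
    SimpleNamespace(lat=45.76, lng=4.83, source_name="gazetteer_a"),
    SimpleNamespace(lat=45.75, lng=4.85, source_name="gazetteer_a"),
    SimpleNamespace(lat=45.77, lng=4.84, source_name="gazetteer_b"),
]
for name, count in candidates_per_source(fake_toponyms).most_common():
    print(f"{name}\t{count}")
```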

Perdido Geoparser REST APIs

http://choucas.univ-pau.fr/docs#

Example: call REST API in Python

import requests

url = 'http://choucas.univ-pau.fr/PERDIDO/api/'
service = 'geoparsing'
data = {'content': 'Je visite la ville de Lyon, Annecy et le Mont-Blanc.'}
parameters = {'api_key': 'demo'}

r = requests.post(url+service, params=parameters, json=data)

print(r.text)
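For environments where `requests` is not installed, the same call can be sketched with the standard library alone. This variant swaps `requests` for `urllib`, adds an explicit timeout, and separates building the request from sending it so the URL and body can be inspected first. The function names and the 30-second timeout are illustrative choices, and the response is returned as raw text because its format is not described here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "http://choucas.univ-pau.fr/PERDIDO/api/"


def build_geoparsing_request(text, api_key="demo", service="geoparsing"):
    """Build (but do not send) the POST request from the example above."""
    query = urlencode({"api_key": api_key})
    url = f"{API_URL}{service}?{query}"
    body = json.dumps({"content": text}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def call_geoparsing(text, timeout=30):
    """Send the request; urlopen raises HTTPError on 4xx/5xx responses."""
    with urlopen(build_geoparsing_request(text), timeout=timeout) as response:
        # UTF-8 decoding is an assumption about the service's encoding.
        return response.read().decode("utf-8")


# Inspect the request without touching the network.
request = build_geoparsing_request("Je visite la ville de Lyon.")
print(request.get_method(), request.full_url)
```

Calling `call_geoparsing('Je visite la ville de Lyon.')` would then perform the actual HTTP request.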

Acknowledgements

Perdido is an active project that is still under development.

This work was partially supported by the following projects:

