Perdido Geoparser Python library

Installation

To install the latest stable version, you can use:

pip install --upgrade perdido

Quick start

Geoparsing

Import

from perdido.geoparser import Geoparser

Run geoparser

geoparser = Geoparser(version='Standard')
doc = geoparser('Je visite la ville de Lyon, Annecy et Chamonix.')
  • The version parameter accepts two values: 'Standard' (the default) and 'Encyclopedie'.

Get tokens

  • Access token attributes:
for token in doc:
    print(f'{token.text}\tlemma: {token.lemma}\tpos: {token.pos}')
  • Get the IOB format:
for token in doc:
    print(token.iob_format())
  • Get a TSV-IOB format:
for token in doc:
    print(token.tsv_format())
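
IOB output marks each token as beginning (B), inside (I), or outside (O) an entity. The sketch below shows one way such tags could be grouped back into entity spans. The sample (token, tag) pairs and the B-place / I-place labels are hand-written assumptions, not actual Perdido output, and group_iob is a hypothetical helper; check what token.iob_format() really returns before adapting it.

```python
def group_iob(pairs):
    """Group (token, IOB tag) pairs into (entity text, label) spans."""
    spans = []
    current_tokens, current_label = [], None
    for token, tag in pairs:
        prefix, _, label = tag.partition('-')
        if prefix == 'B' or (prefix == 'I' and label != current_label):
            # A new entity starts; flush any entity in progress.
            if current_tokens:
                spans.append((' '.join(current_tokens), current_label))
            current_tokens, current_label = [token], label
        elif prefix == 'I':
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # 'O' closes any entity in progress.
            if current_tokens:
                spans.append((' '.join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        spans.append((' '.join(current_tokens), current_label))
    return spans


# Hand-written sample, not real Perdido output.
sample = [
    ('Je', 'O'), ('visite', 'O'), ('la', 'O'),
    ('ville', 'B-place'), ('de', 'I-place'), ('Lyon', 'I-place'),
    (',', 'O'), ('Annecy', 'B-place'),
]
entities = group_iob(sample)
print(entities)
```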

Print the XML-TEI output

print(doc.tei)

Print the GeoJSON output

print(doc.geojson)
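
GeoJSON (RFC 7946) stores each position as [longitude, latitude], which is easy to swap by accident. The following sketch extracts point coordinates from a FeatureCollection. The sample data is hand-written with approximate coordinates, the name property key is an assumption, and point_coordinates is a hypothetical helper. Since this page does not say whether doc.geojson is a dict or a string, the helper accepts both.

```python
import json


def point_coordinates(geojson):
    """Return (name, latitude, longitude) for each Point feature."""
    data = json.loads(geojson) if isinstance(geojson, str) else geojson
    rows = []
    for feature in data.get('features', []):
        geometry = feature.get('geometry') or {}
        if geometry.get('type') != 'Point':
            continue
        lng, lat = geometry['coordinates'][:2]  # GeoJSON order: longitude first
        name = (feature.get('properties') or {}).get('name')
        rows.append((name, lat, lng))
    return rows


# Hand-written sample; coordinates are approximate.
sample = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature',
         'geometry': {'type': 'Point', 'coordinates': [4.83, 45.76]},
         'properties': {'name': 'Lyon'}},
    ],
}
print(point_coordinates(sample))
```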

Get the list of named entities

for entity in doc.named_entities:
    print(f'entity: {entity.text}\ttag: {entity.tag}')
    if entity.tag == 'place':
        for t in entity.toponym_candidates:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')

Get the list of nested named entities

for nested_entity in doc.nested_named_entities:
    print(f'entity: {nested_entity.text}\ttag: {nested_entity.tag}')
    if nested_entity.tag == 'place':
        for t in nested_entity.toponym_candidates:
            print(f' latitude: {t.lat}\tlongitude: {t.lng}\tsource {t.source}')

Display named entities and nested named entities using the displacy visualizer from spaCy

from spacy import displacy

displacy.render(doc.to_spacy_doc(), style="ent", jupyter=True)
displacy.render(doc.to_spacy_doc(), style="span", jupyter=True)

Saving results

doc.to_xml('filename.xml')
doc.to_geojson('filename.geojson')
doc.to_iob('filename.tsv')
doc.to_csv('filename.csv')

Geocoding

Import

from perdido.geocoder import Geocoder

Geocode a single place name

geocoder = Geocoder()
doc = geocoder('Lyon')

Geocode a list of place names

geocoder = Geocoder()
doc = geocoder(['Lyon', 'Annecy', 'Chamonix'])

Get the GeoJSON result

print(doc.geojson)

Get the list of toponym candidates

for t in doc.toponyms: 
    print(f'lat: {t.lat}\tlng: {t.lng}\tsource {t.source}\tsourceName {t.source_name}')
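
A single place name can yield several candidates coming from different sources. As a minimal sketch, the candidates could be grouped by source before choosing one. Candidate below is a hypothetical stand-in class that only mirrors the attribute names used above (lat, lng, source); it is not Perdido's own class, group_by_source is a hypothetical helper, and the values are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in exposing the attributes used above."""
    lat: float
    lng: float
    source: str


def group_by_source(candidates):
    """Map each source name to the list of its (lat, lng) pairs."""
    grouped = defaultdict(list)
    for c in candidates:
        grouped[c.source].append((c.lat, c.lng))
    return dict(grouped)


# Placeholder values, not real geocoding results.
candidates = [
    Candidate(45.76, 4.83, 'source_a'),
    Candidate(45.75, 4.85, 'source_b'),
    Candidate(45.90, 6.12, 'source_a'),
]
by_source = group_by_source(candidates)
print(by_source)
```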

Get the toponym candidates as a GeoDataFrame

print(doc.to_geodataframe())

Perdido Geoparser REST APIs

http://choucas.univ-pau.fr/docs#

Example: calling the REST API in Python

import requests

url = 'http://choucas.univ-pau.fr/PERDIDO/api/'
service = 'geoparsing'
data = {'content': 'Je visite la ville de Lyon, Annecy et le Mont-Blanc.'}
parameters = {'api_key': 'demo'}

r = requests.post(url+service, params=parameters, json=data)

print(r.text)
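
The same call can be prepared with the standard library only, which makes it easy to inspect the request before sending it. This sketch reuses the endpoint, the content field, and the api_key parameter from the example above; build_request is a hypothetical helper, and the response format is not described on this page, so the commented-out part simply prints the raw body.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = 'http://choucas.univ-pau.fr/PERDIDO/api/'


def build_request(service, text, api_key='demo'):
    """Build (but do not send) a POST request for a Perdido service."""
    url = f'{BASE_URL}{service}?{urlencode({"api_key": api_key})}'
    body = json.dumps({'content': text}).encode('utf-8')
    return urllib.request.Request(
        url,
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


req = build_request('geoparsing', 'Je visite la ville de Lyon.')
print(req.full_url)

# To actually send it (requires network access):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.read().decode('utf-8'))
```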

Acknowledgements

Perdido is an active project that is still under development.

This work was partially supported by the following projects:
