Easy-to-use module for streamlined parsing of countries from locations

easy-geoparsing

Easy-to-use module for streamlined parsing of countries from plaintext locations and top-level domains, plus manipulation of country names and ISO 2 & 3 character codes.

Implementation relies on the RESTcountries API for country data and the GeoText library for natural-language parsing.

installation

To install from the command line via pip, do:

pip install easy-geoparsing

To upgrade to the latest version via pip, do:

pip install easy-geoparsing --upgrade

To use via pipenv, put the following in your Pipfile:

[packages]
easy-geoparsing = ">=1.0.0"

development

If you've cloned the repository, the recommended way to set it up is with pipenv.

If you don't yet have pipenv, you can use pip to install it from the command line:

pip install pipenv --upgrade

Then, in the top-level directory of this repository (easy-geoparsing), do:

pipenv install --dev

This will create the virtual environment and install the requirements (viewable in the Pipfile). The --dev flag also installs the packages needed for development, such as testing tools.

usage

GETTING STARTED

To get the parser utilities, import and instantiate EasyCountryParser. Creating an instance automatically downloads the country data payload from RESTcountries and sets up all the resources, so start-up time depends on your internet connection, though the payload is not large.

from easy_geoparsing import EasyCountryParser

ez_parser = EasyCountryParser()

Or, if you don't want to use our alternative names for some of the countries (i.e. you want to follow the RESTcountries standard exactly):

ez_parser = EasyCountryParser(altnames=False)

The EasyCountryParser class provides utilities, based on the data from the RESTcountries API and the GeoText natural-language parser library, for easily extracting and handling country names and codes.

PARSER RESOURCES

The parser is initialised with the following resources:

  • .data - pandas DataFrame containing RESTcountries data
  • .tld_to_a2c - python dict, maps TLDs to 2-character ISO codes
  • .tld_to_a3c - python dict, maps TLDs to 3-character ISO codes
  • .iso2to3 - python dict, maps 2-character ISO codes to 3
  • .iso3to2 - python dict, maps 3-character ISO codes to 2
  • .a2c_map - python dict, maps 2-char ISO codes to full names
  • .a3c_map - python dict, maps 3-char ISO codes to full names
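As a rough illustration of how the mapping resources above fit together, the sketch below uses small hand-written dicts in place of the parser's own. The sample entries, the key formats (upper-case codes, TLDs with a leading dot) and the helper name describe_tld are assumptions made for the example; they are not taken from the library.

```python
# Hand-written stand-ins that mirror the documented shape of the parser's
# mapping resources. The real dicts are built from RESTcountries data, and
# their exact key formats may differ from what is assumed here.
iso2to3 = {"FR": "FRA", "DE": "DEU", "JP": "JPN"}
iso3to2 = {a3c: a2c for a2c, a3c in iso2to3.items()}
a2c_map = {"FR": "France", "DE": "Germany", "JP": "Japan"}
tld_to_a2c = {".fr": "FR", ".de": "DE", ".jp": "JP"}


def describe_tld(tld):
    """Resolve a TLD to (2-char code, 3-char code, name), or None if unknown.

    Hypothetical helper for illustration only; not part of easy-geoparsing.
    """
    a2c = tld_to_a2c.get(tld.lower())
    if a2c is None:
        return None
    return a2c, iso2to3[a2c], a2c_map[a2c]
```

With a real parser instance, the same lookups would presumably go through ez_parser.tld_to_a2c, ez_parser.iso2to3 and ez_parser.a2c_map instead of the stand-in dicts.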

PARSER METHODS

The parser has the following methods for handling locations data:

  • .retrieve_country - parses plaintext for extractable 2-character ISO codes for countries (which can then be manipulated using the mappers above)
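The chaining described above (parse a location, then pass the result through one of the mappers) might look like the following sketch. It assumes that retrieve_country takes a single plaintext string and returns one 2-character code, or a falsy value when nothing can be extracted; the exact signature and return type are not documented here, so check them against the installed version. The helper name location_to_alpha3 is hypothetical.

```python
def location_to_alpha3(parser, location):
    """Parse a location string and convert the result to a 3-character code.

    Assumption: parser.retrieve_country(location) returns a 2-character ISO
    code, or a falsy value when no country could be extracted.
    Returns None when parsing fails or the code has no 3-character mapping.
    """
    a2c = parser.retrieve_country(location)
    if not a2c:
        return None
    # .iso2to3 is the documented dict mapping 2-character codes to 3
    return parser.iso2to3.get(a2c)
```

Usage would then be along the lines of location_to_alpha3(ez_parser, "Lyon, France"), with ez_parser created as in the getting-started example.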
